


By Marco Bellinaso

Here is how we can define the textbox and the validators for asking the user to input an e-mail address, from an ASP.NET page:

<asp:TextBox runat="server" CssClass="TextBox" ID="NewEmail" Width="100%" />
<asp:RequiredFieldValidator runat="server" ControlToValidate="NewEmail"
   Display="dynamic"><br>* Email is required
</asp:RequiredFieldValidator>
<asp:RegularExpressionValidator runat="server"
   ValidationExpression=".*@.*\..*" ControlToValidate="NewEmail"
   Display="dynamic"><br>* This Email address is not valid
</asp:RegularExpressionValidator>

The expression .*@.*\..* means that the string must begin with any number of characters (.*, which also allows none at all), then contain a '@' character, some more characters, a period (escaped as \.), and finally more characters. For example, marco@thephile is an invalid address, because no period follows the '@'.
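To see this behaviour in practice, here is a minimal sketch in Python rather than in the page's own code; Python's re module reads this particular pattern the same way. It assumes that, as with the validator, the whole input has to match, so it uses re.fullmatch. The helper name looks_like_email and the address user@example.com are made up for illustration.

```python
import re

# Same pattern as the ValidationExpression in the markup above.
EMAIL_PATTERN = re.compile(r".*@.*\..*")

def looks_like_email(text: str) -> bool:
    """Return True if the whole string matches the loose e-mail pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None

print(looks_like_email("user@example.com"))  # True  (hypothetical address)
print(looks_like_email("marco@thephile"))    # False (no period after the '@')
print(looks_like_email("@."))                # True  (.* may match nothing)
```

The last line shows how permissive the pattern is: every .* can match an empty string, so a bare "@." passes.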

The following tables summarize the most commonly used syntax constructs of regular expressions. First of all, let's see how to express the characters that we want to match:

ordinary characters   characters other than .$^{[(|)*+?\ match themselves
\b                    matches a backspace when used inside a [] character class; elsewhere it marks a word boundary
\t                    matches a tab
\r                    matches a carriage return
\v                    matches a vertical tab
\f                    matches a form feed
\n                    matches a newline
\                     if followed by a non-ordinary character (one of those listed in the first row), matches that character. For example, \+ matches a + character

In addition to single characters, we can specify a class or a range of characters that can be matched in the expression. For instance, we might want to allow any digit or any vowel in a given position, and exclude all other characters. The following character classes allow you to do this:

.             matches any character except \n
[aeiou]       matches any single character specified in the set
[^aeiou]      matches any single character not specified in the set
[3-7a-dA-D]   matches any single character in the specified ranges (in the example the ranges are 3-7, a-d, A-D)
\w            matches any word character, that is, any alphanumeric character or the underscore (_)
\W            matches any non-word character
\s            matches any whitespace character (space, tab, form feed, newline, carriage return or vertical tab)
\S            matches any non-whitespace character
\d            matches any decimal digit
\D            matches any non-digit character
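The classes above can be tried out with a short sketch, again in Python for illustration only. The helper matches_whole is a hypothetical name; it requires the entire string to match the pattern.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches_whole(r"[aeiou]", "e"))       # True:  'e' is in the set
print(matches_whole(r"[^aeiou]", "e"))      # False: 'e' is excluded by ^
print(matches_whole(r"[3-7a-dA-D]", "C"))   # True:  'C' falls in A-D
print(matches_whole(r"[3-7a-dA-D]", "8"))   # False: '8' is outside 3-7
print(matches_whole(r"\w\s\d", "_ 7"))      # True:  word char, space, digit
print(matches_whole(r"\W\S\D", "-x!"))      # True:  non-word, non-space, non-digit
print(matches_whole(r".", "\n"))            # False: . does not match a newline
```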

We can also specify that a certain character or class of characters must be present at least once, or between 2 and 6 times, and so on. Quantifiers are placed just after a character or a class of characters, and specify how many times the preceding character or class must be matched:

Quantifier   Description
*            zero or more matches
+            one or more matches
?            zero or one match
{N}          exactly N matches
{N,}         N or more matches
{N,M}        between N and M matches (inclusive)
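Each quantifier in the table can be checked with a small sketch like the following, written in Python as an illustration. The helper quantifier_matches and the sample strings are made up for the purpose.

```python
import re

def quantifier_matches(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(quantifier_matches(r"a*", ""))           # True:  zero occurrences allowed
print(quantifier_matches(r"a+", ""))           # False: at least one 'a' required
print(quantifier_matches(r"colou?r", "color")) # True:  the 'u' is optional
print(quantifier_matches(r"\d{3}", "12"))      # False: exactly three digits required
print(quantifier_matches(r"\d{2,}", "12345"))  # True:  two or more digits
print(quantifier_matches(r"\d{2,4}", "12345")) # False: five digits exceed the maximum of four
```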

To recap everything with another easy example, take the expression [aeiou]{2,4}\+[1-5]*. A string matches it if it starts with two to four vowels, continues with a + sign (escaped as \+), and ends with zero or more digits between 1 and 5.
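The recap expression can be verified against a few candidate strings. This is a sketch in Python, assuming whole-string matching; the candidate strings are invented for illustration.

```python
import re

RECAP_PATTERN = re.compile(r"[aeiou]{2,4}\+[1-5]*")

def matches_recap(text: str) -> bool:
    """Return True if the entire text matches the recap expression."""
    return RECAP_PATTERN.fullmatch(text) is not None

print(matches_recap("ae+"))         # True:  two vowels, a +, no digits
print(matches_recap("aeio+12345"))  # True:  four vowels, a +, digits 1-5
print(matches_recap("a+1"))         # False: only one vowel
print(matches_recap("aeiou+1"))     # False: five vowels is one too many
print(matches_recap("ae+6"))        # False: 6 is outside the 1-5 range
print(matches_recap("ae1"))         # False: the + sign is missing
```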

Regular expressions are a very powerful tool for validating the content of a control, because they can be very detailed and complex. Furthermore, you can use them for other advanced work, such as replacing or extracting the occurrences that match the expression, as we'll see in practice in the next chapter. Entire books have been written to teach how to use regular expressions, for example, "Sams Teach Yourself Regular Expressions in 24 Hours" (Sams Press, ISBN 0-672319-36-5).

You have to be aware, however, that the regular expression used to validate the e-mail address in the module only checks that the address is well formed. One could still subscribe with an address that does not exist, and nothing prevents that at this time. To limit this, you can at least improve the regular expression with other rules, such as checking that the domain name is at least two characters long and actually exists, that the extension is not a made-up one (the set of supported extensions is limited), and so on. There are yet other rules, and although we kept it simple in our example, you'll find that a complete regular expression could be much longer than ours. If regular expressions are not enough for you (and they are not if you want to check the existence of a domain, for example), you can use a CustomValidator and write your own function to validate a value.
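To give an idea of what such a custom validation function might look like, here is a rough sketch in Python; in the page itself the same logic would sit behind a CustomValidator. Everything in it is an assumption for illustration: the stricter pattern, the tiny SAMPLE_EXTENSIONS list (the real list is much longer), and the function names. The optional domain_resolves check needs network access and only tells us that the name resolves, not that it accepts mail.

```python
import re
import socket
from typing import Callable, Optional

# Illustrative sample only; the real list of extensions is much longer.
SAMPLE_EXTENSIONS = {"com", "net", "org", "it"}

# Name, '@', a domain of at least two characters, a period, an alphabetic extension.
STRICTER_PATTERN = re.compile(
    r"[\w.+-]+@([\w-]{2,}(?:\.[\w-]+)*)\.([A-Za-z]{2,})"
)

def domain_resolves(domain: str) -> bool:
    """Rough existence check: does the name resolve at all? Needs network access."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False

def is_plausible_email(
    address: str,
    domain_check: Optional[Callable[[str], bool]] = None,
) -> bool:
    """Apply the stricter pattern, the extension list and an optional domain check."""
    match = STRICTER_PATTERN.fullmatch(address)
    if match is None:
        return False
    name, extension = match.group(1), match.group(2).lower()
    if extension not in SAMPLE_EXTENSIONS:
        return False
    if domain_check is not None:
        return domain_check(f"{name}.{extension}")
    return True

print(is_plausible_email("user@example.com"))  # True
print(is_plausible_email("user@x.com"))        # False: domain shorter than two characters
print(is_plausible_email("user@example.zzz"))  # False: extension not in the sample list
```

Passing domain_check=domain_resolves adds the lookup. It is kept as a parameter so that the format rules can be exercised without touching the network.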

However, even with the most complete expression and other methods, you can't be 100% sure that the address exists: the user can write an address with a real domain name and a real extension, but with a made-up name before the @. When the messages are sent out, you won't get any error or exception at that time, because the SMTP server does its work without letting you know about the result. However, messages sent to non-existent addresses usually come back to the sender with an error message saying that the message couldn't be delivered because the address does not exist. These error messages are sent to the server's postmaster and then forwarded to the site's administrator.

At this point, when you get such a message, you can manually remove the address from the DB, or you could even write a program that parses the incoming messages to find the error messages, and automatically deletes the non-existent e-mail addresses by using the business classes of the MailingList module. This is beyond the scope of this book, of course, and it is usually only provided by professional modules that must be able to handle tens or hundreds of thousands of e-mails and automate any possible process. For most sites, though, our module is enough, and deleting the erroneous addresses from the DB once a month or so is not such a big issue (how often you should do this actually depends on the frequency of your newsletters).
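As a rough idea of what such a parsing program could look like, the sketch below (in Python, for illustration) pulls failed addresses out of a returned message. It assumes the bounce contains a delivery report with Final-Recipient and Action fields, as in the delivery status notification format of RFC 3464; many real bounces are less regular. The function name, the sample message and its addresses are all made up.

```python
import re
from typing import List

FINAL_RECIPIENT = re.compile(
    r"^Final-Recipient:\s*rfc822;\s*(\S+)", re.IGNORECASE | re.MULTILINE
)
ACTION_FAILED = re.compile(r"^Action:\s*failed\b", re.IGNORECASE | re.MULTILINE)

def failed_recipients(raw_message: str) -> List[str]:
    """Return the addresses that a bounce message reports as undeliverable."""
    addresses = []
    # Field groups in a delivery report are separated by blank lines.
    for block in re.split(r"\r?\n\s*\r?\n", raw_message):
        recipient = FINAL_RECIPIENT.search(block)
        if recipient and ACTION_FAILED.search(block):
            addresses.append(recipient.group(1).lower())
    return addresses

# Hypothetical bounce, trimmed to the parts the sketch looks at.
SAMPLE_BOUNCE = """\
From: MAILER-DAEMON@mail.example.com
Subject: Undelivered Mail Returned to Sender
Content-Type: multipart/report; report-type=delivery-status; boundary="b1"

--b1
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.example.com

Final-Recipient: rfc822; ghost@example.com
Action: failed
Status: 5.1.1

--b1--
"""

print(failed_recipients(SAMPLE_BOUNCE))  # ['ghost@example.com']
```

The returned addresses could then be handed to whatever removal routine the MailingList business classes provide.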

Copyright 2003-2007 JWG Solutions, All Rights Reserved.