Tuesday, August 29, 2006

More on Regular Expressions
Blog Entry #49

In the previous entry, we developed several regular expressions for vetting a user's response (or nonresponse) to the "Enter First Name" field of the Primer #29 Script. We turn now to the script's "Enter Zip Code (99999-9999)" field:

<form name="dataentry">
Enter Zip Code (99999-9999):<br>
<input type="text" name="zip" size="10"><p>
<input type="button" value="Submit" onclick="validZip(zip.value);"></form>

You may recall that the original validZip( ) function tested:
(a) if the user had entered 5 or 10 characters into the zip field; and
(b) if the user's first five inputted characters were numeric digits.
If we wanted to, we could again address these issues separately with two different regular expressions; however, it's more efficient to use a single regular expression that will match either a 99999 ZIP code or a 99999-9999 ZIP+4 code:

function validZip(zp) {
var zc = /^\d{5}$|^\d{5}-\d{4}$/;
if (zp.search(zc) == -1) {
window.alert("Please enter a proper ZIP code");
document.dataentry.zip.focus( ); } }

Let's look at the regular expression zc, which is really two regexp patterns rolled into one.
• \d matches any numeric digit and is equivalent to [0-9] (a hyphen can be used to span a number range as well as a letter range).
• {5} is a quantifier that matches \d exactly 5 times; ^\d{5}$ thus matches a string comprising 5 digits and 5 digits only. The opening brace { and the closing brace } act as metacharacters only when they delimit a quantifier, as they do here. The ^ and $ metacharacters were covered in the previous post.
• The vertical bar (|) is a metacharacter that serves as a logical OR operator (recall JavaScript's || logical OR operator, which briefly cropped up in Blog Entry #37); zc will match either the subpattern preceding the | or the subpattern following the |.
• ^\d{5}-\d{4}$ matches a string comprising 5 digits followed by a hyphen followed by another 4 digits.

In sum, the ^\d{5}$ subpattern will match a 99999 ZIP code and the ^\d{5}-\d{4}$ subpattern will match a 99999-9999 ZIP+4 code, and the | metacharacter between the subpatterns allows zc to match either ZIP code format.
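As a quick check of the summary above, here's a small sketch that runs zc against a handful of strings with search( ). The sample inputs are made up for illustration; only zc itself comes from the script.

```javascript
// zc as defined above: a 5-digit ZIP code OR a ZIP+4 code.
var zc = /^\d{5}$|^\d{5}-\d{4}$/;

// Hypothetical sample inputs, chosen only for illustration.
var samples = ["12345", "12345-6789", "1234", "12345-678", "abcde", "123456"];

for (var i = 0; i < samples.length; i++) {
  // search( ) returns the index of the first match, or -1 if there is none.
  console.log(samples[i] + " -> " + samples[i].search(zc));
}
```

The first two samples return 0 and the remaining four return -1.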

We then search( ) zp, the value (user input) of the "Enter Zip Code (99999-9999)" field, for an occurrence of a zc subpattern, and compare the return with -1 in the condition of the subsequent if statement:

if (zp.search(zc) == -1)

If in any way the user's input does not conform with either of zc's subpatterns, then the if condition returns true; a "Please enter a proper ZIP code" alert( ) message pops up and focus is returned to the zip field.

It should be clear from the above that the following code can be used to separately address the 'first five characters must be numeric' issue:

var firstfive = /^\d{5}/;
if (zp.search(firstfive) == -1) {
document.dataentry.zip.focus( ); }
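
One thing worth seeing for ourselves: firstfive has no $ anchor, so it only vets the first five characters and ignores whatever follows. A minimal sketch, with sample strings invented for the purpose:

```javascript
var firstfive = /^\d{5}/;

console.log("12345".search(firstfive));      // 0
console.log("12345-6789".search(firstfive)); // 0
console.log("12345xyz".search(firstfive));   // 0 (trailing characters are not checked)
console.log("1234a".search(firstfive));      // -1 (fifth character is not a digit)
```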

Can you write a regexp pattern that tests if a user's zip input consists of 5 or 10 not-necessarily-digit characters?
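
Here is one possible answer, offered as a sketch rather than the only solution. It relies on the dot (.) metacharacter, which matches any single character other than a line terminator; the variable name fiveOrTen is my own.

```javascript
// Exactly 5 characters OR exactly 10 characters, digits or otherwise.
var fiveOrTen = /^.{5}$|^.{10}$/;

console.log(fiveOrTen.test("abcde"));      // true (5 characters)
console.log(fiveOrTen.test("12345-6789")); // true (10 characters)
console.log(fiveOrTen.test("abcdef"));     // false (6 characters)
console.log(fiveOrTen.test(""));           // false (0 characters)
```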

The test( ) method

It's high time that we broke out of our search( ) method rut, wouldn't you say? An equally user-friendly method for comparing a regexp pattern with a target string is the test( ) method of the RegExp object:

regexp_name.test("some_string");
// returns true if regexp_name matches "some_string" and false if it doesn't

Here's code we can use to test( ) zc vs. zp:

if (!zc.test(zp)) {
window.alert("Please enter a proper ZIP code");
document.dataentry.zip.focus( ); }

We discussed the ! logical NOT operator in Blog Entry #47. If neither of zc's subpatterns matches zp, then zc.test(zp) returns false and thus the !zc.test(zp) condition of the if statement returns true; as before, a "Please enter a proper ZIP code" alert( ) message pops up and focus is returned to the zip field.
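
To confirm that the search( ) and test( ) approaches agree, here's a short side-by-side sketch. The helper names passesSearch and passesTest and the sample inputs are hypothetical, introduced only for this comparison.

```javascript
var zc = /^\d{5}$|^\d{5}-\d{4}$/;

// Hypothetical helpers wrapping the two approaches discussed above.
function passesSearch(zp) { return zp.search(zc) != -1; }
function passesTest(zp) { return zc.test(zp); }

var samples = ["12345", "12345-6789", "12345-", "ZIP"];

for (var i = 0; i < samples.length; i++) {
  // Both columns should show the same boolean for every sample.
  console.log(samples[i] + ": " + passesSearch(samples[i]) + " / " + passesTest(samples[i]));
}
```

Because zc carries no g flag, each test( ) call starts matching from the beginning of the string, so repeated calls give consistent results.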

As long as we're talking about regular expressions, let's look at the two HTML Goodies articles in which they briefly crop up.

Social Security number validation

The following code for validating a Social Security number (SSN) appears in HTML Goodies' "Validating Special Numbers" article:

<script language="JavaScript1.2"><!--
function regular(string) {
if (string.search(/^[0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9][0-9][0-9]$/) != -1)
return true;
else
return false; }
//-->
</script>
The user enters an SSN into the following field:
<input type="text" size="12" maxlength="12"

(The <script> tag contains an historical note of sorts; the first version of JavaScript with support for regular expressions was in fact JavaScript 1.2.)

The approach here is more convoluted than it needs to be. The call for the validating regular( ) function is set in the condition of an if statement assigned to an onChange event handler, which 'fires' after the user enters an SSN and then blurs the input field. Note that the regular( ) function call is preceded by the ! operator; consequently, if regular( ) returns false to the if statement, whose condition would then return true, then a 'Not Valid' alert( ) message pops up. (No commands execute if regular( ) returns true.)

When regular( ) is triggered, the user's input (this.value) is assigned to the identifier string (string is not a JavaScript reserved word, in case you were wondering); an if statement then uses the search( ) method to compare string with the following regexp pattern:

/^[0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9][0-9][0-9]$/

Equivalently and more simply, we can rewrite the pattern as:

var ssn = /^\d{3}-\d{2}-\d{4}$/;

Unless it is spanning a letter or number range inside of square brackets, a hyphen is not a regexp metacharacter and does not need to be escaped with a backslash.
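
The following sketch illustrates both halves of that statement: the escaped and unescaped SSN patterns behave identically, whereas inside square brackets the backslash does change what a hyphen means. The variable names and sample strings are my own.

```javascript
var escaped = /^[0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9][0-9][0-9]$/;
var plain = /^\d{3}-\d{2}-\d{4}$/;

// Hypothetical sample inputs, chosen only for illustration.
var samples = ["123-45-6789", "123456789", "123-45-678", "12a-45-6789"];

for (var i = 0; i < samples.length; i++) {
  // The two patterns should report the same result for each sample.
  console.log(samples[i] + ": " + escaped.test(samples[i]) + " / " + plain.test(samples[i]));
}

// Inside square brackets, an unescaped hyphen between two characters spans a range.
console.log(/^[a-c]$/.test("b"));  // true (b lies in the range a through c)
console.log(/^[a\-c]$/.test("b")); // false (the class holds only a, hyphen, and c)
console.log(/^[a\-c]$/.test("-")); // true
```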

If the user's input conforms to the 999-99-9999 SSN format, then string.search(ssn) will return 0, which is not equal (!=) to -1; the if condition returns true and regular( ) itself returns true. If the user's input deviates from the 999-99-9999 format, then string.search(ssn) returns -1, the if condition returns false, regular( ) returns false via its else statement, and the 'Not Valid' alert( ) pops up.
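
Here's a small sketch of those return values. Because ssn is anchored with ^, a match can only begin at index 0, so search( ) returns either 0 or -1. The regularCompact( ) function is a hypothetical variant of my own, showing that the comparison already yields the boolean that the if...else statement returns.

```javascript
var ssn = /^\d{3}-\d{2}-\d{4}$/;

console.log("123-45-6789".search(ssn)); // 0
console.log("123-45-678".search(ssn));  // -1

// Hypothetical compact variant: return the result of the comparison directly.
function regularCompact(str) {
  return str.search(ssn) != -1;
}

console.log(regularCompact("123-45-6789")); // true
console.log(regularCompact("123456789"));   // false
```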

Here's what I would have done:

<script type="text/javascript">
function regular(str) {
var ssn = /^\d{3}-\d{2}-\d{4}$/;
if (str.search(ssn) != -1) window.alert("Thank you."); }
</script>
<input onchange="regular(this.value);">