Tuesday, September 19, 2006
Channeling Allen Ludden
Blog Entry #51
In recent entries, we've addressed various aspects of the validation of text field input. We turn our attention in this post to a related and complementary topic: the validation of user input into password fields. In a sense, validating a password requires an approach opposite to that for validating a name, ZIP code, etc.: for a password, you deliberately want the user to enter something weird, i.e., as difficult to crack as possible.
There is no universal standard for a strong password. A Google search for "password requirements" generates almost 100,000 hits from all types of organizations, each having its own password policy. For this entry, I arbitrarily chose as a benchmark the "Minimum Password Complexity Standards" recommended by the University of California at Berkeley, according to which a password MUST:
(1) contain eight characters or more; and
(2) contain characters from two of the following three character classes:
(a) letters;
(b) numbers;
(c) all other printable ASCII characters (! @ # $ % ^ & * ( ) _ + | ~ - = \ ` { } [ ] : " ; ' < > ? , . /).
Now then: how do we ensure that a user's password conforms to these guidelines?
(We will not discuss Berkeley's "The password MUST NOT be..." guidelines, which are not validatable as far as I am aware.)
Solution #1
<script type="text/javascript">
function validpw( ) {
    var userpw = document.fpw.pw.value;
    // Requirement (1): at least eight characters
    if (userpw.length < 8) {
        window.alert("Your password must have at least 8 characters.");
        document.fpw.pw.focus( );
    }
    else {
        var lett = /[a-z]/i;        // (a) letters, either case
        var num = /\d/;             // (b) numbers
        var nonalphanum = /[_\W]/;  // (c) everything else, underscore included
        // Requirement (2): characters from at least two of the three classes
        if ( (lett.test(userpw) && num.test(userpw)) || (lett.test(userpw) && nonalphanum.test(userpw)) || (num.test(userpw) && nonalphanum.test(userpw)) )
            window.alert("Thank you.");
        else {
            window.alert("Please choose a broader range of characters for your password.");
            document.fpw.pw.focus( );
        }
    }
}
</script>
<form name="fpw">
Enter your password, please:
<input type="password" name="pw"><p>
<input type="button" value="Submit" onclick="validpw( );"> <input type="Reset">
</form>
Comments
In the validpw( ) function, the user's password input is assigned to the identifier userpw. Subsequently, the script uses a simple comparison (à la the Primer #29 Script) to address userpw's ≥8-character length requirement.
The lett, num, and nonalphanum regular expressions correlate with the three character classes listed above. In complement to \w (defined in the previous post), \W matches any character that is not a letter, a number, or an underscore. \W will match a space character, but I can't think of any reason why a password shouldn't contain spaces.
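To make the \W behavior concrete, here is a small sketch that can be pasted into any JavaScript console. The variable names and sample strings are illustrative only and are not part of the validpw( ) script:

```javascript
// \W on its own skips the underscore, because _ counts as a "word" character.
var nonWord = /\W/;
var nonalphanum = /[_\W]/;   // same pattern as in validpw( )

var underscoreAlone = nonWord.test("_");         // false: _ belongs to \w, not \W
var underscoreFixed = nonalphanum.test("_");     // true: the explicit _ closes the gap
var spaceMatches = nonalphanum.test(" ");        // true: a space is not a letter, digit, or _
var lettersDigits = nonalphanum.test("abc123");  // false: letters and digits only
```

This is why the third character class is written as [_\W] rather than plain \W.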
Finally, I addressed the 2-out-of-3 character class requirement by stringing together a series of regexp_name.test(userpw) commands with the && and || logical operators.
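As an alternative to spelling out all three pairings, the same 2-out-of-3 test can be sketched by counting how many classes are present. The countClasses( ) and meetsBerkeley( ) names below are hypothetical helpers introduced for illustration; they do not appear in the script above:

```javascript
// Hypothetical helper: count how many of the three character classes occur in pw.
function countClasses(pw) {
    var classes = [ /[a-z]/i, /\d/, /[_\W]/ ];   // letters, numbers, everything else
    var count = 0;
    for (var i = 0; i < classes.length; i++) {
        if (classes[i].test(pw)) count++;
    }
    return count;
}

// Hypothetical helper: true if pw satisfies both Berkeley requirements.
function meetsBerkeley(pw) {
    return pw.length >= 8 && countClasses(pw) >= 2;
}
```

A counting approach like this would scale more gracefully if a policy called for, say, 3 out of 4 classes, although for only three classes the explicit && and || chain is just as readable.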
Solution #2
Naturally, I wondered, "Is there any way we can represent the Berkeley standards with a single regular expression?" Indeed there is:
<script type="text/javascript">
function validpw2( ) {
    var userpw2 = document.fpw2.pw2.value;
    // One lookahead for the length, then three alternated pairs of lookaheads for the 2-out-of-3 rule
    var pw_regexp = /(?=^[\s\S]{8,}$)((?=[\s\S]*[a-z])(?=[\s\S]*\d)|(?=[\s\S]*[a-z])(?=[\s\S]*[_\W])|(?=[\s\S]*\d)(?=[\s\S]*[_\W]))/i;
    if (pw_regexp.test(userpw2)) window.alert("Thank you.");
    else {
        window.alert("Your password does not meet our length and/or character requirements.");
        document.fpw2.pw2.focus( );
    }
}
</script>
<form name="fpw2">
Enter your password, please:
<input type="password" name="pw2"><p>
<input type="button" value="Submit" onclick="validpw2( );"><input type="Reset">
</form>
The star of the script above is the regexp pattern pw_regexp, which makes extensive use of a "positive lookahead" construct having the following general syntax:
x(?=regexp_pattern)y
As explained by Regular-Expressions.info here and here, the browser:
(a) matches x if x is followed by a match of the regexp_pattern in the parentheses, but then
(b) discards the regexp_pattern part of the match, and
(c) returns to the dividing line between x and the character following x in attempting to match y.
A series of positive lookaheads
x(?=regexpA)(?=regexpB)(?=regexpC)(?=regexpD)...
thus allows us to compare a series of regexp patterns with the same (sub)string, because with each matching lookahead the browser will return to the dividing line between x and the character following x.
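The "look but don't consume" behavior described above can be seen in a short sketch. The patterns and sample strings here are made up for illustration and assume a browser (or other JavaScript engine) that supports lookaheads:

```javascript
// The lookahead checks for a following u but does not include it in the match.
var m = /q(?=u)/.exec("quit");
var matchedText = m[0];                   // "q", not "qu"
var noMatch = /q(?=u)/.exec("qatar");     // null: this q is not followed by u

// Both lookaheads start from the same spot (just after the q), and so does
// the ordinary pattern uit that follows them.
var chained = /q(?=u)(?=ui)uit/.test("quit");   // true

// Two independent conditions applied to the same string, in either order:
var digitAndLetter = /^(?=[\s\S]*\d)(?=[\s\S]*[a-z])/i;
var r1 = digitAndLetter.test("abc1");   // true
var r2 = digitAndLetter.test("1abc");   // true
var r3 = digitAndLetter.test("abcd");   // false: no digit anywhere
```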
Let's briefly look at the positive lookaheads that compose the pw_regexp pattern.
• (?=^[\s\S]{8,}$) matches any input of ≥8 characters; we learned in Blog Entry #48 that the [\s\S] pattern matches any single character. The {8,} quantifier format is discussed by Regular-Expressions.info in the "Limiting Repetition" section of this page. Nothing precedes (?=^[\s\S]{8,}$) in the pw_regexp pattern; consequently, assuming that the user's input userpw2 comprises at least 8 characters, the browser returns to the dividing line between the void to the left of userpw2 and the first character of userpw2 before comparing the rest of pw_regexp with userpw2.
• (?=[\s\S]*[a-z]) checks userpw2 for the presence of at least one letter character, lowercase or uppercase (note pw_regexp's i flag); it specifically matches a letter character preceded by zero or more characters (regardless of what they are).
• Similarly, (?=[\s\S]*\d) and (?=[\s\S]*[_\W]) check userpw2 for the presence of at least one digit and at least one nonalphanumeric character, respectively.
Like the test( ) commands of Solution #1 above, the (?=[\s\S]*[a-z]), (?=[\s\S]*\d), and (?=[\s\S]*[_\W]) lookaheads are combined and alternated so as to satisfy the 2-out-of-3 character class requirement.
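To see how the pieces fit together, the sketch below reassembles an equivalent pattern from named fragments with the RegExp( ) constructor. The fragment names (any, hasLetter, hasDigit, hasOther, longEnough, assembled) are hypothetical and chosen only for readability; note that backslashes must be doubled inside string literals:

```javascript
var any = "[\\s\\S]*";                        // zero or more of any character
var hasLetter = "(?=" + any + "[a-z])";       // at least one letter
var hasDigit = "(?=" + any + "\\d)";          // at least one digit
var hasOther = "(?=" + any + "[_\\W])";       // at least one nonalphanumeric character
var longEnough = "(?=^[\\s\\S]{8,}$)";        // eight or more characters in total

// Length check first, then any one of the three possible pairings.
var assembled = new RegExp(
    longEnough + "(" +
    hasLetter + hasDigit + "|" +
    hasLetter + hasOther + "|" +
    hasDigit + hasOther + ")", "i");

var t1 = assembled.test("password");     // false: letters only
var t2 = assembled.test("password1");    // true: letters and a digit
var t3 = assembled.test("pass1");        // false: fewer than 8 characters
var t4 = assembled.test("1234567!");     // true: digits and a nonalphanumeric character
```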
On my computer, positive lookaheads are supported by Netscape 7.02 but not by MSIE 5.1.6, which promptly throws an "Unexpected quantifier" compilation error when it hits the ? character in the first lookahead.
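Given that uneven support, one possible precaution is to test for lookaheads before relying on them. The supportsLookahead( ) function below is a hypothetical sketch, not something I tested on the browsers named above, and it assumes the browser understands try...catch in the first place:

```javascript
// Hypothetical feature test: returns true if the regexp engine accepts (?=...).
function supportsLookahead() {
    try {
        // Building the pattern with the RegExp constructor means an unsupported
        // construct raises its error at run time, where try...catch can intercept
        // it, rather than when the script itself is being compiled.
        return new RegExp("q(?=u)").test("qu");
    } catch (e) {
        return false;
    }
}
```

A page could then fall back on the separate test( ) commands of Solution #1 whenever supportsLookahead( ) returns false.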
Giving credit where credit is due, my pw_regexp pattern borrows from some of the password patterns posted at RegExLib.com - here is a typical example:
var J_Samuel = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).{4,8}$/;
As shown above, these patterns contain lookaheads using a period to represent a generic character; my tests of the patterns at the RegExLib.com site were successful but I couldn't get them to work in a SimpleText file on my hard disk until I substituted [\s\S]'s for the periods. (FYI: the use of 's for forming plurals in isolated cases is discussed here.)
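One well-established difference between the period and [\s\S] in JavaScript is that the period does not match a newline character, whereas [\s\S] matches everything. The sketch below illustrates that difference with made-up sample strings; whether it accounts for the local-file behavior described above is not something this example establishes:

```javascript
var dotOnLetter = /^.$/.test("a");            // true
var dotOnNewline = /^.$/.test("\n");          // false: . stops at line breaks
var classOnNewline = /^[\s\S]$/.test("\n");   // true: [\s\S] matches any character

// Inside a lookahead, .* therefore cannot reach past a line break:
var dotLookahead = /^(?=.*\d)/.test("ab\n1");          // false
var classLookahead = /^(?=[\s\S]*\d)/.test("ab\n1");   // true
```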
As a final aside, I sent to FirstGov.gov, "The U.S. Government's Official Web Portal," an email asking, "Has the federal government ever issued official recommendations for choosing strong passwords for computer accounts?" I was directed to this page hosted by the United States Computer Emergency Readiness Team (but originating from Carnegie Mellon University).
OK, that'll do it for our discussion of data validation, at least for the time being. In the next post, we'll return to the HTML Goodies JavaScript Primers series and its final Primer #30.
reptile7