reptile7's JavaScript blog
Sunday, August 20, 2017
 
Everything Counts Regardless of Amount
Blog Entry #378

BurnBlade's Digital Clock script is preceded by a Number of Letters script, which was authored by Greg Bland. The Number of Letters script
(a) counts the characters of an <input type="text"> box's value and then
(b) displays the count in an alert( ) box.

The Number of Letters script's numbletters.txt code page at Java Goodies is gone, although you can still get the code at the corresponding JavaScript Goodies page. The aforelinked numbletters.html script page features a demo that does what it is supposed to do but also includes some strangeness that we will sort out below.

HTML interface

At the numbletters.html page the user is greeted by a theform form that holds two controls:
(1) a words text box and
(2) a push button.

<body>
<form name="theform">
<input type="text" name = "words" value = "The most letters this can do is 255" size = "40"></textarea>
<input type="button" value="Check"onclick="check( );">
</form>


• SGML-wise - per the STag, Attribute, and Eq productions in the XML 1.0 specification - the spaces flanking the = character for the <input type="text">'s name, value, and size attributes are legit but the absence of a space between the <input type="button">'s value and onclick attributes is not, although the latter doesn't cause any problems with the browsers on my computer.

The user does or does not enter some new text into the words box and then clicks the button, thereby triggering a check( ) function, which we'll get to shortly.

The "Number of Letters" description notwithstanding, the user's input can comprise alphabetic letters, numerals, symbols, and/or spaces.

The words box's initial value, The most letters this can do is 255, implies that the user can input up to 255 characters but not more than that. I don't know where the 255 number comes from. In theory, the default words maxlength is an unlimited number; in practice, inputting millions of characters may cause your browser to hang.

As you can see, there's a stray </textarea> tag up there; if you are going to enter a lot of stuff, then a textarea field would indeed be a better choice.

Check it

The check( ) function begins unproductively:

<script language="javascript">
/* ...Authorship/copyright notice... */
function check( ) {
    var l = document.theform.words.length;


The HTML DOM's HTMLInputElement interface doesn't have a length attribute, so the document.theform.words.length expression evaluates to undefined and l is accordingly set to undefined; none of check( )'s subsequent operations calls on l, however.

Next we have the one part of check( ) that makes sense:

window.alert("You typed in " + document.theform.words.value.length + " letter(s).");

String objects do have a character-counting length property, and the document.theform.words.value string's length gives us the number we want, which with a bit of additional text is outputted to the user. If you don't touch the words field and click the button, you'll get the The most letters this can do is 255 length, 35; if you clear the The most letters this can do is 255 string and don't replace it with anything and click the button, you'll get 0.
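The difference between the two length lookups can be seen outside of a browser with a small sketch. The fakeField object below is a hypothetical stand-in for the words field (it is not part of Greg's script); it assumes only that the field exposes a value string and no length property of its own.

```javascript
// Hypothetical stand-in for document.theform.words:
// a plain object that has a value property but no length property.
var fakeField = { value: "The most letters this can do is 255" };

// As with an HTMLInputElement, there is no length property to read here,
// so the lookup yields undefined rather than a character count.
var l = fakeField.length;

// The value is a string, and strings do have a length property.
var defaultCount = fakeField.value.length;

// Clearing the field leaves an empty string, whose length is 0.
fakeField.value = "";
var emptyCount = fakeField.value.length;

console.log(l, defaultCount, emptyCount); // undefined 35 0
```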

The check( ) function concludes with two purposeless conditionals

if (document.theform.words.value.length != 1) {
    window.alert("Invalid number of letters, try 1 letter."); }
if (document.theform.words.value.length == 1) {
    window.alert("You learn fast!"); } }
</script>


that output messages if the user's input is not just one character and is one character, respectively. (Perhaps Greg's 'purpose' was to annoy the user, but we really shouldn't be doing that, should we?)

Keep the document.theform.words.value.length reading, chuck the rest.
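A pared-down check( ) might look something like the following sketch. The countCharacters( ) and buildMessage( ) helpers are my own hypothetical names, and I've taken the liberty of changing "letter(s)" to "character(s)" in the message; the check( ) function itself assumes the theform/words markup shown earlier and is only meant to be called in a browser.

```javascript
// Hypothetical helper: returns the number of characters in a string.
function countCharacters(text) {
    return text.length;
}

// Hypothetical helper: builds the message shown to the user.
// "character(s)" replaces "letter(s)" because numerals, symbols,
// and spaces are counted too.
function buildMessage(text) {
    return "You typed in " + countCharacters(text) + " character(s).";
}

// Trimmed-down check( ): the unused l variable and the two
// one-character conditionals are gone; only the useful alert( ) remains.
function check() {
    window.alert(buildMessage(document.theform.words.value));
}
```

Separating the counting from the alert( ) call also makes it possible to try out the helpers without a form.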

wc it

With some help from our good friend Mr. Regular Expression, we can easily adapt the script to counting the words and line breaks in the input à la the Unix wc command.

Words

If we broadly define a "word" as a string of characters containing at least one vowel and no white space, then we can get the number of words in the words value via the following code:

var re_spaces = /\s+/;
var re_vowel = /[aeiouy]/i;
var word_count = 0;
var input_words = document.theform.words.value.split(re_spaces);
for (var i = 0; i < input_words.length; i++)
    if (re_vowel.test(input_words[i])) word_count++;
window.alert("Number of words: " + word_count);


The re_spaces regexp pattern matches one or more white space characters; the re_vowel regexp pattern matches a single ASCII vowel (a, e, i, o, u, or y) in either case. We split( ) the words value at each re_spaces separation to give an array of input_words, which are then iteratively test( )ed against the re_vowel pattern; each positive (true) test increments a word_count.
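To see how this definition plays out on a few inputs, the same logic can be wrapped in a function that takes a string instead of reading the form. The countVowelWords( ) name is hypothetical; the body is the code above with the alert( ) swapped for a return.

```javascript
// Counts the white space-separated chunks that contain at least one ASCII vowel.
function countVowelWords(text) {
    var re_spaces = /\s+/;
    var re_vowel = /[aeiouy]/i;
    var word_count = 0;
    var input_words = text.split(re_spaces);
    for (var i = 0; i < input_words.length; i++) {
        if (re_vowel.test(input_words[i])) word_count++;
    }
    return word_count;
}

console.log(countVowelWords("The most letters this can do is 255")); // 7: "255" has no vowel
console.log(countVowelWords("  rhythm   $$$  "));                    // 1: only "rhythm", via its y
console.log(countVowelWords(""));                                    // 0
```

Leading or trailing white space gives split( ) some empty strings to return, but those fail the vowel test and so do not inflate the count.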

Line breaks

Getting the number of line breaks is even easier:

var re_endofline = /(\r\n|\n|\r)/g;
var linebreak_array = document.theform.words.value.match(re_endofline);
var linebreak_count = linebreak_array ? linebreak_array.length : 0;
window.alert("Number of line breaks: " + linebreak_count);


The re_endofline regexp pattern matches three types of line endings:
(1) \r\n, for the Windows platform;
(2) \n, for Unix-based platforms; and
(3) \r, for the 'classic' (pre-OS X) Macintosh platform.
The pattern's g flag enables us to match each occurrence of an ending throughout the input; we can retrieve those endings as an array via a match( ) operation.

If there are no line breaks in the input, then the linebreak_array is null, in which case the linebreak_count is set to 0.
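Here is the same logic as a function that can be fed test strings directly; the countLineBreaks( ) name is hypothetical. Note that a \r\n pair counts as one line break, not two, because the \r\n alternative is tried before the lone \n and \r alternatives.

```javascript
// Counts \r\n, \n, and \r line endings in a string.
function countLineBreaks(text) {
    var re_endofline = /(\r\n|\n|\r)/g;
    var linebreak_array = text.match(re_endofline);
    // match( ) returns null when nothing matches, hence the fallback to 0.
    return linebreak_array ? linebreak_array.length : 0;
}

console.log(countLineBreaks("one\r\ntwo\nthree\rfour")); // 3
console.log(countLineBreaks("a\n\nb"));                  // 2
console.log(countLineBreaks("no breaks here"));          // 0
```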

Everything but the white space sink

JavaScript's \w regexp character class covers only ASCII letters, digits, and the underscore, and the re_vowel pattern above likewise matches only ASCII vowels: neither treats non-ASCII vowels (e.g., å, ê, ì) or symbols as word characters. The wc command, on the other hand, counts any run of non-white space characters as a word. So if you want to be able to pick up a word like für (maybe you've just learned how to play "Für Elise" on the piano) and you don't mind picking up 'words' like 255 and $$$ (wc would see them as words), then you can do that with:

var re_nonspaces = /\S+/g; /* \S matches any non-white space character. */
var word_array = document.theform.words.value.match(re_nonspaces);
var word_count = word_array ? word_array.length : 0;
window.alert("Number of words: " + word_count);
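The two word definitions can be compared side by side on the same input. In the sketch below the function names are hypothetical, and the ü is written as the \u00fc escape only to sidestep any file-encoding issues.

```javascript
// Vowel-based definition: chunks containing at least one ASCII vowel.
function countVowelWords(text) {
    var input_words = text.split(/\s+/);
    var word_count = 0;
    for (var i = 0; i < input_words.length; i++) {
        if (/[aeiouy]/i.test(input_words[i])) word_count++;
    }
    return word_count;
}

// wc-style definition: any run of non-white space characters.
function countNonSpaceRuns(text) {
    var word_array = text.match(/\S+/g);
    return word_array ? word_array.length : 0;
}

var sample = "F\u00fcr Elise 255 $$$";
console.log(countVowelWords(sample));   // 1: only "Elise" has an ASCII vowel
console.log(countNonSpaceRuns(sample)); // 4: every chunk counts
console.log(countNonSpaceRuns("   "));  // 0: white space only
```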


Demo

Time for a demo, eh? Enter whatever you like into the textarea field below and then click the button to get the number of line breaks, words, and characters (including white space characters) in your input. A button is provided for your convenience.
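The demo's own code isn't reproduced in this post, but the three counts could be gathered along the following lines. This is only a sketch: the wc( ) function name and the property names of its return value are hypothetical, and it assumes the run-of-non-white-space definition of a word.

```javascript
// Returns the line break, word, and character counts for a string.
function wc(text) {
    var linebreak_array = text.match(/(\r\n|\n|\r)/g);
    var word_array = text.match(/\S+/g);
    return {
        linebreaks: linebreak_array ? linebreak_array.length : 0,
        words: word_array ? word_array.length : 0,
        characters: text.length // includes white space characters
    };
}

var counts = wc("Everything counts\nin large amounts\n");
console.log(counts.linebreaks, counts.words, counts.characters); // 2 5 35
```

In a browser, the text argument would come from the textarea's value, and the counts could be strung together for an alert( ) message.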




Check the source of the current page for the full coding.
We'll take up the CCC sector's Money Conversion Script in the following entry.
