How to Use a JavaScript Query String Parser - Page 2 | WebReference


A query string represents an associative array, or an "object" in JavaScript:
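The example that originally followed is not reproduced here. As a rough illustration only, assuming a hypothetical query string with the made-up field names name and lang, the correspondence might look like this:

```javascript
// Hypothetical query string; the field names are illustrative only.
const queryString = "name=Joseph+Myers&lang=en";

// The same data written as a JavaScript object (an associative array).
const fields = {
  name: "Joseph Myers", // "+" stands for a space in URL-encoded data
  lang: "en"
};

console.log(fields.name);    // access by property name
console.log(fields["lang"]); // or by string key
```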

This can be a bit more complicated. Remember repeated roots of polynomials from College Algebra? Form fields and query-string fields can also be repeated.
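To see why repetition complicates things, here is a small sketch that assumes a hypothetical field named color appearing twice. A naive one-value-per-name object silently drops one of the values:

```javascript
// Hypothetical query string in which the field "color" appears twice.
const query = "color=red&color=blue";

// Naive approach: store one plain value per field name.
const naive = {};
for (const piece of query.split("&")) {
  const [name, value] = piece.split("=");
  naive[name] = value; // the second "color" overwrites the first
}

console.log(naive.color); // only one of the two values survives
```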

Next, one has to decide what to do with empty fields that don't have an equals sign. Some Web developers use these fields to indicate a boolean value. No browser that I know of generates them when submitting a form, but if you want to implement them in your own Web site's links, that's fine.

Some people who create their own links don't want to write &amp;, which is the HTML entity for an ampersand, so they substitute a semicolon like this:
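The original example link is not reproduced here. As a sketch, assuming a hypothetical query string with made-up field names, a semicolon-separated link and one way of accepting both separators might look like this:

```javascript
// Hypothetical semicolon-separated query string (non-standard).
const query = "name=Joseph+Myers;secure=1";

// Splitting on either "&" or ";" accepts both styles.
// Assumption: a literal semicolon inside a value arrives encoded as %3B.
const pieces = query.split(/[&;]/);

console.log(pieces);
```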

It's easier to type, but it's not standard. Anything that isn't standard can have undefined behavior depending on whether it's your own Web site responding as you developed it to respond, or someone else's.

Examples like these two, which don't conform to the query-string format, are difficult to work with. There isn't one right decision on how to handle them. For example, if we decide to equate secure with secure=1, then we can't tell whether someone typed "1" into a box named secure or clicked the button on our form that said "use secure processing mode." Situations can come up where bad results follow from the safe-seeming assumption that those would be equivalent.
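The ambiguity can be made concrete with a short sketch. The helper name splitField is hypothetical, and representing a missing equals sign as null is just one possible choice:

```javascript
// Hypothetical helper: split one field, remembering whether "=" was present.
function splitField(piece) {
  const eq = piece.indexOf("=");
  return eq === -1
    ? { name: piece, value: null } // no equals sign at all
    : { name: piece.slice(0, eq), value: piece.slice(eq + 1) };
}

const flag = splitField("secure");    // value is null
const typed = splitField("secure=1"); // value is the string "1"

// Equating a bare field with "=1" erases the difference between the two.
const equated = flag.value === null ? "1" : flag.value;

console.log(flag.value, typed.value, equated === typed.value);
```

Keeping null separate from "1" lets the calling script decide what a bare field means, instead of baking that decision into the parser.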

It's true that browsers generate standardized query strings for submitting form data represented as name1=value1&name2=value2&name3=value3, etc., but as far as HTTP is concerned, a query created for your own Web site applications doesn't have to conform to the format of a standardized query string.

So our query-string parser, and the scripts that use it, must expect an associative array where some fields may be repeated, like this:
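The original listing is not reproduced here. A sketch of such a structure, assuming a hypothetical query string name=Joseph+Myers&color=red&color=blue (the field name color is made up), might be:

```javascript
// Hypothetical parsed result for "name=Joseph+Myers&color=red&color=blue".
const fields = {
  name: ["Joseph Myers"], // single value, still wrapped in an array
  color: ["red", "blue"]  // repeated field keeps every value, in order
};

// Calling code can then treat every field the same way.
for (const name of Object.keys(fields)) {
  console.log(name + ": " + fields[name].join(", "));
}
```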

URL Encoding

You may have noticed the plus sign in Joseph+Myers. This plus sign is part of the URL Encoding process which is necessary for arbitrary data to be encoded in a URL. If data has been URL encoded, it simply means that most byte values are replaced by a hexadecimal %xx representation. However, a space occurs so often that it's replaced by a single plus sign rather than %20 in order to compress the amount of data being transferred. (Of course, a plus sign and a percent sign have to be encoded (%2B and %25, respectively) to avoid conflicting with plus signs and percent signs which mean something else in the URL-encoded data format.)
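The encoding rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the article's code: the helper names formEncode and formDecode are made up, and the standard functions encodeURIComponent() and decodeURIComponent() are used to do the %xx work.

```javascript
// Hypothetical helpers illustrating form-style URL encoding.
function formEncode(text) {
  // encodeURIComponent() writes a space as %20; forms use "+" instead.
  return encodeURIComponent(text).replace(/%20/g, "+");
}

function formDecode(text) {
  // Turn "+" back into a space first, then decode the %xx sequences.
  // A literal plus sign arrives as %2B, so it is not affected by the replace.
  return decodeURIComponent(text.replace(/\+/g, " "));
}

console.log(formEncode("Joseph Myers")); // space becomes "+"
console.log(formEncode("1+1=100%"));     // "+", "=" and "%" become %xx codes
console.log(formDecode("Joseph+Myers"));
```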

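URL encoding is so important that for the last ten years all versions of JavaScript have provided the functions escape() and unescape() to encode strings into, and decode them from, URL-encoded data format. Be aware that unescape() does not turn plus signs back into spaces, so that step has to be done separately. Newer versions of JavaScript also provide encodeURIComponent() and decodeURIComponent(), which treat non-ASCII text as UTF-8 and are now generally preferred over escape() and unescape().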

Notice that we made both fields into a list or subarray for consistency. No Web developer should ever become so comfortable that they forget about users who might submit multiple values for a form field.

Also, we want to avoid one of the worst mistakes that I've seen in the traditional DOM (Document Object Model) that browsers expose to JavaScript. In that model you would access a form element's value by saying document.formname.elementname.value if the form contained only one instance of elementname, but you had to use the reference document.formname.elementname[0].value if the form contained more than one.

In other words, a form element can arbitrarily change its data type in the document object model from an array-type object to a non-array-type object, depending on the contents of the HTML. The problem is that the two methods of accessing a form element are incompatible with each other. Since I write scripts which are used for more than one form on more than one Web site, most of my form programming has to include several extra lines to work around this quirk. (Strictly speaking, the DOM is defined separately from the ECMAScript language standard, ECMA-262, so the flaw lies in the DOM rather than in the language itself.)
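A workaround of the kind described might look like the sketch below. The helper name elementList is hypothetical, and it relies on the assumption that a single element node has a tagName property while a list of same-named elements does not. Plain objects stand in for DOM nodes so that the snippet runs outside a browser.

```javascript
// Hypothetical helper: always hand back an array of elements.
function elementList(ref) {
  if (ref == null) return [];    // no element with that name
  if (ref.tagName) return [ref]; // a single element node
  return Array.from(ref);        // an array-like list of elements
}

// Stand-ins for what document.formname.elementname might return.
const single = { tagName: "INPUT", value: "a" };
const several = {
  0: { tagName: "INPUT", value: "a" },
  1: { tagName: "INPUT", value: "b" },
  length: 2
};

// Both cases can now be read with the same [0].value syntax.
console.log(elementList(single)[0].value);
console.log(elementList(several)[1].value);
```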

Even though it's more complicated for simple purposes, it's better to remind ourselves that all fields received from a query string might be multiple-valued. Sometimes they have a single value, but inherently the data type is an array.

Code

The code is short and sweet, like this:
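The original listing is not reproduced here. What follows is only a sketch of how such a parser might be written, not the author's code. The function names parseQuery and decodePiece are made up, decodeURIComponent() is used where the article mentions unescape(), and a field without an equals sign is assumed to get an empty-string value.

```javascript
// Sketch of a query-string parser; every field maps to an array of values.
function parseQuery(query) {
  const fields = Object.create(null); // no inherited keys such as "toString"
  // Drop a leading "?" so that location.search can be passed in directly.
  const text = query.replace(/^\?/, "");
  for (const piece of text.split("&")) {
    if (piece === "") continue; // tolerate "" and "a=1&&b=2"
    const eq = piece.indexOf("=");
    // Assumption: a field with no equals sign gets an empty-string value.
    const name = decodePiece(eq === -1 ? piece : piece.slice(0, eq));
    const value = decodePiece(eq === -1 ? "" : piece.slice(eq + 1));
    if (!fields[name]) fields[name] = [];
    fields[name].push(value);
  }
  return fields;
}

// Turn "+" into a space, then decode %xx sequences.
// Note: decodeURIComponent() throws a URIError on malformed input.
function decodePiece(text) {
  return decodeURIComponent(text.replace(/\+/g, " "));
}

const result = parseQuery("?name=Joseph+Myers&color=red&color=blue");
console.log(result.name[0]); // first (and only) value of "name"
console.log(result.color);   // both values of "color"
```

Because every field is an array, a script reads result.name[0] whether or not the field was repeated, and it never has to test the data type first.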

The original JavaScript query string parameter parser code is also available for download.

Note: if you print this script out as a here document in a Perl script, you need two backslashes before the + sign instead of one.

