Extreme HTML Optimization: URL Abbreviation
URL Abbreviation
One of the most effective ways to shrink your page is URL abbreviation with Apache's mod_rewrite module. First seen on Yahoo, the busiest page on the Web, URL abbreviation substitutes a short redirect URL (like "r/pg") for a longer one (like "programming/"). The technique is especially effective on front pages, which typically carry a lot of links. We saved 5-6K on our front page using abbreviated URLs.
To set up redirects, first have your IT department install mod_rewrite on your Apache server. They'll need to edit one of your server config files. The following srm.conf directives tell Apache where to find your rewrite map and define the rewrite rule:
RewriteMap abbr dbm:/apache/abbr_webref
RewriteRule ^/r/([^/]*)/?(.*) ${abbr:$1}$2 [redirect=permanent,last]
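To make the rule's behavior concrete, here is a minimal Python sketch of the substitution it performs. It is an illustration only, not how Apache implements it: the ABBR dictionary is a hypothetical in-memory stand-in for the dbm map, filled with a few entries from our redirects, and expand() is a made-up helper name.

```python
import re

# Hypothetical stand-in for the abbr map; the real lookup happens inside Apache.
ABBR = {
    "pg": "programming/",
    "d": "dhtml/",
    "dc": "dhtml/column",
}

# Same pattern as the RewriteRule: the key is everything after /r/ up to the
# next slash, and whatever follows that optional slash is kept as the remainder.
RULE = re.compile(r"^/r/([^/]*)/?(.*)")

def expand(path):
    """Return the redirect target for an abbreviated path, or None if the rule does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    key, rest = m.group(1), m.group(2)
    # ${abbr:$1}$2 : look up the key, then append the remainder.
    # A key missing from the map expands to an empty string.
    return ABBR.get(key, "") + rest
```

With these entries, expand("/r/pg") gives "programming/" and expand("/r/dc/48") gives "dhtml/column48", while a path outside /r/ is left alone.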
Next, enter each abbreviation and its target URL, separated by a tab, in the text file behind the above-referenced rewrite map ("/apache/abbr_webref.txt" in our case). The following is a sample from our current redirects:
b dlab/
d dhtml/
g graphics/
h html/
p perl/
x xml/
3c 3d/lesson
dd dhtml/dynomat/
ddh dhtml/dynomat/hiermenus3/
dc dhtml/column
gc graphics/column
...
i2 index2.html
au authoring/
in internet/
iv interviews/
mm multimedia/
pg programming/
pr promotion/
th tools/html/
tj tools/javascript/
tl tools/
tb tools/browser/
tbj tools/browser/javascript.html
hl headlines/
hn headlines/nh/
s services/
sd services/dns/
sf https://forums.webdeveloper.com/
ss scripts/
sg services/graphics/
sr services/reference/
...
ab about.html
ns new/submit.html
nc new/contest.html
i https://www.internet.com/
ic https://www.internet.com/corporate/
...
is https://www.internet.com/sections
isa https://www.internet.com/sections/asp.html
isc https://www.internet.com/sections/careers.html
isw https://www.internet.com/sections/webdev.html
isd https://www.internet.com/sections/downloads.html
isi https://www.internet.com/sections/international.html
isx https://www.internet.com/sections/linux.html
...
iswn https://www.internet.com/sections/win.html
iswl https://www.internet.com/sections/wireless.html
en https://e-newsletters.internet.com
enm https://e-newsletters.internet.com/mailinglists.html
icl https://www.internet.com/corporate/legal.html
icp https://www.internet.com/corporate/privacy/privacypolicy.html
ert https://www.earthweb.com/
fkt https://www.flashkit.com/
...
So to link to our privacy policy, all I have to do now is type <A HREF="/r/icp">Privacy Policy</A>, saving beaucoup bytes. Notice that I use shorter abbreviations/redirects for the more frequently used URLs, like our experts ("r/d" = "/dhtml/"). I save even more space with the tutorial abbrevs, which automatically append the column number after the fragment URL ("r/dc/48" = "/dhtml/column48/"). Major directories are two characters, and start with the same letter when possible ("s" for services, "h" for headlines etc.). Internet.com links start with an "i", and other sites all have three-letter abbreviations for consistency.
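If you want a rough idea of what the scheme buys you, the following Python sketch picks the shortest /r/ form for each link and totals the bytes saved. It is only an estimate under stated assumptions: ABBR is a hypothetical dictionary holding a few entries from the list above, and abbreviate() and total_saved() are illustrative helper names, not part of any tool we use.

```python
# Hypothetical subset of the redirect list, keyed by abbreviation.
ABBR = {
    "d": "dhtml/",
    "dc": "dhtml/column",
    "icp": "https://www.internet.com/corporate/privacy/privacypolicy.html",
}

def abbreviate(href, abbr):
    """Return the shortest /r/ form of href, or href itself if nothing is shorter."""
    # Map values for local pages carry no leading slash, so drop it before matching.
    target = href[1:] if href.startswith("/") else href
    best = href
    for key, value in abbr.items():
        if target.startswith(value):
            rest = target[len(value):]
            # Whatever follows the matched prefix rides along after the key.
            short = "/r/" + key + ("/" + rest if rest else "")
            if len(short) < len(best):
                best = short
    return best

def total_saved(hrefs, abbr):
    """Sum the bytes saved across all links on a page."""
    return sum(len(h) - len(abbreviate(h, abbr)) for h in hrefs)
```

Here abbreviate("/dhtml/column48/", ABBR) picks "/r/dc/48/" over the longer "/r/d/column48/", which is the reason the column abbreviations exist alongside the directory ones.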
To add more abbrevs you just edit the rewritemap file, and run something like:
"create_dbm abbr_webref abbr_webref.txt"
in the "/www/misc/redir/" directory. Remember to use TABS to separate the two fields, and test before deploying. Yahoo uses redirects like this on their home page to great effect (load time is nearly instantaneous). This technique alone saves us 5-6K off the 24K hand-optimized home page.
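Since a stray space instead of a tab is easy to miss, a quick check of the text file before rebuilding the map can help. The Python sketch below is one possible way to do it; check_map() is a hypothetical helper, and the rules it enforces (exactly two tab-separated fields, no slash in the key, no duplicate keys) are assumptions drawn from the format described above, not documented behavior of create_dbm.

```python
def check_map(text):
    """Return a list of (line_number, message) problems found in the map text."""
    problems = []
    seen = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2 or not fields[0] or not fields[1]:
            problems.append((number, "expected exactly two tab-separated fields"))
            continue
        key = fields[0]
        if "/" in key:
            # The rewrite rule captures the key up to the first slash only.
            problems.append((number, "key contains a slash and can never match"))
        if key in seen:
            problems.append((number, "duplicate key, first seen on line %d" % seen[key]))
        else:
            seen[key] = number
    return problems
```

An empty result means the file passed these checks; it says nothing about whether the targets themselves exist, so a click-through test is still worthwhile.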
Quotes
Quotes around some attribute values are optional in HTML 4.01, and can be safely omitted from your HTML. Note that XHTML requires all attribute values to be quoted and complete (checked="checked" etc.). In HTML 4.01 a value must be quoted if it contains any character other than letters (A-Za-z), digits, hyphens, periods, underscores, and colons. So you can do this:
<IMG SRC="t.gif" WIDTH=1 HEIGHT=1>
but not this:
<TABLE WIDTH=100%>
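The quoting rule is simple enough to check mechanically. The Python sketch below is a minimal illustration with a hypothetical helper name, needs_quotes(); its character set follows the HTML 4.01 specification, which lists letters, digits, hyphens, periods, underscores, and colons as safe in unquoted values.

```python
import re

# Characters HTML 4.01 allows in an unquoted attribute value.
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")

def needs_quotes(value):
    """Return True if the attribute value must be quoted under HTML 4.01."""
    # An empty value, or one with any other character, has to be quoted.
    return _UNQUOTED_OK.fullmatch(value) is None
```

By this test "1" and "t.gif" can go bare, while "100%" cannot because of the percent sign. A URL such as "r/pg" also fails because of the slash.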
While not technically valid HTML, quotes can be omitted from A HREFs like this:
<a href=r/pg></a>
The links still work in every browser we tried, and we actually got the front page down to the 13K range using this technique. However, XHTML requires all attribute values, URLs included, to be quoted, so we recently switched to quoting URLs to have a valid page (sans the ad code's ampersands).
Revised: Mar. 19, 2001