HTML 5 is introducing several new features to the web, such as multi-threaded JavaScript, cross-document messaging and local storage. If you are looking for an overview of what HTML 5 promises, visit HTML 5. Today, we will look more closely at the offline application caching feature of HTML 5.
Offline Application Caching
All browsers have some kind of caching mechanism in place, but to be honest, they don't always work. You browse through a site on your laptop and then shut the laptop. After a while, you open up your laptop and click the Back button in the browser, hoping to see the page you had open before. However, as you are not connected to the internet and the browser didn't cache the page properly, you are unable to view that page. You then click the Forward button, thinking that at least that page will load, but it doesn't. You need to reconnect to the internet to be able to view the pages.
Until HTML 4, the only workaround was for the user to save each page individually. HTML 5, thankfully, provides a smarter solution. While building the site, the developer can specify the files that the browser should cache. In fact, on each page, you can specify which documents should be cached. So, even if you refresh the page when you are offline, the page will still load correctly. This sort of caching has several advantages.
- Offline browsing: As the name indicates, the user will be able to browse through the site even when he is offline.
- Speed: Files that are cached locally will load much faster. Usually style sheets are shared across all pages of a website. The first time you load a page from a website, it will take some time to download the style sheet, but when you click on other pages, the browser won't need to download the file again.
- Reduced load on server: Every time you load a page that has some cached elements, the browser will poll the server to check if the cached file has been updated; if it hasn't, then it won't download it. By doing so, the load on the server is considerably reduced.
How It Works
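Offline caching is controlled through the manifest attribute on the html element. The attribute takes a URI to the manifest, which contains the rules for caching. For example, <html manifest="manifest.cache"> points the page at a manifest file named manifest.cache in the same directory.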
This is what the manifest.cache file typically looks like:
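As a minimal sketch, the JavaScript below embeds a hypothetical manifest as a string and splits it into its sections. Only masthead.png and /api are taken from this article; the other file names, the sampleManifest constant and the parseManifest helper are assumptions made for illustration, and the parser handles only the basic syntax.

```javascript
// Hypothetical manifest: file names other than masthead.png and /api are
// illustrative assumptions, not the article's original listing.
const sampleManifest = `CACHE MANIFEST
# version 1

CACHE:
index.html
style.css
masthead.png

NETWORK:
login.php
/api

FALLBACK:
images/dynamic.php images/static.png
`;

// Split a manifest into its three sections. Entries that appear before any
// section header are treated as CACHE entries.
function parseManifest(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  if (lines[0] !== "CACHE MANIFEST") {
    throw new Error("A cache manifest must start with CACHE MANIFEST");
  }
  const sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  let current = "CACHE";
  for (const line of lines.slice(1)) {
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    const header = line.match(/^(CACHE|NETWORK|FALLBACK):$/);
    if (header) {
      current = header[1]; // switch to the section named by the header
    } else if (current === "FALLBACK") {
      // A fallback line pairs a resource with its backup.
      const [namespace, fallback] = line.split(/\s+/);
      sections.FALLBACK.push({ namespace, fallback });
    } else {
      sections[current].push(line);
    }
  }
  return sections;
}

const parsed = parseManifest(sampleManifest);
console.log(parsed.NETWORK); // the entries that always need a live connection
```

In the manifest file itself, the first line must be CACHE MANIFEST and each section header is followed by a colon.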
The cache manifest has three section headers:
- CACHE
- NETWORK
- FALLBACK
Note that the MIME type of the manifest file is text/cache-manifest. You might need to add a custom file type extension binding to Apache (or whatever web server you are running) or specify the MIME type, for instance using the PHP header directive.
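If your server is written in JavaScript, the same idea might look roughly like the sketch below. It assumes the manifest uses the .cache extension, as in manifest.cache, and that the response object offers a setHeader method in the style of the Node.js http module; the CONTENT_TYPES table and both function names are hypothetical.

```javascript
// Map file extensions to Content-Type values. The ".cache" entry follows the
// manifest.cache file name used in this article (an assumption about naming).
const CONTENT_TYPES = {
  ".cache": "text/cache-manifest",
  ".html": "text/html",
  ".css": "text/css",
  ".png": "image/png",
};

// Pick a Content-Type from the extension of a URL path.
function contentTypeFor(path) {
  const dot = path.lastIndexOf(".");
  const extension = dot === -1 ? "" : path.slice(dot).toLowerCase();
  return CONTENT_TYPES[extension] || "application/octet-stream";
}

// Shaped like a Node.js request listener: only sets the header, leaving the
// body to whatever serves the file.
function setContentType(request, response) {
  const path = request.url.split("?")[0]; // ignore any query string
  response.setHeader("Content-Type", contentTypeFor(path));
}
```

Keeping the lookup in a plain function makes it easy to check that the manifest is served with the right type before wiring it into a real server.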
Files listed under CACHE will be cached once they have been downloaded for the first time, while the ones under NETWORK are said to be white-listed. What this means is that they require a live connection to the server: if the user isn't connected, the browser must not fall back to a cached version.
The FALLBACK section contains entries that provide a backup strategy. If the browser is unable to retrieve the original content, the fallback resource will be used. In the example above, we display a static image in case the dynamic one is unavailable.
The last line in the NETWORK section contains the path to a folder to ensure that requests to load resources contained under /api will bypass the cache and always fetch the resource from the server.
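The way the three sections interact can be sketched as a small decision function. This is a simplified model rather than the browser's actual algorithm: the rules object and the resolveRequest name are hypothetical, the file names other than masthead.png and /api are assumptions, and a single online flag stands in for "the request to the server succeeds".

```javascript
// Hypothetical rules, as they might be read from a manifest.
const rules = {
  cache: ["index.html", "style.css", "masthead.png"],
  network: ["login.php", "/api"],
  fallback: [{ namespace: "images/dynamic.php", fallback: "images/static.png" }],
};

// Decide where a request is served from under this simplified model.
function resolveRequest(url, online, manifestRules) {
  // 1. Anything listed under CACHE comes from the local copy, online or not.
  if (manifestRules.cache.includes(url)) {
    return { source: "cache", url };
  }
  // 2. White-listed entries (matched by prefix) must reach the server.
  if (manifestRules.network.some((prefix) => url.startsWith(prefix))) {
    return online ? { source: "network", url } : { source: "error", url };
  }
  // 3. FALLBACK entries try the server first, then use the backup resource.
  const rule = manifestRules.fallback.find((f) => url.startsWith(f.namespace));
  if (rule) {
    return online
      ? { source: "network", url }
      : { source: "cache", url: rule.fallback };
  }
  // 4. Nothing matched: treated here as a failed load.
  return { source: "error", url };
}

console.log(resolveRequest("/api/users", false, rules).source); // "error"
console.log(resolveRequest("images/dynamic.php", false, rules).url); // "images/static.png"
```

Because /api is matched as a prefix, every resource under that folder goes to the server, which is why a single line in the NETWORK section covers the whole folder.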
In the manifest, any line starting with # is treated as a comment. Other than increasing the readability of the code, comments have another use in the manifest. Let us say you have specified that masthead.png should be cached, but you have since updated the image. As the cache is refreshed only when the manifest itself changes, the user will continue to see the old, cached image. To force an update, you need to change some part of the manifest; a good way of doing this is to keep a version number in a comment and increment it every time you update a resource.
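If you would rather not edit the number by hand, a small script can do it as part of a build step. The sketch below assumes the manifest contains a comment line of the exact form "# version N"; that convention and the bumpManifestVersion name are assumptions, not something the manifest format requires.

```javascript
// Increment the number in a "# version N" comment line of a manifest.
// Only the first such line is changed; the rest of the text is left as-is.
function bumpManifestVersion(text) {
  let bumped = false;
  const result = text.replace(/^# version (\d+)$/m, (match, number) => {
    bumped = true;
    return `# version ${Number(number) + 1}`;
  });
  if (!bumped) {
    throw new Error('No "# version N" comment found in the manifest');
  }
  return result;
}

const before = "CACHE MANIFEST\n# version 1\n\nCACHE:\nmasthead.png\n";
console.log(bumpManifestVersion(before));
```

Since the comment is the only line that changes, the list of cached files stays the same while the browser still sees a modified manifest.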