Building a Client-Side Ajax Cache: Caching Algorithms

[next]

Ajax calls can markedly improve a Web site's performance by updating only relevant page elements, while allowing most of the document to remain static. Caching of Ajax content is a great way to push performance gains even further. In many cases, even dynamic elements can be retrieved from the cache instead of making a redundant server calls, as Ajax routines can take advantage of browser caching mechanisms in much the same way as regular page requests. For example, developers can exert some control over what gets cached and what does not by including cache-related directives in the headers of Ajax requests and server responses. The down side to relying on the built-in browser cache is that there are some distinct limitations. For starters, many browsers do not cache dynamic content, so any request to a server-side process is likely to fall by the wayside once it's been retrieved. Other concerns include browser cache size that is insufficient for your purpose and content from other sites crowding yours out. Caching on the server instead comes with its own set of problems, the main one being that it's all too easy to overburden it with megabytes of data. From experience, I can say that any flaws in the caching mechanisms or memory leaks will cause intermittent server crashing — especially when the Web service, application, or data repository, is critical. The solution may be to create of your own cache management class. As we saw in the first installment, it's not that hard to do. All you need is some knowledge of JavaScript and caching algorithms. We'll be covering a little of both those topics today, as we build upon our cache management class.

Some Popular Caching Algorithms

Most popular caching algorithms were developed for CPU memory management. Their main purpose is to help us manage cache size, but there are always other factors involved, including latency, memory usage, and complexity, to name a few. Believe it or not, we have already implemented a caching algorithm in the last version of our cache — the AjaxCacheWithKeyIndexing.js - by adding a maxSize property and by including the following code when ever a new item was added to the cache:

It's called a "First In, First Out" (FIFO) queuing system, because the first item in (the oldest) is the one to go in order to make room for the new item. FIFO is part of the "Least Recently Used" (LRU) family of caching algorithms. As the above code demonstrates, it is a very low overhead means of managing content, as there are so few variables to keep track of. The drawback is that it is only serviceable in the most rudimentary of applications. Therefore, let's soup up our cache to use some more advanced algorithms.

Similar to LRU is the "Least Frequently Used" (LFU) algorithm. As with FIFO, a sorted list is maintained with the sort order based on the time of most recent access. New entries are added at the top of the list, after the bottom entry has been evicted. The only addition is that, in LFU, a cache hit moves the item to the top, pushing all other entries down one position. We can bring the item back to the front of the array using a combination of Array methods. Pushing the item on the front of the array is easy, as we've already seen. The more challenging part is removing an element from the middle of the array. The tradeoff that comes along with control over element positioning is quick retrieval of a particular item; the exact opposite of hashes, which provide lightning quick retrieval but no control over item ordering. JavaScript 1.6 provides the indexOf() function as an easy means of locating an element in an array. It returns the index of the first matching item or -1 if none are found. Not all browsers support JavaScript 1.6 yet, including Internet Explorer 7, but, in the meantime, Prototype has an implementation of this useful function. Once we have the index, we can use the splice() function to extract it from the keyIndex array. It accepts three parameters:

index: The index at which to start changing the array. If negative, will begin that many elements from the end.
howMany: An integer indicating the number of old array elements to remove. If howMany is 0, no elements are removed. In this case, you should specify at least one new element. If no howMany parameter is specified, all elements after index are removed.
element1, ..., elementN: (optional) The elements to add to the array. If you don't specify any elements, splice() simply removes elements from the array.

Here is the code to bring the cached element to the front of the keyIndex array:

Due to the fact that the array operations could take a bit of time to execute, they occur after the results table's contents have been updated.

Maintaining Data Freshness

Like leftovers, there comes a time where have to throw stuff out. Keeping out-of-date data is no good to anybody, even if it does save on retrieval time. The key question to ask yourself when deciding on a best-before date is how critical timely data is to the users of your Web app. As it stands, none of the caching models that we've looked at address this issue. Whatever caching algorithm we use, we still have to deal with data freshness as a separate task. We can maintain data freshness in a number of ways, including:

Attaching a timestamp to each item that goes into the cache. Whenever we retrieve an item, we'd inspect the timestamp to determine if the result is recent enough.
Similarly, we could schedule a periodic service to actively delete stale items.
A browser-server combination which would allow the browser to determine if items are stale. The browser might keep a timestamp for each item and the server provides a service that accepts a timestamp. The server would only send the whole response if the value has changed since that time.
An alternative to timestamps would be a hash function which the browser would run against the cached value and can then be compared by the server against the item's most recent hash value, to see if a change has occurred.
Implement an event-listener model whereby the server announces changes that have occurred. The browser would have a listener that would actively listen for such changes, and delete any items that have changed.

The last three strategies rely on server routines, which are beyond the scope of this series. Keeping things on the client-side, here's how we could go about implementing the first two suggestions.

[next]

Building a Client-Side Ajax Cache: Caching Algorithms

Some Popular Caching Algorithms

Maintaining Data Freshness

Find a programming school near you