Thursday, April 22, 2021

# Caching on the web

A cache improves response times by saving the results of an expensive operation for reuse on a subsequent call. We will explore several cache types1 in the context of a simple web application in order to understand how they make web pages load faster and the challenges associated with implementing them.

# Example

The example request for a product page would look like the following:

1. A client (e.g. web browser) makes a request for the page
2. Server retrieves product data from the database
3. Server generates an HTML page from the data and sends it to the client
4. The client parses the HTML page and requests any static assets referenced (e.g. CSS, JavaScript or images)

# Application Cache

We are finding the query to the database can be slow. An option to solve this could be to implement a cache-aside strategy2:

1. The client makes a request for the page
2. Server attempts to retrieve the product data from the cache
3. The data is not in the cache, so the server retrieves the data from the database
4. The data is then saved into the cache to be reused in the next request
5. Server then proceeds as usual to generate the HTML and send it to the client

On the second request, the data is already cached:

1. The client makes a second request for the page
2. Server attempts to retrieve the product data from the cache
3. The data is in the cache and is returned to the server without ever hitting the database
4. Server then proceeds as usual to generate the HTML and send it to the client
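
The two request flows above can be sketched in a few lines. This is only an illustration under assumptions: `query_database`, `get_product` and `render_page` are hypothetical names, a plain dict stands in for the cache, and the `db_log` list exists only to make database hits visible.

```python
from typing import Dict, List

# Records every database hit so the effect of the cache can be observed.
db_log: List[str] = []

def query_database(product_id: str) -> dict:
    """Hypothetical stand-in for the slow database query."""
    db_log.append(product_id)
    return {"id": product_id, "name": f"Product {product_id}"}

# A plain dict acts as the cache in this sketch.
cache: Dict[str, dict] = {}

def get_product(product_id: str) -> dict:
    """Cache-aside read: try the cache first, fall back to the database."""
    product = cache.get(product_id)
    if product is None:
        # Cache miss: load from the database and save it for the next request.
        product = query_database(product_id)
        cache[product_id] = product
    return product

def render_page(product_id: str) -> str:
    """Generate the HTML for a product page."""
    product = get_product(product_id)
    return f"<html><body><h1>{product['name']}</h1></body></html>"
```

Calling `render_page("42")` twice produces the same HTML both times, but only the first call reaches `query_database`.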

## Cache invalidation

If we do not have a strategy to invalidate entries in the cache, we will end up serving stale data to the client. There are two primary strategies:

### Invalidate on write

When a new piece of data is written to the datasource behind the cache, we invalidate the cache entry. We can also implement a write-through cache3 whereby the new data is simultaneously written to the cache.

This is an effective strategy to minimise stale data being stored in the cache, but requires the server to either be performing the write or be notified of when a write occurs in order to trigger the cache invalidation.
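
As a rough sketch of the two write paths, assuming the server itself performs the write: the dicts `database` and `cache` and the function names below are placeholders for illustration, not part of any particular library.

```python
from typing import Dict

database: Dict[str, dict] = {}  # stand-in for the real datasource
cache: Dict[str, dict] = {}     # stand-in for the cache

def read_product(product_id: str) -> dict:
    """Cache-aside read: load from the datasource only on a miss."""
    if product_id not in cache:
        cache[product_id] = database[product_id]
    return cache[product_id]

def write_and_invalidate(product_id: str, data: dict) -> None:
    """Invalidate on write: drop the entry so the next read reloads it."""
    database[product_id] = data
    cache.pop(product_id, None)

def write_through(product_id: str, data: dict) -> None:
    """Write-through: update the datasource and the cache together."""
    database[product_id] = data
    cache[product_id] = data
```

With `write_and_invalidate` the next read pays for one trip to the datasource; with `write_through` the fresh value is already in the cache when the next read arrives.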

### TTL (Time To Live)

When an entry is added to the cache, an expiry date is also set. Upon expiry, the entry is removed from the cache.

This is suitable when the clients have a higher tolerance for stale data and we do not have a write trigger to invalidate on write.
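
A minimal TTL cache might look like the sketch below. The `TTLCache` class is hypothetical, and it makes one simplifying assumption: expired entries are removed lazily, when they are next read, rather than at the exact moment of expiry. The `clock` parameter exists only so the behaviour can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

class TTLCache:
    """Key-value cache whose entries expire a fixed time after being set."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # The expiry time is fixed at the moment the entry is added.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: remove the entry and report a miss.
            del self._entries[key]
            return None
        return value
```

A caller treats `None` as a miss and reloads from the datasource, exactly as it would for an entry that was never cached.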

## Implementation details

There are a number of caching strategies4 that can be implemented, cache-aside2 being one of them.

The technology used for the cache must be faster than the underlying datasource, i.e. the database. These are some common approaches for different use cases and scales:

• Local cache
  • Any key-value store that is local to each instance, so no sharing
  • Simple to set up as most languages offer the concept of a key-value store
• Shared cache
  • e.g. Redis5 or Memcached6
  • Lifts the key-value store to be shared across all instances
• Technology specific
  • Databases like Elasticsearch7 and Postgres8 have their own caching mechanisms that can be configured

# Content Delivery Network

The web application is slow to load for European customers who are making requests to our US server. A Content Delivery Network (CDN) can greatly improve delivery speed of files to these distant clients by caching them closer to those clients. When using a CDN, typically one client in a region incurs the latency of the full trip to the US server for the files; all subsequent requests from clients in that region then get the files from the CDN.

Although we can cache any file in a CDN, it is much better suited to static content like images, JavaScript and CSS. Given how distributed and wide this network is, it can be costly and slow to invalidate files that change often, such as new HTML pages generated by the server when a new product is added.

Given a CDN set up to only cache static content and not HTML:

1. The client makes a request to the CDN for the HTML page
2. The HTML page does not exist in CDN, so CDN forwards the request to the server to generate the HTML to send to the client
3. The client parses the HTML and proceeds to fetch the styles.css file from the CDN
4. The styles.css does not exist in the CDN, so the request is again forwarded to the server to get the file
5. The styles.css file is first saved in the CDN for reuse, then returned to the client

On a subsequent request:

1. The client makes a subsequent request to the CDN for the styles.css after parsing the HTML page
2. This time the CDN has the styles.css cached and returns it to the client immediately without involving the server
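
The behaviour of a single CDN location in this flow can be approximated as follows. `EdgeCache`, `STATIC_EXTENSIONS` and the `origin` callable are illustrative assumptions; a real CDN is configured rather than programmed like this.

```python
from typing import Callable, Dict

# File types this sketch treats as static content worth caching.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".svg")

class EdgeCache:
    """One CDN location sitting between clients and the origin server."""

    def __init__(self, origin: Callable[[str], str]) -> None:
        self._origin = origin            # function that asks the origin server
        self._files: Dict[str, str] = {}

    def fetch(self, path: str) -> str:
        if path in self._files:
            # Cached: answer immediately without involving the server.
            return self._files[path]
        # Not cached: forward the request to the origin server.
        body = self._origin(path)
        if path.endswith(STATIC_EXTENSIONS):
            # Only static content is saved for reuse; HTML is always forwarded.
            self._files[path] = body
        return body
```

Fetching `/product.html` and `/styles.css` twice through one `EdgeCache` reaches the origin three times: twice for the HTML and once for the stylesheet.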

## Cache Invalidation

Unlike our server, where we can implement custom cache invalidation, most commercial CDN offerings only allow us to configure a TTL for certain files.

A common strategy for web applications is:

• Do not cache .html files (that is, give them a TTL of 0), as they are our reference point to the other static files
• Cache everything else with a large TTL (e.g. 1 year)
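
Expressed as code, this policy is a small lookup from file type to TTL. The sketch assumes the TTL is communicated as a `max-age` value in the `Cache-Control` response header, which is one common way to do it; how a particular CDN is actually configured varies by provider. The function names are hypothetical.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60

def ttl_for(path: str) -> int:
    """Return the TTL in seconds for a file, following the strategy above."""
    if path.endswith(".html"):
        return 0  # HTML is the reference point to other files: never cached
    return ONE_YEAR_SECONDS

def cache_control_header(path: str) -> str:
    """Format the TTL as a Cache-Control header value."""
    return f"max-age={ttl_for(path)}"
```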

### Cache busting

If we cache our styles.css file for a year, it would be difficult to update that file without invalidating the cache. An option would be to simply update the file name to styles-v2.css and update the reference in the HTML - but this is tedious.

Cache busting9 is an automatable technique of renaming the styles.css for us at build time, appending a unique hash (e.g. md5) to the file name, e.g. styles.s93lkkd.css. Whenever we make a change to styles.css, a new hash and filename are generated, which have not been cached before - thus busting the cache.
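
A build step doing this could be sketched as below. The helper names are hypothetical, and the sketch uses a truncated SHA-256 digest rather than md5; any hash of the file content serves the same purpose. Real build tools do this work for us, so this only illustrates the idea.

```python
import hashlib
from pathlib import PurePosixPath
from typing import Dict

def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short hash of the file content before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))

def rewrite_references(html: str, renames: Dict[str, str]) -> str:
    """Point the HTML at the hashed file names (naive string replacement)."""
    for old, new in renames.items():
        html = html.replace(old, new)
    return html
```

Because the hash depends only on the content, an unchanged file keeps its name and stays cached, while any edit produces a name no cache has seen before.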

## Implementation

These are some of the largest CDN providers:

• CloudFront10
• Cloudflare11
• Fastly12

# Client Cache

We are incurring high traffic costs from the CDN because the same clients repeatedly download the same files. Rather than hosting another cache ourselves and still paying for the traffic, we can rely on the cache that a lot of clients (e.g. browsers) already come with.

A lot of the same principles from managing a CDN apply to client caching - the only real difference is that the cache is not shared with other clients.

## Cache Invalidation

Cache invalidation is done using TTLs. Since the TTL for a file is configured once through headers, it is impossible to change for a client that has already cached the file, which could result in clients not getting updates until the TTL expires. This is different to CDNs and application caches, which are controlled by us and typically provide a way to manually invalidate the cache.

A strategy similar to the one for the CDN can be used, where we do not cache the .html, but cache everything else with the option to invalidate through cache busting.

## Implementation

• HTTP Cache-Control header13
  • max-age behaves much like a TTL
• Service workers14 act as a proxy between the browser and the network. They can do cool things like caching, background fetching, etc. - but a misconfiguration can be difficult to fix.
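
To make the max-age behaviour concrete, the sketch below shows roughly how a client could decide whether a cached response is still usable. It is a simplification under assumptions: only the `max-age` directive is considered, other directives such as `no-store` or `no-cache` are ignored, and both function names are hypothetical.

```python
from typing import Optional

def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age" and value.strip().isdigit():
            return int(value.strip())
    return None

def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """A cached response is reusable while its age is below max-age."""
    max_age = parse_max_age(cache_control)
    return max_age is not None and age_seconds < max_age
```

With `max-age=0` a response is never considered fresh, which matches the "do not cache the .html" half of the strategy.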
