Thursday, April 22, 2021

Caching on the web

A cache improves response times by saving the results of an expensive operation for reuse on a subsequent call. We will explore several cache types [1] in the context of a simple web application in order to understand how they make web pages load faster and the challenges associated with implementing them.

Example

The example request for a product page would look like the following:

[Sequence diagram: Client -> Server: GET /products; Server -> Database: SELECT * FROM products; Database -> Server: Product[]; Server -> Client: index.html; Client -> Server: GET styles.css; Server -> Client: styles.css]

  1. A client (e.g. a web browser) makes a request for the page
  2. The server retrieves product data from the database
  3. The server generates an HTML page from the data and sends it to the client
  4. The client parses the HTML page and requests any static assets referenced (e.g. CSS, JavaScript or images)

Application Cache

We are finding that the query to the database can be slow. One way to address this is to implement a cache-aside strategy [2]:

  1. The client makes a request for the page
  2. The server attempts to retrieve the product data from the cache
  3. The data is not in the cache, so the server retrieves the data from the database
  4. The data is then saved into the cache to be reused in the next request
  5. The server then proceeds as usual to generate the HTML and send it to the client

[Sequence diagram: Client -> Server: GET /products; Server -> Application Cache: SELECT * FROM products; Application Cache -> Server: null; Server -> Database: SELECT * FROM products; Database -> Server: Product[]; Server -> Application Cache: Product[]; Server -> Client: index.html; Client -> Server: GET styles.css; Server -> Client: styles.css]

  1. The client makes a second request for the page
  2. The server attempts to retrieve the product data from the cache
  3. The data is in the cache and is returned to the server without ever hitting the database
  4. The server then proceeds as usual to generate the HTML and send it to the client

[Sequence diagram: Client -> Server: GET /products; Server -> Application Cache: SELECT * FROM products; Application Cache -> Server: Product[]; Server -> Client: index.html; Client -> Server: GET styles.css; Server -> Client: styles.css]
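
The two request flows above can be sketched in a few lines of Python. This is only an illustration of the cache-aside idea, not a production implementation: the class name `CacheAsideProducts`, the `"products"` cache key and the injected `query_database` callable are all assumptions made for the example, and a plain dictionary stands in for the cache.

```python
from typing import Callable, Dict, List

Product = Dict[str, object]


class CacheAsideProducts:
    """Cache-aside: check the cache first, fall back to the database on a miss."""

    def __init__(self, query_database: Callable[[], List[Product]]) -> None:
        # The slow call we want to avoid, e.g. SELECT * FROM products.
        self._query_database = query_database
        # Hypothetical in-memory cache; a shared cache could be used instead.
        self._cache: Dict[str, List[Product]] = {}

    def get_products(self) -> List[Product]:
        cached = self._cache.get("products")
        if cached is not None:
            # Cache hit: the database is never touched.
            return cached
        # Cache miss: query the database, then save the result for next time.
        products = self._query_database()
        self._cache["products"] = products
        return products
```

The first call to `get_products()` runs the query and fills the cache; later calls return the cached list until the entry is invalidated.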

Cache invalidation

If we do not have a strategy to invalidate entries in the cache, we will end up serving stale data to the client. There are two primary strategies:

Invalidate on write

When a new piece of data is written to the datasource behind the cache, we invalidate the cache entry. We can also implement a write-through cache [3] whereby the new data is simultaneously written to the cache.

This is an effective strategy to minimise stale data being stored in the cache, but requires the server to either be performing the write or be notified of when a write occurs in order to trigger the cache invalidation.
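
As a rough sketch of the two write paths described above, the following uses dictionaries as stand-ins for both the datasource and the cache. The class `ProductStore` and its method names are hypothetical and exist only for this example.

```python
from typing import Dict, Optional


class ProductStore:
    """Illustrates invalidate-on-write and write-through side by side."""

    def __init__(self) -> None:
        self.database: Dict[int, str] = {}  # stand-in for the real datasource
        self.cache: Dict[int, str] = {}     # stand-in for the cache

    def read(self, product_id: int) -> Optional[str]:
        # Cache-aside read: use the cache, fall back to the database.
        if product_id in self.cache:
            return self.cache[product_id]
        value = self.database.get(product_id)
        if value is not None:
            self.cache[product_id] = value
        return value

    def write_and_invalidate(self, product_id: int, name: str) -> None:
        # Invalidate on write: drop the entry so the next read reloads it.
        self.database[product_id] = name
        self.cache.pop(product_id, None)

    def write_through(self, product_id: int, name: str) -> None:
        # Write-through: update the datasource and the cache together.
        self.database[product_id] = name
        self.cache[product_id] = name
```

Both approaches only work because every write goes through `ProductStore`; a write made directly to the database would leave a stale entry behind.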

TTL (Time To Live)

When an entry is added to the cache, an expiry date is also set. Upon expiry, the entry is removed from the cache.

This is suitable when the clients have a higher tolerance for stale data and we do not have a write trigger to invalidate on write.
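
A minimal TTL cache might look like the sketch below. It makes one simplifying assumption worth calling out: expired entries are removed lazily, when they are next read, rather than by a background job at the moment of expiry. The class name `TTLCache` and the injectable `clock` parameter are illustrative choices, not part of any particular library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Key-value cache whose entries expire a fixed time after being set."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, object]] = {}

    def set(self, key: str, value: object) -> None:
        # Store the value together with its expiry time.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: remove the entry and report a miss.
            del self._entries[key]
            return None
        return value
```

With `TTLCache(ttl_seconds=60)`, a product list stays in the cache for a minute, which bounds how stale the data served to clients can be.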

Implementation details

There are a number of caching strategies [4] that can be implemented, cache-aside [2] being one of them.

The technology used for the cache must be faster than the underlying datasource (e.g. the database). These are some common approaches for different use cases and scales:

  • Local cache
    • Any key-value store that is local to each instance, so nothing is shared between instances
    • Simple to set up, as most languages offer the concept of a key-value store
  • Shared cache
    • e.g. Redis [5] or Memcached [6]
    • Lifts the key-value store to be shared across all instances
  • Technology specific
    • Databases like Elasticsearch [7] and PostgreSQL [8] have their own caching mechanisms that can be configured

Content Delivery Network

The web application is slow to load for European customers who are making requests to our US server. A Content Delivery Network (CDN) can greatly improve the delivery speed of files to these distant clients by caching the files closer to them. When using a CDN, typically the first client in a region incurs the latency of a trip all the way to the US server for the files, then all subsequent requests from that region get the files from the CDN.

Although we can cache any file in a CDN, it is much better suited to static content like images, JavaScript and CSS. Given how widely distributed this network is, it can be costly and slow to invalidate files that change often, such as new HTML pages generated by the server when a new product is added.

Given a CDN set up to cache only static content and not HTML:

  1. The client makes a request to the CDN for the HTML page
  2. The HTML page does not exist in the CDN, so the CDN forwards the request to the server, which generates the HTML and sends it to the client
  3. The client parses the HTML and proceeds to fetch the styles.css file from the CDN
  4. The styles.css file does not exist in the CDN, so the request is again forwarded to the server to get the file
  5. The styles.css file is first saved in the CDN for reuse, then returned to the client

[Sequence diagram: Client -> CDN: GET /products; CDN lookup: null; CDN -> Server: GET /products; Server -> CDN -> Client: index.html; Client -> CDN: GET styles.css; CDN lookup: null; CDN -> Server: GET styles.css; Server -> CDN -> Client: styles.css]

  1. The client makes a subsequent request to the CDN for the styles.css after parsing the HTML page
  2. This time the CDN has the styles.css cached and returns it to the client immediately without involving the server

[Sequence diagram: Client -> CDN: GET styles.css; CDN -> Client: styles.css]
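
The CDN behaviour in these two flows can be modelled, very loosely, as a pull-through cache sitting in front of the server. The sketch below is a toy model under stated assumptions: the class `EdgeCache`, the `fetch_from_origin` callable and the list of static extensions are invented for illustration, and a real CDN is configured rather than programmed like this.

```python
from typing import Callable, Dict

# Assumed list of file types treated as static content in this example.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg")


class EdgeCache:
    """Toy model of one CDN location that caches static files only."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin  # request to our server
        self._files: Dict[str, bytes] = {}

    def get(self, path: str) -> bytes:
        if path in self._files:
            # Already cached: served without involving the server.
            return self._files[path]
        # Not cached: forward the request to the server.
        body = self._fetch_from_origin(path)
        if path.endswith(STATIC_EXTENSIONS):
            # Only static content is saved for reuse; HTML always passes through.
            self._files[path] = body
        return body
```

Requesting `/products` twice reaches the server twice, while requesting `/styles.css` twice reaches it only once.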

Cache Invalidation

Unlike our server, where we can implement custom cache invalidation, most commercial CDN offerings only allow us to configure a TTL for certain files.

A common strategy for web applications is:

  • Do not cache .html files (that is, give them a TTL of 0), as they are our reference point to the other static files
  • Cache everything else with a large TTL (e.g. 1 year)

Cache busting

If we cache our styles.css file for a year, it would be difficult to update that file without invalidating the cache. An option would be to simply update the file name to styles-v2.css and update the reference in the HTML - but this is tedious.

Cache busting [9] is a technique that can be automated: at build time, styles.css is renamed for us by appending a unique hash of its contents (e.g. MD5) to the file name, e.g. styles.s93lkkd.css. Whenever we make a change to styles.css, a new hash and file name are generated, which have not been cached before - thus busting the cache.
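
Build tools normally take care of this renaming, but the core idea fits in a few lines. The sketch below assumes the hash is an MD5 digest of the file contents truncated to 8 hex characters; the function name `busted_name` and the `hash_length` parameter are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def busted_name(filename: str, content: bytes, hash_length: int = 8) -> str:
    """Return the file name with a content hash inserted before the extension."""
    # The digest depends only on the content, so unchanged files keep their name.
    digest = hashlib.md5(content).hexdigest()[:hash_length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Because the name changes only when the content changes, the HTML can reference the hashed name and the file itself can safely be cached for a long time.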

Implementation

These are some of the largest CDN providers:

  • CloudFront [10]
  • Cloudflare [11]
  • Fastly [12]

Client Cache

We are incurring high traffic costs from the CDN because the same clients repeatedly download the same files. Instead of serving every request from a cache that we host and pay traffic for, we can take advantage of the cache that many clients (e.g. browsers) come with.

Many of the same principles from managing a CDN apply to client caching. The only real difference is that the cache is not shared with other clients.

Cache Invalidation

Cache invalidation is done using TTLs. Since the TTL for a file is configured once through headers when the client downloads it, it is not possible to change for that client once set, which could leave clients without updates until the entry expires. This is different to CDNs and application caches, which are controlled by us and typically provide a way to manually invalidate the cache.

A similar strategy to CDN can be used, where we do not cache the .html, but cache everything else with the option to invalidate through cache-busting.

Implementation

  • HTTP Cache-Control header [13]
    • max-age behaves much like a TTL
  • Service workers [14] act as a proxy between the browser and the network. They can do cool things like caching, background fetching, etc. - but a misconfiguration can be difficult to fix.
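
As a small illustration of the Cache-Control approach, the function below picks a header value per file, following the strategy of not caching .html and caching everything else for a year. The function name `cache_control_for` and the rule of detecting HTML by its file extension are assumptions for the example; how the header is attached to a response depends on the server or CDN in use.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for the given file path."""
    if path.endswith(".html"):
        # HTML is our reference point to the other files: a TTL of 0.
        return "max-age=0"
    # Everything else is cached for a long time and replaced via cache busting.
    return f"max-age={ONE_YEAR_SECONDS}"
```

For example, `cache_control_for("styles.s93lkkd.css")` yields `max-age=31536000`, while `cache_control_for("index.html")` yields `max-age=0`.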

@TODO

Footnotes

  1. https://aws.amazon.com/caching/

  2. https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/#:~:text=various%20caching%20strategies.-,Cache%2DAside,-This%20is%20perhaps

  3. https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/#:~:text=we%E2%80%99ll%20see%20next.-,Write%2DThrough%20Cache,-In%20this%20write

  4. https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/

  5. https://redis.io/

  6. https://memcached.org/

  7. https://www.elastic.co/blog/elasticsearch-caching-deep-dive-boosting-query-speed-one-cache-at-a-time

  8. https://severalnines.com/database-blog/overview-caching-postgresql

  9. https://webpack.js.org/guides/caching/

  10. https://aws.amazon.com/cloudfront/

  11. https://www.cloudflare.com/cdn/

  12. https://www.fastly.com/

  13. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control

  14. https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API