Caching API responses offers a snappier experience to users who frequent the same pages. It is critical if you are building an offline-capable web app. Caching RESTful APIs is straightforward as each endpoint is unique. GraphQL still follows the same principle, but there are some caveats.

What is Caching

I know this is redundant, but I want you to approach caching the same way I do, which becomes relevant later on.

Caching, to me, is not doing something twice. That's simple enough, but let's define the "something" in question. How could you identify "something" so you can avoid doing it again? In other words: in order to cache, we need to establish the uniqueness of the work done. That uniqueness is a prerequisite for caching properly.

In most cases, we use the term cache key to refer to the identifier of our previously done work, and that works fine when the cache key is known beforehand, that is, statically. However, with cache keys that are only known at run-time, you cannot predict whether two "somethings" will use the same cache key or not.

And that brings us to the most famous joke/wisdom in software engineering:

There are only two hard things in Computer Science: cache invalidation and naming things. -- Phil Karlton

To me, caching runtime stuff combines both of these problems. Because we have no idea what the keys will look like, we need to generate them in a deterministic manner, and they need to be unique enough to be viable.

I won't be covering server-side caching as that's a different story. We are only interested in client-side caching, hence "caching responses".

HTTP Cache

Browsers already have something called the HTTP cache, which caches responses based on their URL and their Cache-Control and Expires headers. However, this doesn't work well with POST requests: unlike GET requests and their query parameters, they do not carry the body of the request in the URL.

A GET request's uniqueness can be easily decided by inspecting the URL and the query parameters. For POST requests to a given endpoint, the URL is always the same, which is problematic since we cannot derive uniqueness from the URL alone.

There are other ways to do HTTP caching with headers like ETag, Last-Modified, and If-Modified-Since, but that requires proper server setup.

Let's see what we can do on our own on the client-side.

Caching Responses

In the modern web, we can cache responses returned from APIs using either of two approaches:

  • Caching as a part of the application logic layer.
  • Caching as an enhancement layer.

While both achieve the same results, they are implemented very differently, and each carries its own pros and cons.

Logic Layer Caching

It means that you are writing the caching logic right into your application. It's explicit and very deliberate. Consider this snippet as an example:

```js
function cachedFetch(url) {
  // Try to find the previously fetched response.
  const cached = localStorage.getItem(url);
  if (cached) {
    // We found it, ah yis! Wrap it in a promise so callers always get one.
    return Promise.resolve(JSON.parse(cached));
  }

  // Getting it the boring way.
  return fetch(url)
    .then((res) => res.json())
    .then((response) => {
      // Cache it on the way out!
      localStorage.setItem(url, JSON.stringify(response));
      return response;
    });
}

cachedFetch('');
```

We try to find the response we already got, and if we don't, we fetch it from the network. I used the localStorage API to keep it simple, but you can use anything else you want. I recommend using IndexedDB for increased flexibility if you can afford its unintuitive API.

In this approach, our caching logic is tightly coupled to our application logic. While it is simple, if our caching layer were to break, our fetching logic would break with it.

You can extract the caching logic to be after the fetching part, but the same problem will exist. However, as an added benefit, since the caching logic is implemented as part of the application, you gain the flexibility of handling responses as data, which allows you to do any sort of processing on them and cache the finished data for future use.
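One way to soften that coupling is to make sure a broken cache can never throw into the fetching code. The sketch below is only an illustration: `readCache` and `writeCache` are hypothetical helpers, and `storage` is assumed to be anything with the same `getItem`/`setItem` shape as localStorage.

```javascript
// Hypothetical helpers: cache reads and writes that never throw.
// `storage` is assumed to expose getItem/setItem like localStorage does.
function readCache(storage, key) {
  try {
    const raw = storage.getItem(key);
    // A missing entry or corrupt JSON both count as a cache miss.
    return raw === null ? null : JSON.parse(raw);
  } catch (err) {
    return null;
  }
}

function writeCache(storage, key, value) {
  try {
    storage.setItem(key, JSON.stringify(value));
    return true;
  } catch (err) {
    // For example the storage quota was exceeded; the caller can carry on.
    return false;
  }
}
```

With helpers like these, a cache miss and a cache failure look the same to the caller, so the fetching path only has to deal with "not cached".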

Caching as an Enhancement Layer

This one is different. The main idea is that your application doesn't know whether the requests are being cached or not, and it doesn't care. This can be done with service workers, the technology we use to build PWAs.

The service worker's fetch event allows us to intercept network calls and gives us the option to let them go through or respond with our own custom response. Another interesting API is the caches API, which allows us to store request/response pairs as keys/values in a storage called Cache Storage. With those two APIs we can intercept outgoing requests and respond with whatever responses we have in the cache.

It requires more steps to set up and it can get really complicated but I will keep it simple:

```js
// sw.js
const API_CACHE = 'apiCache_v1';

addEventListener('fetch', (e) => {
  // Not a GET request to the API, avoid dealing with it.
  // (`cache.put` rejects anything other than GET requests.)
  const { pathname } = new URL(e.request.url);
  if (e.request.method !== 'GET' || !pathname.startsWith('/api')) {
    return;
  }

  // Tell browser that we got this!
  e.respondWith(
    caches.open(API_CACHE).then((cache) => {
      // Check if we already have that request cached.
      return cache.match(e.request).then((match) => {
        if (match) {
          // Found it!
          return match;
        }

        // Looks like this is the first time, make the request!
        return fetch(e.request).then((response) => {
          // Cache a clone of the response.
          return cache.put(e.request, response.clone()).then(() => {
            // Respond with the response!
            return response;
          });
        });
      });
    })
  );
});
```

If you are not familiar with the fetch event, you should probably brush up on it. What is great about this approach is that our service worker is an enhancement to our application, meaning that if it breaks, or if it is not even supported, the app will continue to function normally.

This is implicit, but that's not necessarily a bad thing, as it allows you to build your app logic without having to maintain the caching layer in the same way as with the first approach. This is loose coupling of sorts, and it will appease the single responsibility gods.

This approach, however, forces you to deal with API responses as streams. You may have noticed the response.clone call: a response body can only be consumed once, so in order to both return it to the main app and cache it, we need a copy.

RESTful Caching

RESTful APIs play really well with the HTTP caching mentioned previously, because you can cache all the GET requests reliably, while POST requests generally cause a mutation on the server and should not be cached, which works great. That is, if you follow the proper semantics of RESTful APIs and respect the usage of each verb.

There is not a lot to say here. If you have access to your backend, just slap Cache-Control headers on the GET endpoints you want to cache and it will just work. You can also use the approaches we implemented earlier, and they will work perfectly as well.

Let's re-phrase it one more time: it's easy to cache RESTful GET requests because their uniqueness, or rather their cache key, can simply be the request URL itself, which makes them compatible with the basic service worker caching that we implemented earlier.

To summarize: what makes two requests in RESTful APIs exactly the same? Well, they are the same if they have the same URL and query/body parameters.
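As a rough illustration for logic-layer caching, a GET cache key could be derived like the sketch below. `restCacheKey` is a hypothetical helper, and sorting the query parameters is an assumption that the server treats `?a=1&b=2` and `?b=2&a=1` as the same request. The browser's HTTP cache and the caches API do not reorder parameters for you.

```javascript
// Hypothetical helper: derive a cache key from the method and the URL.
// Assumption: parameter order and the #fragment do not change the response.
function restCacheKey(method, rawUrl) {
  const url = new URL(rawUrl);
  // Sort query parameters by name so their order doesn't matter.
  url.searchParams.sort();
  // The fragment is never sent to the server, so drop it.
  url.hash = '';
  return `${method.toUpperCase()} ${url.toString()}`;
}
```

Under that assumption, two requests that only differ in parameter order end up with the same key, while a different page or endpoint produces a different one.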

GraphQL Caching

GraphQL is fundamentally different from REST. First, it is "transport agnostic", meaning it doesn't care whether it is served over HTTP, WebSockets, or whatever protocol you want. It only cares about the payloads, as they describe what is needed from the GraphQL server.

And typically it's nicer to use POST to send GraphQL queries because POST requests have a body, and that body can simply be JSON. However, that means that our request URLs are no longer unique per query. The caches API doesn't play well with "payloaded" requests either: it only stores GET requests and matches them by URL, which doesn't work for us.

We have to find more factors to derive a unique and deterministic cache key. Let's analyze a simple query like this one:

```graphql
query Posts {
  posts {
    id
    title
  }
}
```

We can assume that the query body, or rather the "query text" itself, is to be factored into our cache key. But we are missing a piece of the puzzle. Check this query, for example:

```graphql
query Post($id: ID!) {
  post(id: $id) {
    id
    title
  }
}
```

This query requires an $id variable, meaning even if the text is the same for two requests, their results will differ if the $id variable is different. That also means variables need to be factored in as well.

If we ask the same question as we did before: What makes two GraphQL queries exactly the same?

We arrive at the conclusion that the uniqueness of a GraphQL request can be derived from:

  • The Operation Body (Query body, Operation Name, Inline Arguments).
  • The Operation Variables (Passed Variables).

I won't be discussing the first approach (Caching in the Logic layer) as it will be mostly the same as before. The second approach is more interesting, as again we find ourselves dealing with requests as streams which will require some knowledge about both the Request and Response objects in JavaScript.

Cache me if you can

Let's see if we can get around the limitations we have in our service worker. First, we need to only handle the GraphQL requests:

```js
const QUERY_CACHE_KEY = 'CACHE_V_0';

self.addEventListener('fetch', (e) => {
  if (isGraphql(e.request)) {
    handleGraphQL(e);
  }
});

function isGraphql(request) {
  // Change this to whatever that works for you.
  return request.url.startsWith('');
}

function handleGraphQL(e) {
  // TODO: Implement this
}
```

Now that we are only applying our logic to our GraphQL requests, let's get down to caching. We need to extract the information we need to generate a unique cache key, so we will be needing the query, and its variables.

Because the request body is a stream, we can parse it just like we do with Response objects.

```js
function handleGraphQL(e) {
  const generateQueryId = e.request
    .clone()
    .json()
    .then(({ query, variables }) => {
      // Now we have our query and variables, concat them and use the
      // result string as a cache key.
      return `${JSON.stringify({ query, variables })}`;
    });
}
```

We cloned the request because, just like responses, requests can only be used once. So we clone it early to avoid causing problems for the actual fetch logic. One problem we have is that while our cache key works fine as an identifier, it is not a valid caches key, since caches accepts requests, URLs, or URL-like strings. That means we need to make our cache key look more like a URL:

```js
function handleGraphQL(e) {
  const generateQueryId = e.request
    .clone()
    .json()
    .then(({ query, variables }) => {
      // Now we have our query and variables, put them in a fake URL.
      return `https://query_${JSON.stringify({ query, variables })}`;
    });
}
```

This isn't viable either, because our queries can become too large, and so can our variables. Also, most queries won't output a URL-friendly string. So the last step is to hash the query and variables so they output a small unique value that can be used in a URL, let's say a number!

A quick Google/Stack Overflow search for "hashing strings based on their content" will yield some version of this:

```js
function hash(str) {
  let h, i, l;
  for (h = 5381 | 0, i = 0, l = str.length | 0; i < l; i++) {
    h = (h << 5) + h + str.charCodeAt(i);
  }

  return h >>> 0;
}
```

This is actually a snippet I use in villus to cache GraphQL requests. It is based on the djb2 algorithm, and I first encountered it in the urql codebase. You don't need to know how it works; all you need to know is that it's very fast and reliable enough for our purpose.

So back to our handleGraphQL function:

```js
function handleGraphQL(e) {
  const generateQueryId = e.request
    .clone()
    .json()
    .then(({ query, variables }) => {
      // Mocks a request since `caches` only works with requests.
      return `https://query_${hash(JSON.stringify({ query, variables }))}`;
    });
}
```

This will create our fake URLs based on both their queries and variables, and it will yield the same result for the exact same query/variables combination. You can make it more reliable by sorting the variables' keys alphabetically, but let's stop here.
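If you do want the key sorting mentioned above, a minimal sketch could look like this. `stableStringify` and `queryIdFor` are hypothetical names that are not part of the original service worker, and the `hash` function is repeated only so the snippet runs on its own.

```javascript
// Same djb2-style hash as the one shown earlier, repeated so this runs standalone.
function hash(str) {
  let h = 5381;
  for (let i = 0; i < str.length; i++) {
    h = (h << 5) + h + str.charCodeAt(i);
  }
  return h >>> 0;
}

// Hypothetical helper: like JSON.stringify, but object keys are sorted at every level.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${entries.join(',')}}`;
  }
  // Primitives. Assumption: `undefined` is serialized as null to keep this simple.
  return JSON.stringify(value === undefined ? null : value);
}

// Hypothetical helper: builds the fake URL from the query text and its variables.
function queryIdFor({ query, variables = {} }) {
  return `https://query_${hash(stableStringify({ query, variables }))}`;
}
```

With this in place, `{ id: 1, locale: 'en' }` and `{ locale: 'en', id: 1 }` produce the same fake URL, while a different `id` produces a different one. Keep in mind that a 32-bit hash can collide in theory, which is a trade-off accepted here for short keys.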

Now it's business as usual: we check the cache for our fake URL and if it exists we will serve it. Otherwise we will let the request go through the network. I will be refactoring our code a little bit with promise chaining and async/await to make it more readable:

```js
function handleGraphQL(e) {
  const generateQueryId = e.request
    .clone()
    .json()
    .then(({ query, variables }) => {
      // Mocks a request since `caches` only works with requests.
      return `https://query_${hash(JSON.stringify({ query, variables }))}`;
    });

  e.respondWith(
    (async () => {
      // get the request body.
      const queryId = await generateQueryId;
      // Open the cache and find a match.
      const matched = await caches.open('API_CACHE_V1').then((cache) => cache.match(queryId));
      if (matched) {
        return matched;
      }

      // Couldn't find it, get it from the network
      return fetch(e.request).then((res) => {
        return caches.open('API_CACHE_V1').then((cache) => {
          return cache.put(queryId, res.clone()).then(() => {
            return res;
          });
        });
      });
    })()
  );
}
```

And that's it, we tricked the cache API into caching our own fake URLs, whose responses we serve for the real requests done by the app. As a bonus round, we still need to tackle the following problems:

  • This implementation caches every GraphQL request. We need to ignore mutations, and maybe some specific queries.
  • We are not updating the cached items at all, so a cache-first approach is not great here. Maybe we could do a stale-while-revalidate approach instead?

First, let's ignore unwanted operations like mutations and maybe the queries related to the user cart items and user auth data. A simple blacklist will do:

```js
const exclude = [/query UserCart/, /mutation/, /query Identity/];

const generateQueryId = e.request
  .clone()
  .json()
  .then(({ query, variables }) => {
    // skip blacklisted queries.
    if (exclude.some((r) => r.test(query))) {
      return null;
    }

    // ...
  });
```

And we need to update our logic to take null ids into account:

```js
e.respondWith(
  (async () => {
    // ...
    const matched = queryId && (await caches.open('API_CACHE_V1').then((cache) => cache.match(queryId)));
    if (matched) {
      return matched;
    }

    // ...
  })()
);
```

Great, now our function won't aggressively cache mutations, which wouldn't make any sense as they cause changes to our system. We also excluded some queries that we always need fresh from the network.
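One caveat with plain regular expressions: `/mutation/` also matches a query that merely contains the word "mutation" somewhere, for example in a field name. A slightly stricter check could look at the operation keyword at the start of the document. This is only a sketch: `isCacheable` and `EXCLUDED_OPERATIONS` are hypothetical, and it assumes each request carries a single operation that appears first in the query text.

```javascript
// Hypothetical list of operation names that should always hit the network.
const EXCLUDED_OPERATIONS = new Set(['UserCart', 'Identity']);

// Hypothetical helper. Assumption: one operation per document, placed first.
function isCacheable(query) {
  // Drop GraphQL comments (# to end of line) and surrounding whitespace.
  const text = query.replace(/#.*$/gm, '').trim();

  // Anonymous shorthand queries start directly with a selection set.
  if (text.startsWith('{')) {
    return true;
  }

  // Capture the operation type and its optional name.
  const match = /^(query|mutation|subscription)\b\s*([_A-Za-z][_0-9A-Za-z]*)?/.exec(text);
  if (!match || match[1] !== 'query') {
    // Mutations, subscriptions, and anything unrecognized are never cached.
    return false;
  }

  return !EXCLUDED_OPERATIONS.has(match[2]);
}
```

Inside `generateQueryId` this would replace the `exclude.some(...)` test with `if (!isCacheable(query)) return null;`.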

Currently our approach falls under the Cache-First strategy, which is great for stuff like static assets or images but is not suitable for stuff like API responses. Imagine the user fetching your product catalogue exactly once and never seeing the new items ever again. That would be disastrous for your business.

Instead, let's use a leaner approach called stale-while-revalidate, which is just an enhanced cache-first. It boils down to:

  • If in cache, serve the cached response.
  • Update the cached response from the network for the next visit/call.

That means we will always make a network request, but we will be quickly serving the cached responses, and when the user executes the query a second time, they will get the new data. This is the best of both worlds and makes sense for non-critical API responses.
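Outside of a service worker, the same idea can be sketched with a plain Map standing in for Cache Storage. `createSwrCache` and `fetcher` are hypothetical names used for illustration only, where `fetcher` is assumed to be any function that returns a promise for the fresh value.

```javascript
// Hypothetical in-memory illustration of stale-while-revalidate.
function createSwrCache(fetcher) {
  const store = new Map();

  return async function get(key) {
    // Always start the network request; it refreshes the cache when it settles.
    const fromNetwork = fetcher(key).then((value) => {
      store.set(key, value);
      return value;
    });

    if (store.has(key)) {
      // A failed refresh should not surface as an unhandled rejection.
      fromNetwork.catch(() => {});
      // Serve the stale value right away.
      return store.get(key);
    }

    // Nothing cached yet, so the caller has to wait for the network.
    return fromNetwork;
  };
}
```

The first call for a key waits for the network, the second call gets the previously stored value immediately, and the refreshed value only shows up on the call after that.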

How would we go about doing this? Since we will need to make a network request either way, let's start with that:

```js
function handleGraphQL(e) {
  const exclude = [/query GetCart/, /mutation/, /query Identity/];
  const generateQueryId = e.request
    .clone()
    .json()
    .then(({ query, variables }) => {
      // skip mutation caching...
      if (exclude.some((r) => r.test(query))) {
        return null;
      }

      // Mocks a request since `caches` only works with requests.
      return `https://query_${hash(JSON.stringify({ query, variables }))}`;
    });

  // Make the network request, but don't wait for it.
  const fromNetwork = fetch(e.request);

  // TODO: Now what?
}
```

Notice that we didn't await the response. This is because it is critical to call e.respondWith synchronously, otherwise the browser will handle the request on its own anyway. The next part is mostly the same as before, but I will refactor the cache fetching logic to make stuff more readable:

```js
const QUERY_CACHE_KEY = 'CACHE_V_0';

async function fromCache(request) {
  const cache = await caches.open(QUERY_CACHE_KEY);
  const matching = await cache.match(request);

  return matching;
}

function handleGraphQL(e) {
  // ...
  // ...
  const fromNetwork = fetch(e.request);

  e.respondWith(
    (async () => {
      // get the request body.
      const queryId = await generateQueryId;
      const cachedResult = queryId && (await fromCache(queryId));
      if (cachedResult) {
        return cachedResult;
      }

      // Respond with a clone of the response.
      return fromNetwork.then((res) => res.clone());
    })()
  );

  // TODO: Now what?
}
```

You might have noticed that I have removed the part that caches the response. This is because we need to respond to the request as fast as possible, and making the response wait for the cache update seems redundant. The cached response still needs to be updated, though, and this is where e.waitUntil comes in.

e.waitUntil is different than e.respondWith as the former allows us to do some additional asynchronous work without blocking the request/response cycle.

```js
const QUERY_CACHE_KEY = 'CACHE_V_0';

async function fromCache(request) {
  // ...
}

function handleGraphQL(e) {
  // ...
  // ...
  const fromNetwork = fetch(e.request);

  e.respondWith(
    (async () => {
      // ...
    })()
  );

  e.waitUntil(
    (async () => {
      // once the network response finishes, clone it!
      const res = await fromNetwork.then((res) => res.clone());
      // Get the query id
      const queryId = await generateQueryId;
      if (!queryId) {
        return;
      }

      // Cache the response under our fake URL.
      const cache = await caches.open(QUERY_CACHE_KEY);
      await cache.put(queryId, res);
    })()
  );
}
```

Note that we used the fromNetwork and generateQueryId promises a second time. Since a settled promise never changes its value, promises can be "forked", which allows us to re-use settled results as much as we need. It is some sort of checkpoint.

Note that we also clone the response in both the respondWith and waitUntil callbacks. This is because we have no idea which of them will execute first, and as a result we have no clue which one would consume the original response and which one would need a clone. Cloning in both places guarantees we won't run into this issue.
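Here is a tiny standalone illustration of that "forking" behaviour. The names `runs`, `work`, `first`, and `second` are made up for the example: the executor runs once, yet both consumers receive the settled value.

```javascript
// Counts how many times the underlying work actually runs.
let runs = 0;

const work = new Promise((resolve) => {
  runs += 1;
  resolve({ id: 1 });
});

// Two independent consumers of the same promise.
const first = work.then((value) => value.id);
const second = work.then((value) => value.id + 1);
```

Both `first` and `second` settle from the same result, and `runs` stays at 1 no matter how many times `work` is chained.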


We managed to implement GraphQL caching in service workers with the caches API in a completely decoupled manner from our app logic, which allows us to maintain each piece independently from the other one.

You can find the entire sw.js file here in this public gist.
