Creating Cache Groups with vary_cache_on_function

Overview #

↑ Top ↑

The Scenario #

  • Batcache speeds up by reusing one response for many requests.
  • A typical URL is a resource that provides a consistent response. This is easy to cache.
  • For some URLs, the responses vary depending on the request. This is harder to cache.
  • Variants created for logged-in users are not cached at all; users always get fresh responses.
  • Variants created by inspecting other things, such as the User Agent, must be cached separately.
  • Failure to separately cache these variants results in wrong responses. This is bad.

↑ Top ↑

Example Problem #

Imagine you want to use Feedburner to serve your feeds. You want to redirect everyone requesting to load it from instead. You set up your blog to redirect these requests. Then your Feedburner feed breaks because when Feedburner requests your feed, you refuse to give it the data, redirecting it to along with everyone else.

So you make an exception: redirect everyone except Feedburner. You inspect the User-Agent header and if it says “feedburner” you don’t redirect; you give Feedburner the feed. This is what the Feedburner plugin accomplishes. Now your feed works as designed and everyone gets what they are supposed to get. Except sometimes they don’t.

Now you’re running into a caching problem. We cache feeds. More correctly, we cache the responses generated by feed URLs so we can serve them more rapidly for subsequent requests. A redirect is also a response and it happens to be one of our favorite kinds of responses to cache.

The problem is that you have created a variant — the conditional redirect — without telling Batcache about it. Being unaware that your feed URL has several possible responses, Batcache merrily gives everyone whichever response is created for the first request after every cache expiration interval — maybe the redirect, maybe the feed — and it doesn’t care who gets which.

↑ Top ↑

Hidden Mechanism #

You have to tell the cache that you are serving a variant. But it’s not that simple because what you really have to tell the cache is how you knew that you were going to serve that variant. It needs to know this because its job is to receive requests and send correct responses. And it has to do that without loading themes, plugins, or more than about 2% of WordPress. That is why Batcache is so fast.

So if your URL is going to serve variants to anyone who is not logged in, you must inform Batcache how to determine which response goes with which request. That means giving it some logic to execute on its own, without help from your theme or plugins.

↑ Top ↑

Example Problem Solved #

Below is the standard logic to determine whether to serve the feed or the redirect. It is a variant determiner. You just didn’t call it that before.

function is_feedburner() {
    return (bool) preg_match("/feedburner|feedvalidator/i", $_SERVER["HTTP_USER_AGENT"]);

To make this logic available to Batcache, you have to give it a copy.

function is_feedburner() {
    if ( function_exists( 'vary_cache_on_function' ) ) {
            'return (bool) preg_match("/feedburner|feedvalidator/i", $_SERVER["HTTP_USER_AGENT"]);'
    return (bool) preg_match("/feedburner|feedvalidator/i", $_SERVER["HTTP_USER_AGENT"]);

See what we did there? We put the logic in a string and sent it to Batcache. Now the cache knows how to determine which variant goes with which request.

↑ Top ↑

Solution Explained #

The function vary_cache_on_function tells Batcache that the URL currently being served has variants, and how to determine them. It takes one argument, a string that will eventually be passed to create_function. (Click and get acquainted if you are not already. It can be tricky.) This is how your code will be used:

$fun = create_function('', $your_code);
$value = $fun();

This is a lightweight, fast way for Batcache to execute your code without loading your theme and plugins. But it’s not for lightweights. If you are keen on PHP security you might have noticed that any number of very small typos in your code would make it possible for anyone on the internet to execute arbitrary code on your blog, which would be both embarrassing and dangerous. This is equally true of all PHP programming but if you find create_function difficult, please ask for help.

↑ Top ↑

More Mechanism #

Very early in the code execution, long before WordPress has had time to figure out which blog was requested or who requested it, before it can include theme or plugin scripts, Batcache swoops into action and inspects the request. With only the superglobals to guide it ($_SERVER, $_GET, $_COOKIE) the cache searches for a pre-generated response. If it finds your “vary” code, it executes that and uses the result to narrow the search.

When Batcache finds a valid response for the current request, it serves the response and halts execution. That typically takes a few milliseconds which is quite fast. When Batcache finds none, the page must be generated from scratch. Batcache becomes dormant and waits until the response is complete. Then, reborn as an output buffer handler, it swoops back in, runs your code, adds the result to the metadata of the response, caches it, and releases the response to the client who requested it.

↑ Top ↑

Common example #

If you want to show a specific page to a common group of countries, (for example show a cookie notice on all EU countries) the initial solution might seem to be using

wpcom_geo_add_location( 'fr' );
wpcom_geo_add_location( 'uk' );
... etc

But this will actually create a separate cache bucket for each region. greatly diminishing the cache hit % in European countries.

A better way would be to create 2 separate buckets. One for EU and one for the rest of the world like such:

if ( function_exists( 'vary_cache_on_function' ) ) {
    'if ( isset( $_SERVER["GEOIP_COUNTRY_CODE"] ) && in_array( strtolower( $_SERVER["GEOIP_COUNTRY_CODE"] ), ["be", "bg", "cz", "dk", "de", "ee", "ie", "el", "es", "fr", "gb", "hr", "it", "cy", "lv", "lt", "lu", "hu", "mt", "nl", "at", "pl", "pt", "ro", "si", "sk", "fi", "se", "uk"], true ) ) {
        return true;
    } else {
        return false;

↑ Top ↑

Other Problems #

We haven’t thought of all the ways Batcache can break your variants. There are lots of good reasons to create variants: conditional redirects, different markup for different user agents, serving new features to specific IPs for testing, etc. You get the idea. If you plan on varying the response for any page that will likely cached, you should use vary_cache_on_function.

↑ Top ↑

Tips #

  • Your code is subject to the same limitations as Batcache: it can only use core PHP functions and inspect superglobals. While it is technically possible to gain functions by including scripts within your code, this would probably defeat the purpose of keeping the cache as speedy as possible. Avoid this, and certainly don’t do it without consulting a site administrator. They can also tell you how to disable the cache when necessary.
  • Batcache executes your “vary” code only during requests for the exact URL where you called vary_cache_on_function. Please limit your use of that function to URLs that actually have variants. That will help make every page load fast.
  • Get acquainted with create_function. Use single quotes and double quotes appropriately. Test thoroughly before committing or deploying code. If you aren’t sure, ask a site administrator for help.
  • This is worth repeating: when using create_function it is extremely easy to accidentally open the door to arbitrary code execution from the internet. Test, ask others to review your code, and test again.

Ready to get started?

Tell us about your needs

Let us lead the way. We’ll help you select a top tier development partner. We’ll train your developers, operations, infrastructure, and editorial teams. We’ll coarchitect your deployment processes. We will provide live support for peak events. We’ll help your people avoid dark alleys and blind corners, and reduce wasted cycles.