How to: Cache static HTML with CloudFlare?
So… I have been working with CloudFlare for quite some time, but every now and then, when my web server goes down, CloudFlare seems unable to serve my pages until it comes back. I had to create my own cache server for that, but it added another point of failure and slowed down response times. I am sure that at one point CloudFlare, with its Always Online feature, took care of my site… but now I am not so sure. So the question came to mind: how do I cache my web server’s HTML responses?
I run a WordPress blog, so saying I am caching static HTML is technically incorrect, but 90%+ of the content does not change once published. Sure, there are side effects (if I may call them that): mobile and desktop clients get the same flavor of the site, publishing comments may behave incorrectly, etc. But based on the site’s usage, mobile is less than 10% of the visitors and comments are rare, so more users are impacted by our web server going offline than by getting a desktop theme on a mobile device. As the lesser of two evils, we went with CloudFlare caching. The question then became… why is it not caching HTML?
What we found out
We needed to understand CloudFlare’s caching offerings to figure out how to achieve what we wanted (if possible) and we came up with this information:
What are CloudFlare’s caching levels?
You can set CloudFlare’s CDN to cache static content according to these levels:
- No Query String / Basic: Only delivers resources from cache when there is no query string.
- Ignore Query String / Simple: Delivers the same resource to everyone independent of the query string (note: this will also remove the query string from the request to your origin).
- Standard / Aggressive: Delivers a different resource each time the query string changes.
Note: CloudFlare, by default, does not cache HTML content. You need to write a Page Rule to cache static HTML content.
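To make the three levels concrete, here is a minimal sketch of how the query string could affect which cached copy gets looked up. It only models the descriptions quoted above; the function name and the level labels are my own, not anything from CloudFlare.

```python
from urllib.parse import urlsplit


def cache_lookup_key(url, level):
    """Return the key used to look up `url` in cache, or None if the
    request is not served from cache at this level.

    `level` is one of "basic", "simple" or "aggressive" (labels made up
    for this sketch, following the level names quoted above).
    """
    parts = urlsplit(url)
    base = parts.netloc + parts.path
    if level == "basic":
        # No Query String: only cache hits when there is no query string.
        return base if not parts.query else None
    if level == "simple":
        # Ignore Query String: everyone gets the same resource.
        return base
    if level == "aggressive":
        # Standard: a different resource for each distinct query string.
        return base + "?" + parts.query if parts.query else base
    raise ValueError("unknown level: " + level)
```

So a request for style.css?v=2 is skipped by the first level, shares one copy with style.css at the second, and gets its own copy at the third.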
The key part here was “CloudFlare, by default, does not cache HTML content.” So I asked myself… how does Always Online work then? Looking at the site, I could not figure it out:
Always Online is a feature that caches a static version of your pages in case your server goes offline.
Our user agent
Mozilla/5.0 (compatible; CloudFlare-AlwaysOnline/1.0; +https://www.cloudflare.com/always-online) AppleWebKit/534.34
Why we’re crawling
If your server ever goes offline, CloudFlare will serve a limited copy of your cached website to keep it online for your visitors. CloudFlare builds the Always Online version of your website, so your most popular pages are represented. CloudFlare is caching pages when you see the crawler in your logs.
We crawl free customers once every 7 days, Pro customers once every 3 days, and Business and Enterprise customers daily.
That sounds like it should cache HTML; otherwise, how could it keep my site online every time my server goes down? But somehow it doesn’t, and that was really bugging me. Thankfully, the note we found earlier (“CloudFlare, by default, does not cache HTML content. You need to write a Page Rule to cache static HTML content.”) gave us an answer: Page Rules.
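Before moving on to Page Rules, one quick sanity check is whether the Always Online crawler has visited at all. A rough sketch, assuming your access log records the user agent on each line; the sample log lines below are invented for illustration, and only the user agent string comes from the text quoted above.

```python
# Marker taken from the Always Online user agent quoted above.
ALWAYS_ONLINE_MARKER = "CloudFlare-AlwaysOnline"


def crawler_hits(log_lines):
    """Return the log lines whose text contains the crawler's user agent."""
    return [line for line in log_lines if ALWAYS_ONLINE_MARKER in line]


# Hypothetical access log lines, just to show the idea.
sample_log = [
    '203.0.113.7 "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; '
    'CloudFlare-AlwaysOnline/1.0; +https://www.cloudflare.com/always-online) '
    'AppleWebKit/534.34"',
    '198.51.100.4 "GET /about/ HTTP/1.1" 200 "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = crawler_hits(sample_log)
```

If this never finds anything in your real logs, there is probably no Always Online copy of your pages to fall back on.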
How to: Cache static HTML content using Page Rules
Well, you can cache both static and non-static HTML content… the problem is that cached non-static content is a snapshot in time, so when the page changes, visitors keep getting the old version. That is why caching dynamic pages is not recommended. So let’s see how CloudFlare suggests we do this:
How do I cache static HTML?
- Log into your CloudFlare account.
- From the dropdown menu on the top left, select your domain.
- Click the Page Rules app in the top menu.
- The first step is creating a pattern and then applying a rule to that pattern. You’ll need to find or create a way to differentiate static versus dynamic content by the URL. Some possibilities could be creating a directory for static content, appending a unique file extension to static pages, or adding a query parameter to mark content as static. Here are three examples of patterns you could create for each of those options:
  *example.com/static/* [/static/ subdirectory for static HTML pages]
  *example.com/*.shtml [.shtml file extension to signify HTML that is static]
  *example.com/*?*static=true* [adding static=true query parameter]
You’ll want to design the pattern to only describe pages you know are static.
- Click Cache everything in the Custom caching dropdown menu.
- Click Add rule.
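It helps to try a pattern against a few real URLs from your site before saving the rule. Here is a small sketch that treats each * as "match anything", which is how I read the three example patterns above. It is an approximation for testing ideas locally, not CloudFlare's actual matcher, and the helper names are my own.

```python
import re


def pattern_to_regex(pattern):
    """Turn a wildcard pattern into a regex where * matches any run of characters."""
    # Escape everything, then turn each escaped "*" back into ".*".
    return re.compile("^" + re.escape(pattern).replace(r"\*", ".*") + "$")


def matches(pattern, url):
    """True if the whole URL fits the wildcard pattern."""
    return pattern_to_regex(pattern).match(url) is not None


# The three example patterns from the steps above.
STATIC_DIR = "*example.com/static/*"
SHTML = "*example.com/*.shtml"
STATIC_PARAM = "*example.com/*?*static=true*"
```

For example, matches(STATIC_DIR, "https://www.example.com/static/about.html") is true, while a URL such as https://example.com/wp-admin/ fits none of the three patterns and would be left alone.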
If you see the HTML is not being cached, despite the cache everything rule, it means you need to override the origin cache directive with an “Edge Cache TTL” setting.
If the Cache-Control header is set to “private”, “no-store”, “no-cache”, or “max-age=0”, or if there is a cookie in the response, then CloudFlare will not cache the resource, unless a Page Rule is set to cache everything and an Edge Cache TTL is set.
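The rules in the last two paragraphs are easy to mix up, so here is how I would write them down as a small decision function. This is a sketch of only what is quoted above (HTML needs a cache everything rule, and an origin that says "do not cache" additionally needs an Edge Cache TTL); the function and parameter names are made up, and real CloudFlare behavior has more cases than this.

```python
# Cache-Control values that, per the text above, stop CloudFlare from caching.
BLOCKING_DIRECTIVES = {"private", "no-store", "no-cache", "max-age=0"}


def html_is_cached(cache_control, has_cookie, cache_everything, edge_cache_ttl_set):
    """Decide whether an HTML response ends up cached under the quoted rules."""
    if not cache_everything:
        # HTML is not cached by default; a Page Rule is required.
        return False
    directives = {d.strip().lower() for d in cache_control.split(",") if d.strip()}
    origin_forbids = has_cookie or bool(directives & BLOCKING_DIRECTIVES)
    if origin_forbids:
        # Only an Edge Cache TTL overrides the origin's wishes.
        return edge_cache_ttl_set
    return True
```

The useful consequence: with a cache everything rule and no Edge Cache TTL, a page that sends Cache-Control: no-cache stays uncached, so the origin header can act as a per-page opt-out.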
So, if you are lazy and your site permits it, just match the whole thing with a pattern like *example.com/* to get it all cached. Keep the Cache-Control header set to no-cache on your dynamic pages and make sure the Edge Cache TTL is not set… and you’re golden!
Another idea is to, again, cache everything, but then add another rule that lowers the cache level on the parts of the site that are dynamic (the opposite approach). So, as you can see, you can get creative with it and figure out which approach works best for you.
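To picture the two-rule idea, here is a sketch that assumes rules are checked in order and only the first matching rule applies. That is my understanding of how Page Rules are prioritized, so confirm it against CloudFlare's documentation before relying on it. The /wp-admin/ path and the "bypass" and "cache_everything" labels are illustrative choices of mine.

```python
import re

# Order matters in this sketch: the narrow dynamic rule sits above the catch-all.
RULES = [
    ("*example.com/wp-admin/*", "bypass"),
    ("*example.com/*", "cache_everything"),
]


def _matches(pattern, url):
    """Wildcard match where * stands for any run of characters."""
    regex = "^" + re.escape(pattern).replace(r"\*", ".*") + "$"
    return re.match(regex, url) is not None


def cache_action(url, rules=RULES):
    """Return the action of the first rule matching `url`, or "default" if none do."""
    for pattern, action in rules:
        if _matches(pattern, url):
            return action
    return "default"
```

Under that assumption, swapping the two rules would make the catch-all win for /wp-admin/ as well, which is exactly the mistake to avoid.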