Troubleshooting Caching Problems
Last week I ran into a strange issue on my blog, which I've recently re-written in ASP.NET Core. I'd published a new post, but when I visited the blog homepage at markheath.net the post wasn't visible. I use CloudFlare, which is a great way to get SSL for free as well as boosting traffic speed, but my actual site is hosted on Azure Web Apps. So I tested bypassing CloudFlare by visiting my site directly at soundcode.azurewebsites.net, and sure enough the new blog post was visible at that address.
So the evidence seemed to point at CloudFlare serving up an old cached version of my homepage. CloudFlare offers a configurable caching service which can help boost the speed of your site. But neither purging the cache nor disabling it fixed the issue. Eventually I had to consider that something other than CloudFlare was caching the page. This is actually a very important principle in software debugging - be willing to challenge your assumptions.
I'd already ruled out it being my local browser cache by seeing the same results fetching the page with curl. So it seemed that my new ASP.NET Core website was serving a different homepage depending on whether I visited via markheath.net or soundcode.azurewebsites.net. How on earth could that be happening?
Well, I'd based the implementation of my blog on Mads Kristensen's MiniBlog.Core, a superb open source .NET Core blogging engine, which simply stores its posts in XML files rather than needing a database. I'd used a similar approach except with Markdown files.
One thing I'd copied from MiniBlog.Core was the use of the WebEssentials.AspNetCore.OutputCaching NuGet package, also written by Mads. After diving into the code, it dawned on me what the problem might be.
First of all, the key for the cache included the host name, so it was possible for markheath.net and soundcode.azurewebsites.net to cache different versions of the same page. That's not a big deal on its own, but the real problem was with the cache expiration options:
var options = new MemoryCacheEntryOptions();
options.SetSlidingExpiration(TimeSpan.FromSeconds(profile.Duration));
The SetSlidingExpiration configuration for MemoryCache means that every time an entry is retrieved from the cache, the timer is reset and the entry remains in the cache for the full duration specified. The default duration for cached pages was 1 hour, so any visit to my home page within an hour of the previous one kept it cached for another whole hour. With even one or two visitors every hour, it could be days before the home page left the cache.
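To make the timer-reset behaviour concrete, here is a minimal sketch of a single cache entry driven by a simulated clock. SlidingEntry and SlidingDemo are hypothetical names invented for this illustration, and the start date is arbitrary. This is a simplified model of the rule described above, not the actual MemoryCache implementation.

```csharp
using System;

// Hypothetical, simplified model of one cache entry with sliding expiration.
// It only mimics the timer-reset rule; it is not how MemoryCache is written.
public sealed class SlidingEntry
{
    private readonly TimeSpan _window;
    private DateTime _lastAccess;

    public SlidingEntry(TimeSpan window, DateTime createdAt)
    {
        _window = window;
        _lastAccess = createdAt;
    }

    // Returns true on a cache hit. A hit resets the timer, which is the
    // behaviour that can keep a stale page alive indefinitely.
    public bool TryGet(DateTime now)
    {
        if (now - _lastAccess >= _window)
        {
            return false; // idle for a full window, so the entry has expired
        }
        _lastAccess = now;
        return true;
    }
}

public static class SlidingDemo
{
    // Simulates one visitor every `gap` and counts how many consecutive
    // visits are served from the stale entry, up to maxVisits.
    public static int CountStaleHits(TimeSpan window, TimeSpan gap, int maxVisits)
    {
        var start = new DateTime(2018, 1, 1, 0, 0, 0, DateTimeKind.Utc); // arbitrary
        var entry = new SlidingEntry(window, start);
        var now = start;
        int hits = 0;
        for (int i = 0; i < maxVisits; i++)
        {
            now += gap;
            if (!entry.TryGet(now))
            {
                break;
            }
            hits++;
        }
        return hits;
    }
}
```

With a one-hour window and a visitor every 45 minutes, CountStaleHits never sees a miss: all 100 simulated visits (about three days) are served from the same entry. With a visitor every 61 minutes, the very first visit is a miss.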
There are two ways to fix this. The first is to use absolute expiration instead of sliding expiration. This means that once the hour is up, the entry is removed no matter how often it has been read, and the next request goes to disk to get a fresh copy.
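As a rough sketch of that first option, the options object could be built with SetAbsoluteExpiration from the Microsoft.Extensions.Caching.Memory package in place of SetSlidingExpiration. The CacheOptionsSketch class and the durationSeconds parameter (standing in for profile.Duration) are hypothetical names for this example. This is not the change made in the OutputCaching project itself.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical helper for illustration; requires the
// Microsoft.Extensions.Caching.Memory NuGet package.
public static class CacheOptionsSketch
{
    // durationSeconds stands in for profile.Duration from the snippet above.
    public static MemoryCacheEntryOptions WithAbsoluteExpiration(double durationSeconds)
    {
        var options = new MemoryCacheEntryOptions();
        // The entry expires this long after it was added to the cache,
        // regardless of how often it is read in the meantime.
        options.SetAbsoluteExpiration(TimeSpan.FromSeconds(durationSeconds));
        return options;
    }
}
```

It is also possible to set both a sliding and an absolute expiration on the same options object, in which case the absolute value acts as an upper bound on how long a frequently read entry can survive.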
The second is to use some other mechanism to invalidate the cache. And the WebEssentials.AspNetCore.OutputCaching project did actually have a mechanism in place. It used the AddExpirationToken method in conjunction with a file watcher to invalidate the cache:
foreach (string globs in profile.FileDependencies)
{
    options.AddExpirationToken(env.ContentRootFileProvider.Watch(globs));
}
So if any file in my folder of Markdown blog posts changed, the cache would have been invalidated. So why didn't it work in this instance? Well, the issue was that I had scheduled this particular post for the future. When the file on disk changed, the cache was invalidated, and the home page, which did not yet include the new post, was cached again with a sliding expiration. The small but steady stream of visitors kept that old version of the home page in the cache, and once the post's go-live date had passed, nothing on disk changed, so the cache was never invalidated.
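One possible workaround for the scheduled-post case, offered only as a sketch, is to work out when the next scheduled post goes live and make sure the cached page cannot outlive that moment. ScheduledPostSketch and NextGoLive are hypothetical names, and the assumption that the blog can list its posts' publish dates is mine, not something the OutputCaching package provides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ScheduledPostSketch
{
    // Returns the publish date of the earliest post that is still in the
    // future relative to `now`, or null if nothing is scheduled.
    public static DateTimeOffset? NextGoLive(
        IEnumerable<DateTimeOffset> publishDates, DateTimeOffset now)
    {
        return publishDates
            .Where(d => d > now)
            .OrderBy(d => d)
            .Select(d => (DateTimeOffset?)d)
            .FirstOrDefault();
    }
}
```

If a date is returned, it could be passed to the SetAbsoluteExpiration overload that takes a DateTimeOffset, so that the cached home page expires at the moment the post becomes visible, even though no file on disk changes.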
This problem goes to prove the saying that there are only two hard things in computer science - cache invalidation and naming things. By sheer coincidence, just a few days later, I ran into another similar problem with the use of a sliding cache expiration in a different project.
I've raised an issue on the WebEssentials.AspNetCore.OutputCaching project to start a conversation on how best to work around this.
My key takeaways from troubleshooting this issue are:
- Challenge your assumptions - The problem is not always in the piece of code you think it is.
- Think carefully about cache invalidation - When done right, caching can bring incredible performance benefits. When done wrong, it can break your entire application.
I should also mention that my friend Elton has a Pluralsight course on Caching in .NET which is well worth checking out if you want to dive deeper into this topic.