The saying goes, roughly, that the two hardest problems in computer science are caching, naming things, and off-by-one errors. This amusing saying extends to programming and web development as well. In this blog post, I will touch on the first problem, caching, as it relates to web browsing in general, and then give some examples of how website owners can use caching to their advantage.
Caching, as it relates to computers and the web, is a classic speed vs. size tradeoff: the more data you can hold locally, the faster you can access it. Most web browsers cache content to avoid repeated data transfers/downloads. For example, if the browser already has the latest copy of an image from a website, it shouldn’t need to download it again, because fetching the image again takes more time than using the local copy.
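To make the "latest copy" idea a bit more concrete, here is a minimal Python sketch of how a server can tell a browser that its cached copy is still good. It uses the standard HTTP ETag / If-None-Match / 304 Not Modified mechanism; the function names (make_etag, respond), the one-hour max-age, and the way the tag is derived from the content are my own assumptions for illustration, not how any particular server does it.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the content (hypothetical scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request for `body`.

    `if_none_match` is the ETag the browser sends back from its cached copy, if any.
    """
    etag = make_etag(body)
    # max-age=3600 is an assumed value: the browser may reuse its copy for an hour
    # without asking the server at all.
    headers = {"ETag": etag, "Cache-Control": "max-age=3600"}
    if if_none_match == etag:
        # The cached copy is still current: send no body, just "304 Not Modified".
        return 304, headers, b""
    # First visit, or the content changed: send the full body.
    return 200, headers, body
```

On the first request the browser gets the full image plus its ETag. On later requests it sends that tag back, and as long as the image has not changed the server answers with an empty 304 response instead of the whole file.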
If that is still a bit abstract, there is a decent real-world example that you can visualize: the refrigerator. A refrigerator is a cache for food. It would take you longer to go to a grocery store and pick up some milk than it would if the milk were already in your fridge, ready for you to use on demand.
Even though computers on the Internet interact/communicate mind-numbingly fast, where delays are typically measured in milliseconds, data transfer over vast interconnected computer networks (e.g. the Internet) does indeed take measurable time and those delays can add up to a sluggish web browsing experience, just like unexpectedly driving to the grocery store can delay dinner time.
As with the refrigerator example above, your browser/computer isn’t the only cache to consider. The grocery store is also a cache, a bigger cache, and they get their food from other sources located at even greater distances. If the store is out of milk, you would have to wait for milk to arrive, or try a different store – perhaps further away in a different town.
Similar to waiting for milk to arrive at the store, sometimes web servers take an excessive amount of time to compile all the data needed to deliver the web page that you are trying to access. This delay is typically caused by an overloaded server, one that is asked to serve more page views than it can handle. In the analogy, you are waiting for the milk to arrive at the store because the demand for milk has increased while the supply has not, leaving the store out of milk when you wanted it.
While a grocery store can simply do inventory management, and stock more milk based on demand trends, a web server is typically fixed in the amount of space it has and data it can process.
How can my website serve more viewers?
However, if you have a website, and you find that you have excess space (disk storage) available to you, what can you do to help satisfy the demand of your viewers? Well, start caching!
If your site is image/video heavy, a Content Delivery Network (CDN) can help distribute the load. The CDN servers contain a local cache of your content, and the closest/fastest server to the viewer is the one that sends the data. While a CDN is basically load balancing, the fundamental concept of caching still applies.
If your site is static-content heavy, for example a WordPress-based blog with many posts and many viewers, page-level caching can help speed things up. W3 Total Cache (W3TC) and WP Super Cache are popular WordPress plugins for this. There are also content management systems with layered caching built right in, such as MODX.
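The idea behind page-level caching can be sketched in a few lines of Python. This is only an illustration of the concept, not how W3TC, WP Super Cache, or MODX are actually implemented; the PageCache class, its ttl_seconds parameter, and the render callback are all hypothetical names I made up for the example.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Keep rendered pages in memory and reuse them until they are too old."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be exercised without waiting
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str, render: Callable[[str], str]) -> str:
        """Return the cached page for `url`, rendering it only when needed."""
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # cache hit: skip the expensive render
        html = render(url)  # cache miss or expired: do the slow work once
        self._store[url] = (now, html)
        return html

    def invalidate(self, url: str) -> None:
        """Drop a page, e.g. after the post behind it has been edited."""
        self._store.pop(url, None)
```

With something like cache = PageCache(ttl_seconds=60), a popular post is rendered once per minute no matter how many viewers request it. The tradeoff is that an edit may take up to a minute to show up unless you call invalidate for that page.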
If you have more control over your web server environment, code caching (such as APC for PHP, or OPcache, which has shipped with PHP since version 5.5) and database query caching (for example with Memcached) can also help increase the amount of traffic your dynamic site can handle.
When shouldn’t you use caching?
Well, if you are trying to host a non-trivial website (anything more than a 5-10 page, static, brochure-style website) on a cheap “$1.99/month” type hosting package, chances are all the caching in the world is not going to help, and it might actually hurt your site/account by eating up more of those “unlimited” resources than the hosting provider anticipates/allows.