28 September 2011

Drupal bootstrap early page cache


The previous article in this series focused on phase 1 of the Drupal bootstrap process (DRUPAL_BOOTSTRAP_CONFIGURATION). This article will now focus on the second phase - DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE.

The early page cache phase might not accomplish much for some Drupal sites, but for others it can play an important role in improving overall performance by minimizing latency when delivering content to a user's browser. It does this by caching already rendered pages for anonymous users (i.e. users who haven't logged in). The phase runs early in the bootstrap process because it doesn't rely on any later bootstrap phases (which gather content and render pages) and because its purpose is to keep latency to a minimum: the sooner it executes, the faster a user receives the requested content.

There are two forms of caching with Drupal: non-database and database. The early page cache relies on non-database mechanisms, such as file system and in-memory caching. Memcached (http://memcached.org) is one example of an in-memory cache. It stores data, in our case rendered pages, as key-value pairs in a hash table kept in memory. I've not had the opportunity to use this type of caching mechanism with Drupal, and for some it isn't even an option: a shared hosting provider would need to install Memcached servers and the required PHP extension in its shared environment. For security reasons this is not usually done, so this option is all but limited to dedicated hosts or virtual private servers (VPS).

For my analysis, I ended up trying a file system-based mechanism called fastpath_fscache. This Drupal module uses the server's file system to cache rendered pages. Using the file system can be faster than a database, especially on a heavily loaded server in a shared environment.

To use this module (http://drupal.org/project/fastpath_fscache), download and install it as you would any other module. (Note: you need to add the configuration parameters to the settings.php file manually, since we cannot rely on the database to store configuration when the database is only bootstrapped in a later phase. It would be nice to use the drupal_rewrite_settings() function in install.inc instead of relying on users to edit the file by hand, however.) EDIT: Unfortunately the drupal_rewrite_settings() function is limited to constants and string type variables, and is really only intended to be used during the initial setup of Drupal. So the only option is to modify the file by hand.

The configuration parameters set the page_cache_fastpath flag, specify the path location of the module's implementation of the caching interface (e.g. cache_set() and cache_get() functions), and specify a path for the file cache.
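As a rough illustration, the additions to settings.php might look something like the fragment below. The page_cache_fastpath and cache_inc keys are the Drupal 6 variables read during this bootstrap phase; the file_cache key name, the cache.fs.inc file name, and both paths are assumptions on my part, so check the module's README for the exact values before copying anything.

```php
// Added by hand to settings.php (illustrative values only).
$conf = array(
  // Turn on the fast path check in the early page cache phase.
  'page_cache_fastpath' => TRUE,
  // Path to the module's cache implementation, loaded instead of core's
  // cache.inc. The file name and location here are assumed.
  'cache_inc' => './sites/all/modules/fastpath_fscache/cache.fs.inc',
  // Directory where rendered pages are stored. Key name and path are
  // assumed placeholders; the directory must be writable by the web server.
  'file_cache' => '/tmp/fscache',
);
```

If your settings.php already defines a $conf array, merge these keys into it rather than declaring a second one, otherwise the later assignment will overwrite the earlier one.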

With the module installed and configured, the DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE phase can now perform some meaningful function. It begins by including the cache implementation file (as specified earlier in settings.php). It then tests whether the page_cache_fastpath flag has been set and, if so, tests the result returned by the page_cache_fastpath() function (implemented by the module). If both are true, the page has been served to the user and the request ends; otherwise, bootstrapping continues to the later phases. The page_cache_fastpath() function first checks that no form was submitted and that the user is not logged in; if either is the case, a cached page should not be served. It then checks whether the requested page is currently stored in the file system, and serves it if present.
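The decision logic described above can be sketched in a few lines of PHP. This is a simplified illustration under stated assumptions, not the module's or core's actual code: the function names, the $request array, and the one-file-per-URL naming scheme are all made up for the example.

```php
<?php
// Illustrative sketch only. The function names, the $request array, and the
// one-file-per-URL naming scheme are assumptions, not the module's real code.

/**
 * Simplified stand-in for a module's page_cache_fastpath().
 * Prints the cached page and returns TRUE on a hit; returns FALSE otherwise.
 */
function example_page_cache_fastpath($cache_dir, array $request) {
  // A submitted form or a logged-in user must never get a cached page.
  if ($request['form_submitted'] || $request['logged_in']) {
    return FALSE;
  }
  // Assumed naming scheme: one file per requested URL, keyed by its MD5 hash.
  $file = rtrim($cache_dir, '/') . '/' . md5($request['uri']) . '.html';
  if (!is_file($file)) {
    return FALSE;
  }
  readfile($file);
  return TRUE;
}

/**
 * Mirrors the shape of the phase: returns TRUE if the page was served from
 * the cache (where the real bootstrap would stop), FALSE to keep going.
 */
function example_early_page_cache_phase(array $conf, array $request) {
  return !empty($conf['page_cache_fastpath'])
    && example_page_cache_fastpath($conf['file_cache'], $request);
}
```

A caller would pass in the settings array and a description of the current request, then stop processing when the function returns TRUE. A FALSE result covers every case where the full bootstrap is still needed: the flag is off, a form was submitted, the user is logged in, or no cached copy exists.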

In my limited time using the early page cache, I did see a noticeable improvement (roughly 70%) in page loading time compared with no caching. I ran tests using Proxy Sniffer (http://www.proxy-sniffer.com/) on my local development environment using fastpath_fscache, and the average page load time dropped from 3.5 sec to approximately 1 sec. That may not seem like much to some, but in this day and age information delivery is paramount, and users don't like to wait longer than they have to in order to see the content they've requested. I also ran some tests using Drupal's out-of-the-box database caching mechanism, and it actually performed slightly better than the file system caching. This test was also done on my development environment, so it would be interesting to repeat it on a larger system hosting several sites to see if the results are comparable. In any event, if your site's performance is important, you should definitely consider some form of caching to improve page loading time for your users. Until next time, keep IT simple.

The next article will focus on the DRUPAL_BOOTSTRAP_DATABASE phase. Stay tuned!
