July 30th, 2008

PHP5 Autoloading on Steroids

Author: tim.ariyeh

For an overview of autoloading in general, please see my previous article.

Much fuss is made over the performance penalties of using PHP’s magic autoloading functionality. While I believe that it reduces overall project complexity and increases productivity, there are some very high-traffic projects I’ve overseen where the performance overhead becomes noticeable. This is particularly true for page requests that are already sensitive to delays, such as AJAX or REST requests. In this article, I’d like to showcase a couple of options for mitigating (and even eliminating) autoloader performance penalties.

Please note that all code examples in this article are strictly for illustrative purposes, and are not at all ready for production sites. These concepts should serve as starting points to implement your own high performance autoloaders.

The Problem

While autoload is a godsend for managing large projects, you do incur a small performance penalty for its use. This overhead occurs primarily because:

  • Most autoloader methods search through the include path
  • Most autoloader methods include files using relative paths
  • PHP must devote execution time to loading each class just in time. This magic comes at a price.

If this process occurs a few dozen times per request, and requests are popping in around 10/second, you might find yourself in a boring meeting discussing how the hell your team can decrease latency.
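
To make the first bullet concrete, here is a rough sketch of the search that a relative include sets off. The countIncludePathProbes() function and the probe_demo_ directory names are hypothetical, invented for this illustration; PHP's real lookup also consults its realpath cache, so treat the numbers as a picture of the idea rather than a measurement.

```php
<?php
// Hypothetical helper (not part of PHP or of the article's loader): mimics how
// a relative include walks the include_path, counting the filesystem checks.
function countIncludePathProbes($relativeFile, array $includePaths)
{
    $probes = 0;
    foreach ($includePaths as $dir) {
        $probes++;
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $relativeFile;
        if (is_file($candidate)) {
            return array('found' => true, 'probes' => $probes);
        }
    }
    return array('found' => false, 'probes' => $probes);
}

// Demo setup: three directories, with the class file living in the last one.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'probe_demo_' . uniqid();
$dirs = array("$base/a", "$base/b", "$base/c");
foreach ($dirs as $dir) {
    mkdir($dir, 0700, true);
}
file_put_contents("$base/c/KrunkTastic.php", "<?php class KrunkTastic {}\n");

$result = countIncludePathProbes('KrunkTastic.php', $dirs);
echo "Filesystem checks before the file was found: {$result['probes']}\n";
```

Every directory that sits ahead of the right one costs a check, and that cost is paid again for each class on each request.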

Step 1: Include Path Caching

By and large, the content of include directories for production projects does not change. If a class file was found during the last request, why search for it again during the current one?

Rather than simply calling include() (or require() ) from your autoloading methods, do a little digging to discover the absolute path of the file for a particular class. This will be an expensive operation, but it need only be performed once. Each subsequent request can simply result in the direct inclusion of the correct file, rather than a hunt through the include_path. An example:

  class CodeFeast_Loader
  {
    protected static $include_paths = array();
    protected static $include_paths_loaded = false;
    protected static $file_location_cache = array();

    public static function autoload($class)
    {
      //check if this class is cached
      if(isset(self::$file_location_cache[$class]) )
      {
        //It was cached.  No need to search, just load
        require(self::$file_location_cache[$class]);
        return;
      }
      //It wasn't cached.  We've got to hunt
      //did we nab the include_path yet?
      if(!self::$include_paths_loaded)
      {
        self::populateIncludePaths();
      }
      //Look in each path for the class file
      foreach(self::$include_paths as $path)
      {
        //realpath() gives the absolute path, or false if there is no such file
        $file = realpath($path . DIRECTORY_SEPARATOR . "$class.php");
        if($file !== false)
        {
          //We found our file.  Add it to the cache
          self::$file_location_cache[$class] = $file;
          //And load it
          require($file);
          //Stop hunting so the file is not loaded twice
          return;
        }
      }
    }

    //grab the include_path from php.ini
    public static function populateIncludePaths()
    {
      self::$include_paths = explode(PATH_SEPARATOR,
        ini_get('include_path') );
      self::$include_paths_loaded = true;
    }

    //read one cache file; an empty array means it is missing or corrupt
    protected static function readCache($file)
    {
      if(!is_readable($file) )
      {
        return array();
      }
      $cache = unserialize(file_get_contents($file) );
      return is_array($cache) ? $cache : array();
    }

    //load the cache from 'tmp'
    public static function openCache()
    {
      self::$file_location_cache =
        self::readCache('tmp/file_location_cache.bin');
    }

    //persist the cache to dir 'tmp'
    public static function saveCache()
    {
      $file_location_cache =
        serialize(self::$file_location_cache);
      file_put_contents('tmp/file_location_cache.bin',
        $file_location_cache);
    }

    //init the autoloader
    public static function start()
    {
      //Register the autoloader
      spl_autoload_register(
        array('CodeFeast_Loader', 'autoload')
      );

      //Restore the cache
      self::openCache();
    }
  }

  CodeFeast_Loader::start();

  //Code for your application goes here

  //Call the saveCache() method to persist the cache across requests
  CodeFeast_Loader::saveCache();

Using this autoloader class, the actual search for the file is only performed the first time a file is requested. Furthermore, calling require() rather than require_once() skips the check for whether the file has already been included, and handing it the absolute path spares PHP a trip through the include_path.
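
The listing above rewrites the cache file at the end of every request, whether or not anything new was found. One possible refinement, sketched here under my own assumptions, is to write only when the cache has changed and to swap the file in with a rename so a concurrent request never reads a half-written cache. The PathCacheStore class and its method names are hypothetical, and the demo writes to the system temp directory purely so it can run anywhere.

```php
<?php
// Hypothetical sketch (not the article's class): a path cache that is only
// written when it changed, using a temp file plus rename() for the swap.
class PathCacheStore
{
    private $file;
    private $entries = array();
    private $dirty = false;

    public function __construct($file)
    {
        $this->file = $file;
        // On the very first request there is no cache file yet
        if (is_readable($file)) {
            $data = unserialize(file_get_contents($file));
            if (is_array($data)) {
                $this->entries = $data;
            }
        }
    }

    public function get($class)
    {
        return isset($this->entries[$class]) ? $this->entries[$class] : null;
    }

    public function set($class, $path)
    {
        if (!isset($this->entries[$class]) || $this->entries[$class] !== $path) {
            $this->entries[$class] = $path;
            $this->dirty = true;
        }
    }

    // Returns true only when a write actually happened
    public function save()
    {
        if (!$this->dirty) {
            return false;
        }
        $tmp = $this->file . '.' . uniqid('', true) . '.tmp';
        if (file_put_contents($tmp, serialize($this->entries)) === false) {
            return false;
        }
        if (!rename($tmp, $this->file)) {
            unlink($tmp);
            return false;
        }
        $this->dirty = false;
        return true;
    }
}

// Demo only: a real cache belongs somewhere that is not world writable
$cacheFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'codefeast_demo_' . uniqid() . '.bin';
$store = new PathCacheStore($cacheFile);
$store->set('KrunkTastic', '/var/www/classes/KrunkTastic.php');
$firstSave = $store->save();   // the cache changed, so this writes
$secondSave = $store->save();  // nothing changed, so this is a no-op

// A later request would rebuild the store from the same file
$reloaded = new PathCacheStore($cacheFile);
echo $reloaded->get('KrunkTastic'), "\n";

// Saving can also be left to PHP instead of a manual call at the end
register_shutdown_function(array($store, 'save'));
```

On a warm cache this means most requests never touch the cache file for writing at all.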

Step 2: Anticipatory Autoloading

So, you’ve implemented path caching in your autoloaders, but your users are still complaining that their annoying popups aren’t loading fast enough? Let’s figure out how to keep the convenience, but completely eliminate calls to autoload.

Much like the include_path, the specific classes loaded for each section of your application are unlikely to change from request to request. If we can find a unique identifier that will partition our apps into segments (such as the URL), we can simply have our autoloader remember what files it had to load from the last request, and skip the autoloader overhead altogether.
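
If the URL is the identifier, it probably pays to normalise it first, since a raw request URI carries the query string and would create a separate segment for every combination of parameters. A minimal sketch, assuming a hypothetical segmentFromUri() helper of my own naming:

```php
<?php
// Hypothetical helper: reduce a request URI to a stable segment key by
// keeping only the path and dropping the query string and trailing slash.
function segmentFromUri($requestUri)
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return '/';
    }
    // Collapse a trailing slash so "/blog/" and "/blog" share a segment
    $path = rtrim($path, '/');
    return $path === '' ? '/' : $path;
}

echo segmentFromUri('/index.php?page=2&sort=asc'), "\n"; // /index.php
echo segmentFromUri('/blog/'), "\n";                     // /blog
echo segmentFromUri('/'), "\n";                          // /
```

In an MVC application the controller and action names would likely make an even better key, since many URLs map to the same code.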

Let’s start with our previous example, and add some enhancements:

  class CodeFeast_Loader
  {
    protected static $include_paths = array();
    protected static $include_paths_loaded = false;
    protected static $file_location_cache = array();
    protected static $app_segment_cache = array();
    protected static $app_segment = '';

    public static function autoload($class)
    {
      //check if this class is cached
      if(isset(self::$file_location_cache[$class]) )
      {
        //It was cached.  No need to search, just load
        require(self::$file_location_cache[$class]);
        //And remember it for this segment
        self::$app_segment_cache[self::$app_segment][$class] =
          self::$file_location_cache[$class];
        return;
      }
      //It wasn't cached.  We've got to hunt
      //did we nab the include_path yet?
      if(!self::$include_paths_loaded)
      {
        self::populateIncludePaths();
      }
      //Look in each path for the class file
      foreach(self::$include_paths as $path)
      {
        //realpath() gives the absolute path, or false if there is no such file
        $file = realpath($path . DIRECTORY_SEPARATOR . "$class.php");
        if($file !== false)
        {
          //We found our file.  Add it to the cache
          self::$file_location_cache[$class] = $file;
          //And load it
          require($file);
          //And add it to the segment cache, keyed by class name.
          //Recording after the require keeps parents ahead of children.
          self::$app_segment_cache[self::$app_segment][$class] = $file;
          //Stop hunting so the file is not loaded twice
          return;
        }
      }
    }

    //grab the include_path from php.ini
    public static function populateIncludePaths()
    {
      self::$include_paths = explode(PATH_SEPARATOR,
        ini_get('include_path') );
      self::$include_paths_loaded = true;
    }

    //read one cache file; an empty array means it is missing or corrupt
    protected static function readCache($file)
    {
      if(!is_readable($file) )
      {
        return array();
      }
      $cache = unserialize(file_get_contents($file) );
      return is_array($cache) ? $cache : array();
    }

    //load the caches from 'tmp'
    public static function openCache()
    {
      self::$file_location_cache =
        self::readCache('tmp/file_location_cache.bin');
      self::$app_segment_cache =
        self::readCache('tmp/app_segment_cache.bin');
    }

    //persist the caches to dir 'tmp'
    public static function saveCache()
    {
      $file_location_cache =
        serialize(self::$file_location_cache);
      file_put_contents('tmp/file_location_cache.bin',
        $file_location_cache);

      $app_segment_cache =
        serialize(self::$app_segment_cache);
      file_put_contents('tmp/app_segment_cache.bin',
        $app_segment_cache);
    }

    //init the autoloader
    //We add a parameter to identify which part of the app we're in
    //This could be a URL, or a controller action in MVC
    public static function start($app_segment)
    {
      self::$app_segment = $app_segment;
      //Register the autoloader
      spl_autoload_register(
        array('CodeFeast_Loader', 'autoload')
      );

      //Restore the cache
      self::openCache();

      //Load every cached autoload for this segment as a simple require
      if(isset(self::$app_segment_cache[$app_segment]) )
      {
        foreach(self::$app_segment_cache[$app_segment] as $class => $include)
        {
          //Skip anything an earlier require already pulled in
          if(!class_exists($class, false) && !interface_exists($class, false) )
          {
            require($include);
          }
        }
      }
    }
  }

  //We'll start the autoloader using the URL as the app segment
  CodeFeast_Loader::start($_SERVER['REQUEST_URI']);

  //Code for your application goes here

  //Call the saveCache() method to persist the cache across requests
  CodeFeast_Loader::saveCache();

And now we’ve rid ourselves of autoload’s performance overhead without sacrificing any of its convenience.

This class has little effect on the initial, uncached request to an app segment, but really shines on subsequent requests.

Since this class remembers what it autoloaded, it can eliminate itself altogether and bring what it will need in as simple includes. Best of all, it still has autoload functionality in the unlikely event that the required classes change between requests.

Since we’ve chosen to segment our app by URL, I’ll use “index.php” as an example. You could also easily segment by module, controller, or action.

When the first request is made for the location “index.php”:

  1. The autoloader is woken up and restores its cache files
  2. Every class that is needed for index.php will be autoloaded as usual
  3. The autoloader will remember everything it had to autoload for “index.php”

Now, when the second request is made for this same location, this happens:

  1. The autoloader is woken up and restores its cache files
  2. Every class that it had to autoload last time is immediately included
  3. The people rejoice
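
The two walkthroughs above can be played out in miniature. This is a toy simulation under my own assumptions: the simulateRequest() function and the class names are hypothetical, and no files are actually loaded. It only tracks which classes would arrive as plain requires and which would still fall through to the autoloader.

```php
<?php
// Hypothetical simulation of the segment cache: nothing is included, it only
// counts preloaded classes versus classes that still needed the autoloader.
function simulateRequest($segment, array $classesNeeded, array &$segmentCache)
{
    // Step 1: whatever was remembered for this segment is preloaded
    $preloaded = isset($segmentCache[$segment]) ? $segmentCache[$segment] : array();

    // Step 2: anything else the request needs falls back to autoloading
    $autoloaded = array();
    foreach ($classesNeeded as $class) {
        if (!in_array($class, $preloaded, true)) {
            $autoloaded[] = $class;
        }
    }

    // Step 3: remember the newly autoloaded classes for next time
    $segmentCache[$segment] = array_merge($preloaded, $autoloaded);

    return array(
        'preloaded'  => count($preloaded),
        'autoloaded' => count($autoloaded),
    );
}

$cache = array();
$needed = array('Request', 'Router', 'View');

$first  = simulateRequest('index.php', $needed, $cache);
$second = simulateRequest('index.php', $needed, $cache);
// The code behind index.php changes and now also needs a Mailer class
$third  = simulateRequest('index.php', array('Request', 'Router', 'View', 'Mailer'), $cache);

foreach (array('first' => $first, 'second' => $second, 'third' => $third) as $label => $r) {
    echo "$label request: {$r['preloaded']} preloaded, {$r['autoloaded']} autoloaded\n";
}
```

The third request shows the safety net: the new class is autoloaded once, then joins the preloaded set.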

Security Note

Do not store the autoloader cache files in /tmp, or any other world writable area. The last thing any of us needs is some little weenus having the ability to arbitrarily load include files on our server. Also, you really should perform some sanity checks on the files you’re including in your autoloaders.
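
As one example of such a sanity check, a loader could refuse any cached path that does not resolve to a .php file inside a directory it trusts. This is only a sketch: the isSafeInclude() function and the sanity_demo_ directory are hypothetical names, and a real application would need to decide which directories belong on the list.

```php
<?php
// Hypothetical helper: accept a cached path only if it resolves to an
// existing .php file inside one of the allowed directories.
function isSafeInclude($file, array $allowedDirs)
{
    // realpath() resolves "..", symlinks and relative parts
    $real = realpath($file);
    if ($real === false || !is_file($real) || substr($real, -4) !== '.php') {
        return false;
    }
    foreach ($allowedDirs as $dir) {
        $realDir = realpath($dir);
        if ($realDir === false) {
            continue;
        }
        // The trailing separator stops "/classes_evil" matching "/classes"
        $prefix = rtrim($realDir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
        if (strpos($real, $prefix) === 0) {
            return true;
        }
    }
    return false;
}

// Demo setup: one trusted directory and one file outside of it
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'sanity_demo_' . uniqid();
mkdir("$base/classes", 0700, true);
file_put_contents("$base/classes/Good.php", "<?php class Good {}\n");
file_put_contents("$base/evil.php", "<?php // not ours\n");

$allowed = array("$base/classes");
$good    = isSafeInclude("$base/classes/Good.php", $allowed);
$escaped = isSafeInclude("$base/classes/../evil.php", $allowed);
$missing = isSafeInclude("$base/classes/Missing.php", $allowed);
var_dump($good, $escaped, $missing);
```

Running each cached entry through a check like this before the require costs a little of the speed gained, so it may be worth doing once when the cache is loaded rather than on every include.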

Conclusion

I hope these crude classes effectively illustrate that you needn’t suffer autoloader overhead with every request. I should also note that I don’t advocate the use of steroids. Sure, you look great on the beach, but they shrink your giblets.

July 14th, 2008

__autoload Sucks

Author: tim.ariyeh

It Seemed Like a Good Idea at the Time

I’ve recently noticed a few articles advocating the use of PHP’s __autoload function to manage project includes. I’m sure these authors have good intentions, but the __autoload functionality became outdated before anyone really even started using it.

The __autoload function was introduced way back in PHP5 as a means to dynamically load class files as you need them. It was definitely a step in the right direction, and using it was a good design choice for all but the smallest object-oriented projects.

In case you missed it, __autoload replaced the need to call individual includes for each external class file. For instance, your only option for loading a file “KrunkTastic.php” which contains the “KrunkTastic” class definition used to look like this:

  include('classes/KrunkTastic.php');

  $obj = new KrunkTastic();

Clearly, this could get to be a big pain as your project grew. You were then faced with loading every class you’d ever need, or remembering to load them yourself every time you needed to get your krunk on. Both of these solutions sucked. Luckily, the PHP developers agreed, and gave birth to a good first draft solution.

Once a function called __autoload is defined, PHP will call it when it comes across a class name it doesn’t know about. This gives you a chance to load the class file before PHP tries to use the class. Consider the previous example:

  //This function can be defined anywhere
  //(__autoload was removed in PHP 8.0, so this runs on PHP 5 and 7 only)
  function __autoload($className)
  {
     include("classes/$className.php");
  }
  $obj = new KrunkTastic();

When PHP 5 gets to the line that instantiates a new object of type “KrunkTastic”, it realizes that no class by that name has been declared. It does, however, notice that we’ve defined a function called “__autoload”. PHP then calls our __autoload function, passing the name of the unknown class as our “$className” parameter. This gives us one last chance to include the proper class file. That’s right, PHP4, you suck.

Abomination!

This all seems pretty slick, but within this architecture lurks a deep and insidious evil. Since __autoload is declared as a global function with a fixed name, it can only be declared once. This severely limits the developers of third-party toolkits and libraries in their use of this functionality. If you integrate libraries into your application that have already declared an __autoload function, you can’t re-declare it for your own code.

This means that, while it’s a perfectly adequate solution for smaller projects, __autoload is pretty crappy for the big projects that need it the most. The PHP team, who would not stand for such a bummer, then crafted a much more workable solution.

Enter spl_autoload

PHP 5.1.2 introduced the spl_autoload family of functions. This gave developers the ability to switch from a single, global __autoload function to a global autoload stack of functions, registered using spl_autoload_register. An example:

  //Register an autoload function
  function myAutoLoader($className)
  {
     //Only include the file if it lives here, otherwise
     //stay quiet and let the next loader on the stack try
     if(is_file("classes/$className.php") )
     {
        include("classes/$className.php");
     }
  }
  spl_autoload_register('myAutoLoader');
  //Register another autoload function
  function myOtherAutoLoader($className)
  {
     if(is_file("include/$className.php") )
     {
        include("include/$className.php");
     }
  }
  spl_autoload_register('myOtherAutoLoader');
  //Register a static method from a class as an autoloader
  class AutoLoader
  {
     public static function load($className)
     {
        if(is_file("classes/$className.php") )
        {
           include("classes/$className.php");
        }
     }
  }
  spl_autoload_register( array('AutoLoader', 'load') );
  $obj = new KrunkTastic();

It should be noted that as soon as you make a call to spl_autoload_register, any legacy __autoload function will be ignored. That’s what it deserves for being old and busted. If you really want to use it, you have to manually add it to the spl_autoload stack with a call to spl_autoload_register('__autoload').
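
To watch the stack at work, here is a small self-contained sketch. The three loader functions and the spl_demo_ directory are hypothetical names made up for the demo; the class file is written to the system temp directory so the example can run on its own.

```php
<?php
// Demo setup: a class file in a throwaway directory
$classDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_demo_' . uniqid();
mkdir($classDir, 0700, true);
file_put_contents(
    $classDir . DIRECTORY_SEPARATOR . 'KrunkTastic.php',
    "<?php class KrunkTastic {}\n"
);

// Records the order in which the loaders are called
$calls = array();

// Hypothetical loader that never finds anything
function emptyHandedLoader($className)
{
    global $calls;
    $calls[] = 'emptyHandedLoader';
}

// Hypothetical loader that looks in the demo directory
function tempDirLoader($className)
{
    global $calls, $classDir;
    $calls[] = 'tempDirLoader';
    $file = $classDir . DIRECTORY_SEPARATOR . "$className.php";
    if (is_file($file)) {
        include $file;
    }
}

// Hypothetical loader registered last
function lastResortLoader($className)
{
    global $calls;
    $calls[] = 'lastResortLoader';
}

spl_autoload_register('emptyHandedLoader');
spl_autoload_register('tempDirLoader');
spl_autoload_register('lastResortLoader');

$obj = new KrunkTastic();

// Loaders run in registration order until the class exists
echo implode(' -> ', $calls), "\n";
echo count(spl_autoload_functions()), " loaders on the stack\n";
```

PHP stops walking the stack as soon as one loader has defined the class, so the last loader is never asked. The spl_autoload_functions() call returns whatever is currently registered, which is handy when a third-party library has added loaders of its own.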

Conclusion

Using legacy autoloading via __autoload has been obsolesced by superior autoloader stacking, which offers much more flexibility, and interoperability with third-party libraries. Go forth and prosper, and stop loading your class files manually like a masochist. Unless you’re a masochist, in which case I respect your lifestyle choice.