
phpBB Performance – Reducing the script load overhead

This is the third in a series of phpBB performance articles.  In this one I want to investigate how feasible it would be to use code globbing to reduce both the number of include files that need to be parsed and the total size of this source, in order to reduce the overall response time for users of a phpBB forum running on a shared service.  This technique is one that I use to good effect on my blog application.  It’s a fairly long article, so I’ve split it into sub-sections.

Inclusions – the basics

However, a little review of how applications include sub-modules is probably first in order.  Basically PHP offers six built-in functions to compile source code into an application.  Two, eval() and create_function(), allow developers to compile source held in a text string within an application.  However, I want to focus on the four remaining forms that read from PHP source files: require(), require_once(), include() and include_once().  I will refer to these collectively as “included” code.  All four essentially do the same except:

You may have come across a number of posts and articles in the blogosphere discussing the differences in runtime speed of these variants, but quite frankly if you look at the source code for these functions, then you will see that you should ignore such claims: there are no material performance differences; only the functional ones that I’ve just summarised.

There is little advantage in using the include forms if you are including code which you need (for example defining functions that the application calls later).  This simply results in the application dying later rather than earlier in the execution, and IMHO this is usually bad practice, as termination part way through processing a request can leave persistent data in an undefined state.  All or nothing is usually far safer.  Likewise if you are including source anywhere other than at the top level (for example in a class or function), the _once forms are generally preferable unless you prefix the inclusion call with a class_exists() or function_exists() check to avoid repeated compilation.
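As a minimal sketch of these functional differences, the following writes two throwaway modules to the temp directory and loads them in the various ways.  The file names, the inc_demo_loads counter and the inc_demo_hello() function are invented purely for illustration.

```php
<?php
// Hypothetical module, written to a temp file so the example is self-contained.
$module = sys_get_temp_dir() . '/inc_demo_' . getmypid() . '.php';
file_put_contents($module, "<?php\n\$GLOBALS['inc_demo_loads']++;\nreturn 'loaded';\n");
$GLOBALS['inc_demo_loads'] = 0;

$first  = include_once $module;   // runs the file and passes back its return value
$second = include_once $module;   // already included: returns true, file is not re-run
$third  = include $module;        // a plain include always runs the file again

// A missing file: include only raises warnings (suppressed here) and returns false.
// The require forms would stop the script with a fatal error instead.
$missing = @include $module . '.does-not-exist';

// Guarding a non-_once inclusion of a function library, as described above.
$lib = sys_get_temp_dir() . '/inc_demo_lib_' . getmypid() . '.php';
file_put_contents($lib, "<?php\nfunction inc_demo_hello() { return 'hello'; }\n");
for ($n = 0; $n < 2; $n++) {
    if (!function_exists('inc_demo_hello')) {
        require $lib;             // skipped on the second pass, so no redeclaration error
    }
}

unlink($module);
unlink($lib);
```

Without the function_exists() check the second pass through the loop would fail with a "cannot redeclare" fatal error, which is exactly the case that the _once forms protect against.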

The PHP scoping rules for includes can be confusing for inexperienced PHP programmers, and can even catch out experienced ones occasionally.  Any included functions and classes are always global.  Function invocations within the included code can also have global side-effects (a good example is the define function used to declare global constants), as can simple assignments to global variables through the $GLOBALS array, such as $GLOBALS['fred'] = …;  Any other assignments have a scope determined by the routine that included the code (which under normal PHP scoping rules could then be global if the include was at the top level or the variables were explicitly declared as global in the calling code).  Lastly, if an include contains a return statement at its top level, then execution of the included file stops at that point and the return value is passed back as that of the include.

A good example of how this can be very confusing is if a class method invocation includes code.
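Here is a small sketch of that case.  The included file, the ScopeDemo class and all of the variable names are made up for illustration; the point is simply to show which of the included file's effects end up global and which stay inside the method.

```php
<?php
// Hypothetical included file: one function, one constant, one explicit global,
// one plain assignment, a peek at $this, and a return value.
$file = sys_get_temp_dir() . '/scope_demo_' . getmypid() . '.php';
file_put_contents($file, <<<'MODULE'
<?php
function scope_demo_helper() { return 'global function'; }
define('SCOPE_DEMO_CONST', 42);
$GLOBALS['scope_demo_global'] = 'set via $GLOBALS';
$local = 'plain assignment';
$seen_this = isset($this) ? get_class($this) : null;
return 'module result';
MODULE
);

class ScopeDemo
{
    public function load($file)
    {
        $result = include $file;
        // $local and $seen_this now exist here, in the method's scope only.
        return array($result, $local, $seen_this);
    }
}

$demo = new ScopeDemo();
list($result, $local_in_method, $seen_this) = $demo->load($file);
unlink($file);

var_dump(function_exists('scope_demo_helper'));   // the function is global
var_dump(defined('SCOPE_DEMO_CONST'));            // so is the constant
var_dump(isset($GLOBALS['scope_demo_global']));   // and the explicit global
var_dump(isset($GLOBALS['local']));               // but $local never left the method
var_dump($seen_this);                             // the included code even saw $this
```

So the same included file behaves quite differently depending on where it is pulled in from: at the top level $local would have been a global, whereas here it silently becomes a local of ScopeDemo::load().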

So my preference is to keep the content of include files simple and make sure that the include module makes no implicit assumptions about the including module.  The easiest approach here is to contain only items which have global scope:

If you want to use a module to initialise local-to-function data then you can use a return statement and this will be the return value of the include call.  But again, I feel it is safer simply to include this data in an initialisation function which you can call immediately after the inclusion.  Anything else is just too dangerous, as it creates implicit dependencies between the including and included modules, and these can create horrible-to-debug side effects if you want to refactor code.
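The two alternatives can be sketched side by side as follows.  Both modules and all of the ret_demo_* names are hypothetical and exist only for this example.

```php
<?php
// Hypothetical file names; both modules are written to the temp directory for the demo.
$data_file = sys_get_temp_dir() . '/ret_demo_data_' . getmypid() . '.php';
$func_file = sys_get_temp_dir() . '/ret_demo_func_' . getmypid() . '.php';
file_put_contents($data_file, "<?php\nreturn array('colour' => 'blue', 'size' => 3);\n");
file_put_contents($func_file,
    "<?php\nfunction ret_demo_defaults()\n{\n    return array('colour' => 'blue', 'size' => 3);\n}\n");

// Variant 1: the include's return value initialises local-to-function data.
function ret_demo_via_return($file)
{
    $settings = include $file;
    return $settings['colour'];
}

// Variant 2: the module only declares a function; the caller invokes it after loading.
function ret_demo_via_function($file)
{
    if (!function_exists('ret_demo_defaults')) {
        require $file;
    }
    $settings = ret_demo_defaults();
    return $settings['colour'];
}

$a = ret_demo_via_return($data_file);
$b = ret_demo_via_function($func_file);
$c = ret_demo_via_function($func_file);   // safe to call again: no second inclusion
unlink($data_file);
unlink($func_file);
```

The second variant keeps the included file down to a declaration with global scope, so it carries no hidden assumptions about the variables of whoever includes it.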

phpBB’s use of inclusions and its inclusion strategy

The phpBB coding guidelines already largely follow my preferred practice.  However, one nice feature of PHP is that it exposes its tokeniser as a callable interface, which enables the development of simple filters to check code structure.  At Listing 1 is an example that scans a source file, skipping class and function definitions and other safe constructs, leaving the remaining lines to ‘eyeball’ for problems.  At Listing 2 is a filter that generates an analysis of the source.  Both outputs can be “grepped” and manipulated further in a spreadsheet.  The advantage of using filters like these is that when you are having to review a fairly large application like phpBB (the base package is 180K lines), you only need to check a few hundred lines of non-conforming code rather than the whole lot.

global $global_privmsgs_rules, $global_rule_conditions;

I used the filter at Listing 2 to investigate how consistent the layout of the include parameter is across individual uses.  The significant majority (364 inclusions out of 388 across all php files) follow one of three variants, such as include($phpbb_root_path . 'includes/some_module.' . $phpEx), which a simple pattern replacement could process automatically.  However, 20 cases do not, and removing the root path and extension would involve code changes to the including module, each of which would require testing.
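As a rough sketch of what such an automatic replacement might look like for the first variant only, the function below rewrites a standard-form inclusion into a loader call.  The rewrite_std_include() name is invented here, and the one-argument call form that it emits is an assumption made for illustration.

```php
<?php
// Hypothetical rewrite rule for the concatenated variant:
//   include($phpbb_root_path . 'includes/some_module.' . $phpEx)
// The target call form phpbb_load_module('includes/some_module') is assumed.
function rewrite_std_include($line)
{
    $pattern = '/\b(?:include|require)(?:_once)?\s*\(\s*\$phpbb_root_path\s*\.\s*'
             . '\'([^\']+)\.\'\s*\.\s*\$phpEx\s*\)/';
    return preg_replace($pattern, "phpbb_load_module('\$1')", $line);
}

// A line in the standard form is rewritten...
$std = 'include($phpbb_root_path . \'includes/some_module.\' . $phpEx);';
echo rewrite_std_include($std), "\n";

// ...whilst one that builds its path from extra variables is left untouched
// and so still needs a manual change.
$odd = 'include($phpbb_root_path . \'language/\' . $lang . \'/common.\' . $phpEx);';
echo rewrite_std_include($odd), "\n";
```

The non-standard cases are exactly the ones that a pattern like this leaves alone, which is why they need hand editing and retesting.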

So whilst a more comprehensive code tidy-up to improve maintainability might be worthwhile as part of a 3.1 or 3.2 refactoring, it just isn’t worth the extra regression work for a proof-of-performance pilot, as it doesn’t materially impact the principle of what I am trying to demonstrate here or the run times.  A good example here is includes/acp/acp_language.php, which implements the ACP functions for language packs, if you want to drill into these yourself.

My pilot implementation of code globbing

The key to my proposed dynamic load strategy is to replace all include occurrences with a new function phpbb_load_module(), which is itself declared in the common.php module.  It can take up to three arguments:

phpbb_load_module does the following

On initialisation in common.php, if the globbed cache-file exists then it is included.  Doing so both compiles in all of the included code within it and initialises the corresponding return status array entries in a single file access, so that any subsequent phpbb_load_module calls for the same module will bypass the inclusion and return the status from the array.  If it doesn’t exist then the global prime-cache flag is set to force the phpbb_load_module calls to build the cache-file.  A hook into the exit handler rounds this process off.

Hence for example, on the first invocation of viewforum, say, the code follows the normal code path, including the required modules and building up the glob; on subsequent requests for viewforum, the code glob will be loaded and the following phpbb_load_module calls for pre-loaded modules will return the required value without doing any further file I/O.  Now this doesn’t preclude a code path which requires further modules, which will generate more include file accesses, but at least the core set are loaded in one go.
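To make this flow a little more concrete, here is a much-simplified sketch of the idea.  It is not the pilot code itself: every name in it (glob_sketch_init(), glob_sketch_load(), glob_sketch_write() and the glob_* globals) is invented for illustration, the loader takes a single argument, the compression step is left out, and it assumes that modules only declare functions or classes.

```php
<?php
// All names are hypothetical; this is not the pilot's phpbb_load_module().
$GLOBALS['glob_status'] = array();  // module path => status returned by its include
$GLOBALS['glob_source'] = array();  // module path => source collected for the glob
$GLOBALS['glob_prime']  = false;    // true when this request must build the glob

function glob_sketch_init($glob_file)
{
    if (is_file($glob_file)) {
        include $glob_file;              // one file access compiles every cached module
    } else {
        $GLOBALS['glob_prime'] = true;   // no glob yet: collect sources as modules load
    }
}

function glob_sketch_load($path)
{
    if (isset($GLOBALS['glob_status'][$path])) {
        return $GLOBALS['glob_status'][$path];   // pre-loaded: no further file I/O
    }
    $status = include $path;
    $GLOBALS['glob_status'][$path] = $status;
    if ($GLOBALS['glob_prime']) {
        $GLOBALS['glob_source'][$path] = file_get_contents($path);
    }
    return $status;
}

function glob_sketch_write($glob_file)
{
    if (!$GLOBALS['glob_prime'] || !$GLOBALS['glob_source']) {
        return;
    }
    $out = "<?php\n";
    foreach ($GLOBALS['glob_source'] as $path => $src) {
        // Assumes each module opens with a PHP tag, never closes it, and has
        // no return value that needs to be preserved.
        $out .= preg_replace('/^<\?php\s*/', '', $src) . "\n";
        $out .= '$GLOBALS[\'glob_status\'][' . var_export($path, true) . "] = 1;\n";
    }
    file_put_contents($glob_file, $out, LOCK_EX);
}

// Demo: simulate the first (priming) request against a throwaway module.
$dir = sys_get_temp_dir() . '/glob_sketch_' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir);
}
$module = "$dir/functions_demo.php";
$glob   = "$dir/viewforum.glob.php";
file_put_contents($module, "<?php\nfunction glob_sketch_demo() { return 'from module'; }\n");

glob_sketch_init($glob);      // no glob yet, so priming is switched on
glob_sketch_load($module);    // normal include; the source is remembered
glob_sketch_load($module);    // answered from the status array, no second include
glob_sketch_write($glob);     // a real request would hook this via register_shutdown_function()
```

On the next request glob_sketch_init() would find the glob file and include it, after which every glob_sketch_load() call for a module held in the glob returns straight from the status array.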

In order to keep the change impact as small as possible:

$include_result = phpbb_load_module($language_filename);
$this->lang = array_merge( $this->lang, $include_result['_RESULT_LANG'] );

So how does this version perform?

I reran a system trace of this new version a few times and got very consistent results.  The output of one is shown at Table 1 (which you might compare with the original Table 1 in my earlier article).  Instead of loading 18 application script files and 14 data cache files, I am now loading 6 script files, 1 globbed script file and 1 data cache file.  The runtime in this fully cached case is still roughly 10% shorter, despite the additional processing overhead of reading the two compressed files.  By hoisting the load of phpbb_load_module.php into each top-level request script, I could also cache common.php.  In principle, I could also include the templates in the globbed script file, but the templating engine is a bit of a mess and could do with a rewrite.  This would not only shorten the generated templates but would also allow me to include them in the cache.  A job for another day.

This 10% figure is with files fully cached.  The real saving is on that first request when they aren’t, and this new version will typically save perhaps 25-50 physical I/Os in this case.  Given that most LAMP servers end up disk I/O bound, this avoids significant aggregated I/O queuing delays as well as reducing the load on the server.  I am collecting some real performance stats on a test instance on my Webfusion service, and this fix combined with the data cache discussed in my previous article approximately halves the overall response time on first request.  The set of mods is basically working fine for the main path code, but I am triggering some of these “side-effect” artefacts, which cause some gremlins that I still need to shake down before I could regard this as anything more than alpha code suitable for proof of principle.

I have one last performance area that I want to look at: on my LAMP platform, as is the case on many such offerings, the MySQL engine is on a different server.  This in turn means that the SQL calls are effectively RPCs (remote procedure calls) across the internal network infrastructure in the data centre.  I just want to switch on SQL logging on my test rig to see if it would be worthwhile making any optimisations in this area, but this analysis is the subject of a different article.

Table 1 – Summary of Strace on viewforum.php?f=2

Time (ms)  System Call
  0  execve("/usr/bin/php", ["php", "-r", "$_GET[\"f\"]=\"2\"; include( \"viewfo"...], [/* 17 vars */]) = 0
 94  open("/var/www/forum/viewforum.php", O_RDONLY) = 3
 97  open("/var/www/forum/common.php", O_RDONLY) = 3
 99  open("/var/www/forum/includes/phpbb_load_module.php", O_RDONLY) = 3
101  open("/var/www/forum/cache/viewforum.php.gz", O_RDONLY) = 3
144  open("/var/www/forum/cache/cache_data.gz", O_RDONLY) = 3
150  open("/usr/share/mysql/charsets/Index.xml", O_RDONLY) = 4
162  open("/var/www/forum/cache/tpl_prosilver_message_body.html.php", O_RDONLY) = 4
163  open("/var/www/forum/cache/tpl_prosilver_overall_header.html.php", O_RDONLY) = 4
165  open("/var/www/forum/cache/tpl_prosilver_overall_footer.html.php", O_RDONLY) = 4
180  exit_group(0)           = ?

Listing 1 – Example of using the PHP token parser to validate coding

 # Shell driver: run the filter over every PHP file and collect the output
 cd /var/www/forum/includes;
 rm -f /tmp/yy.log; for f in $(find * -name \*.php); do  sudo php ~/work/forum/checkTopLevelAssigns.php $f >> /tmp/yy.log; done;
 vi /tmp/yy.log

<?php
// checkTopLevelAssigns.php: strip the "safe" top-level constructs, leaving the rest to eyeball
$src = file_get_contents( $argv[1] );
$tokens = token_get_all( $src );
$nTokens = count( $tokens );
$output = "";
define( 'T_DEFINE', 999 );   // pseudo-token used to count define() calls
// The counts are echoed below in the order listed here
$modType = array( T_INCLUDE=>0, T_INCLUDE_ONCE=>0, T_REQUIRE=>0, T_REQUIRE_ONCE=>0, T_FUNCTION=>0,
                  T_RETURN=>0, T_CLASS=>0, T_DEFINE=>0 );
// each() was removed in PHP 8, so the token array is walked by index instead
for( $i = 0; $i < $nTokens; $i++ ) {
    $token = $tokens[$i];
    if( is_string( $token ) ) {
        // simple 1-character token
        $output .= $token;
        continue;
    }
    // token array
    list( $id, $text ) = $token;
    switch( $id ) {
        case T_FUNCTION: case T_CLASS:
            // skip to the opening brace, then on to its matching closing brace
            while( ++$i < $nTokens && $tokens[$i] !== '{' ) {}
            $nBrace = 1;
            while( $nBrace > 0 && ++$i < $nTokens ) {
                $t = $tokens[$i];
                if( is_array( $t ) ) {
                    // "{$x}" and "${x}" inside strings open a brace that a plain '}' closes
                    if( $t[0] == T_CURLY_OPEN || $t[0] == T_DOLLAR_OPEN_CURLY_BRACES ) $nBrace++;
                } else {
                    $nBrace += ( $t == '{' ) ? 1 : ( ( $t == '}' ) ? -1 : 0 );
                }
            }
            $modType[$id]++;
            break;
        case T_STRING:
            $j = $i + ( ( isset( $tokens[$i+1] ) && is_array( $tokens[$i+1] ) && $tokens[$i+1][0] == T_WHITESPACE ) ? 2 : 1 );
            if( $text != 'define' || !isset( $tokens[$j] ) || $tokens[$j] !== '(' ) {
                $output .= $text;   // any other strings -> output "as is"
                break;
            }
            $id = T_DEFINE; // *** and fall through
        case T_INCLUDE: case T_INCLUDE_ONCE: case T_REQUIRE: case T_REQUIRE_ONCE: case T_DEFINE: case T_RETURN:
            // skip the whole statement up to its terminating semicolon
            while( ++$i < $nTokens && $tokens[$i] !== ';' ) {}
            $modType[$id]++;
            break;
        case T_COMMENT: case T_DOC_COMMENT: case T_OPEN_TAG: case T_CLOSE_TAG:
        case T_WHITESPACE:
            // comments, tags and whitespace all collapse to a single space or newline
            $output .=  ( ( strpos( $text, "\n" ) === false ) ? ' ' : "\n" );
            break;
        default:
            $output .= $text;   // anything else -> output "as is"
    }
}
echo "** $argv[1]\t". implode( "\t", $modType ) . "\n";
// Single quotes on the last pattern so that \$ reaches the regexp engine as a literal dollar
$from = array( "/\s*\n[\s\n]*/s", "/\s{2,}/s", "/^\s+/m", "/global\s.*/m",
               "/if \(!defined\('IN_PHPBB'\)\)\n\{\nexit;\n\}\n/s", '/^\$GLOBALS\[.*/m' );
$to = array( "\n", " ", "", "", "", "" );
echo preg_replace( $from, $to, $output ), "\n";

Listing 2 – Using the PHP token parser to analyse include patterns

<?php
$file = $argv[1];
$src = file_get_contents( $file );
$lines = explode( "\n", $src );
$tokens = token_get_all( $src );
$nTokens = count( $tokens );
// each() was removed in PHP 8, so the token array is walked by index instead
for( $i = 0; $i < $nTokens; $i++ ) {
    $token = $tokens[$i];
    if( is_string( $token ) ) continue;
    list( $id, $text, $lineNo ) = $token;
    switch( $id ) {
        case T_INCLUDE: case T_INCLUDE_ONCE: case T_REQUIRE: case T_REQUIRE_ONCE:
            $type = token_name( $id );
            // Collect the argument expression, minus whitespace, up to the terminating semicolon
            $arg = "";
            while( ++$i < $nTokens && $tokens[$i] !== ';' ) {
                $t = $tokens[$i];
                if( is_array( $t ) && $t[0] == T_WHITESPACE ) continue;
                $arg .= is_array( $t ) ? $t[1] : $t;
            }
            if( $arg !== '' && $arg[0] == '(' ) $arg = substr( $arg, 1, -1 );
            // The three standard layouts: concatenation, "{$var}" interpolation and "$var" interpolation
            $std1 = preg_match( '/^ \\$ phpbb_root_path \. .*? \. \\$ phpEx $ /x', $arg);
            $std2 = preg_match( '/^ "\{ \\$  phpbb_root_path \} .*? \. \\$ phpEx "$ /x', $arg);
            $std3 = preg_match( '/^ " \\$ phpbb_root_path  .*? \. \\$ phpEx "$ /x', $arg);
            $std  = $std1 + $std2 + $std3;
            // Space-compress source line so that it can be loaded into a TSV file
            $srcLine = preg_replace(array('/^\s+/m','/\s+/'),array('',' '),$lines[$lineNo-1]);
            // Output as TSV file for loading into a spreadsheet
            echo "$file\t$lineNo\t$std\t$std1\t$std2\t$std3\t$text\t$arg\t$srcLine\n";
            break;
    }
}

You should be aware that all information on this blog site is © Terry Ellison 2010 and made open access under the Creative Commons Artistic Licence.