This is the third in a series of phpBB performance articles. In this one I want to investigate how feasible it would be to use code globbing to reduce both the number of include files that need to be parsed and the total size of this source, in order to reduce the overall response time for users of a phpBB forum running on a shared service. This technique is one that I use to good effect on my blog application. It’s a fairly long article, so I’ve split it into sub-sections.
Inclusions – the basics
However, a little review of how applications include sub-modules is probably in order first. Basically PHP offers six built-in ways to compile source code into an application. Two, eval() and create_function(), allow developers to compile source held in a string within an application. However, I want to focus on the four remaining forms that read from PHP source files: require(), require_once(), include() and include_once(). I will refer to these collectively as “included” code. All four essentially do the same except:
- the require forms raise a fatal error rather than a warning if the file cannot be found or opened (a parse error in the included source is handled the same way by all four); and
- with the _once forms, PHP maintains an internal array of all modules loaded, and uses this to bypass any repeated load.
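A minimal sketch of these two differences, using a throw-away file in the system temp directory. The file name and the `loads` counter are purely illustrative and are not part of phpBB:

```php
<?php
// Illustrative only: write a tiny module to a temporary file.
$module = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($module, '<?php $GLOBALS["loads"]++; return "loaded";');

$GLOBALS['loads'] = 0;

$first  = include_once $module;  // executes the file, yields its return value
$second = include_once $module;  // already in PHP's list of included files: skipped, yields true
$third  = include $module;       // the plain form executes the file again

// A missing file: include only warns (suppressed here with @) and yields false,
// whereas require would stop the script with a fatal error.
$missing = @include $module . '.does-not-exist';

unlink($module);
```

Note that the second include_once yields true rather than the module's own return value, which matters if the calling code relies on what the include returns.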
You may have come across a number of posts and articles in the blogosphere discussing the differences in runtime speed impact of these variants, but quite frankly if you look at the source code for these functions, then you will see that you should ignore such claims: there are no material performance differences; only the functional ones that I’ve just summarised.
There is little advantage in using the include forms if you are including code which you need (for example defining functions that the application calls later). This simply results in the application dying later rather than earlier in the execution, and IMHO this is usually a bad practice, as termination part way through processing a request can leave persistent data in an undefined state. All or nothing is usually far safer. Likewise if you are including source anywhere else than at the top level (for example in a class or function), the _once forms are generally preferable unless you prefix the inclusion call by a class_exists() or function_exists() check to avoid repeated compilation.
The PHP scoping rules for includes can be confusing for inexperienced PHP programmers, and can even catch out experienced ones occasionally. Any included functions and classes are always global. Function invocations within the included code can also have global side-effects (a good example is the define function used to declare global constants), as can explicit assignments to global variables such as $GLOBALS['fred'] = …; Any other assignments have a scope determined by the routine that included the code (which under normal PHP scoping rules could then be global if the include was at the top level or the variables were explicitly declared as global in the calling code). Lastly, if an include contains a return statement, then execution of the included file stops at that point and the return value is passed back as that of the include call.
A good example of how this can be very confusing is if a class method invocation includes code.
- Any functions which are defined are global and have nothing to do with the class from which they were included
- However any references to $this will work because this variable is automatically defined in the local scope of a non-static method call, but only if the file is included for every invocation of the method. Double yukk!! phpBB makes regular use of $this (forgive the pun) in its compiled templates, but at least only in these templates.
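The scoping behaviour described above can be seen in a small sketch. The class DemoIncluder, the function helper_from_include and the temporary module are all hypothetical names invented for this illustration:

```php
<?php
// Illustrative module: defines a function, sets a plain variable, uses $this.
$module = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($module, '<?php
function helper_from_include() { return "global function"; }
$local = "set by include";
return $this->name;
');

class DemoIncluder
{
    public $name = 'includer instance';

    public function load($file)
    {
        $result = include $file;        // $this is visible inside the included file
        return array($result, $local);  // $local landed in this method's scope
    }
}

$includer = new DemoIncluder();
list($result, $local_in_method) = $includer->load($module);
unlink($module);

$is_global_function = function_exists('helper_from_include'); // the function is global
$leaked_to_global   = isset($GLOBALS['local']);               // the variable is not
```

Calling load() a second time on the same file would fail with a "cannot redeclare" error for helper_from_include(), which is exactly the sort of trap that mixing definitions and scope-dependent code in one include sets up.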
So my preference is to keep the content of include files simple and make sure that the included module makes no implicit assumptions about the including module. The easiest approach here is to contain only items which have global scope:
- Class and function definitions
- Defines
- Explicitly global assignments such as $GLOBALS['fred'] = array( … ).
- Calls to one-time / initialisation functions.
If you want to use a module to initialise local-to-function data then you can use a return statement and this will be the return value of the include call. But again, I feel it is safer simply to include this data in an initialisation function which you can call immediately after the inclusion. Anything else is just too dangerous, as it creates implicit dependencies between the including and included module, and this can create horrible-to-debug side effects if you want to refactor code.
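As a rough comparison of the two safe styles, here is a sketch with two hypothetical modules written to temporary files: one simply returns its data, the other only defines an initialisation function. The names demo_mapping_init, load_by_return and load_by_init are assumptions for illustration only:

```php
<?php
$dir = sys_get_temp_dir();

// Style 1 (hypothetical module): the file simply returns its data.
$returning = tempnam($dir, 'inc');
file_put_contents($returning, '<?php return array("a" => 1, "b" => 2);');

// Style 2 (hypothetical module): the file only defines an initialisation function.
$defining = tempnam($dir, 'inc');
file_put_contents($defining, '<?php
function demo_mapping_init() { return array("a" => 1, "b" => 2); }
');

function load_by_return($file)
{
    return include $file;        // the caller decides where the data is stored
}

function load_by_init($file)
{
    require_once $file;          // only global definitions are compiled in
    return demo_mapping_init();  // explicit call, no implicit variables
}

$map1 = load_by_return($returning);
$map2 = load_by_init($defining);
unlink($returning);
unlink($defining);
```

In neither style does the included file need to know the name of the variable that ends up holding the data, so either module can be refactored without touching its callers.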
phpBB’s use of inclusions and its inclusion strategy
The phpBB coding guidelines already largely follow my preferred practice. However, one nice feature of PHP is that it exposes its tokeniser as a callable interface, which enables the development of simple filters to check code structure. At Listing 1 is an example that scans a source file, skipping class and function definitions and other safe constructs, leaving the lines to ‘eyeball’ for problems. At Listing 2 is a filter that generates an analysis of the source. Both outputs can be “grepped” and manipulated further in a spreadsheet. The advantage of using filters like these is that when you are having to review a fairly large application like phpBB (the base package is 180K lines), then you only need to check a few hundred lines of non-conforming code rather than the whole lot.
- phpBB has a small number of entry points which can validly be invoked through a request URI: faq, feed, index, mcp, memberlist, posting, report, search, style, ucp, viewforum, viewonline, viewtopic and adm/index. These implement the obvious application functions. Another three entry points implement “behind the scenes” requests that are typically embedded in main HTML pages as references: cron to implement pseudo-batch housekeeping, download/file to handle attachments, and adm/swatch, a progress meter for various ADM functions. All other code modules are designed to be included into one or more of these top-level scripts, and the flag IN_PHPBB is used to police this.
- The includes directory hierarchy is used to contain most of these include files. The modules in this hierarchy already follow my suggested good practice, though the UTF routines make frequent use of includes to return mapping structures. I found only one case of what I would view as dangerous coding, for which I have suggested a code change through a phpBB tracker request. The fix here is to add the following line to two modules (functions_privmsgs and ucp/ucp_main) to hoist the variables from local to global scope:
global $global_privmsgs_rules, $global_rule_conditions;
- The language directory hierarchy is used to contain the mapping tables for national language (NL) support. There is a sub-directory per language, which contains a standard list of modules, each of which has a standard preamble (spread over 13 lines to make it ‘more readable’) and extends the array $lang. The exceptions are help_bbcode and help_faq, which initialise $help, search_ignore_words, which initialises $words, and search_synonyms, which initialises $synonyms; again, all of these variables are implicitly global arrays. Yukk! The strategy adopted by the UTF routines, which is simply to return an anonymous array and leave it to the calling module to store this in the correct variable, is a far cleaner and safer coding practice.
- The cache directory is used to contain the compiled templates. However, the compiler is a poor implementation which adopts quite bad programming practices and this needs to be refactored or rewritten.
- The application is inconsistent in its use of the specific include variants, for example within the includes directory hierarchy: require (37 times), require_once (0 times), include (174 times) and include_once (103 times). I can’t discern any logic here and I suspect that the choice was largely left to the individual preferences of the module coder. Also, like echo, these include statements are language constructs rather than real functions, so the use of parentheses for the argument is optional, but of these 314 uses only one (line 1584 of includes/session.php) omits the parentheses.
I used the filter at Listing 2 to investigate any consistency in the layout of the include parameter within individual uses. The significant majority (364 inclusions out of 388 across all php files) follow one of three variants, for example include($phpbb_root_path . 'includes/some_module.' . $phpEx), that a simple pattern replacement could process automatically. However, 20 cases do not, and removing the root path and extension would involve code changes to the including module, each of which would require testing.
So whilst some more comprehensive code tidy-up to improve code maintainability would be worthwhile as part of a 3.1 or 3.2 refactoring, this just isn’t worth the extra regression work for a proof of performance pilot, as it doesn’t materially impact the principle of what I am trying to demonstrate here or the run times. A good example here is includes/acp/acp_language.php, which implements the ACP functions for language packs, if you want to drill into these yourself.
My pilot implementation of code globbing
The key to my proposed dynamic load strategy is to replace all include occurrences with a call to a new function phpbb_load_module(), which is itself declared in the common.php module. It can take up to three arguments:
- the module name to be loaded as a standard argument
- an (optional) boolean to allow execution to continue if the module is missing or compilation fails
- an (optional) boolean to suppress the “_once” behaviour.
phpbb_load_module does the following:
- It will only load a module once, and it does this by maintaining an array of the return status of the individual includes, which are assumed to be idempotent. If the array key corresponding to the module name requested does not exist then the module is loaded and the array value is set to the return-value of the included module. In all cases the routine returns the array value, unless the module doesn’t exist or gives a compile error in which case this is reported and treated as fatal in all cases.
- If a global prime-cache flag is set, then it will also append a copy of this module to a globbed cachefile in the cache directory. If the installation includes the Tokenizer extension, then this copy is first compressed to remove whitespace and comments. Return statements are transformed to become assignments to the load status array.
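The whitespace and comment compression step can be sketched with the Tokenizer extension as follows. This is an assumption about how such a step might look, not the pilot's actual code, and demo_compress_source is a hypothetical name. It keeps one newline wherever the original whitespace contained one, so that heredoc terminators stay on their own line. PHP's built-in php_strip_whitespace() does a similar job for a file on disk.

```php
<?php
// Hypothetical helper: drop comments and collapse whitespace in PHP source.
function demo_compress_source($source)
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            $out .= $token;              // single-character token such as ';' or '{'
            continue;
        }
        list($id, $text) = $token;
        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            $out .= ' ';                 // a space keeps neighbouring tokens apart
        } elseif ($id === T_WHITESPACE) {
            // Collapse each run of whitespace to one character
            $out .= (strpos($text, "\n") === false) ? ' ' : "\n";
        } else {
            $out .= $text;               // everything else is copied unchanged
        }
    }
    return $out;
}

$src    = "<?php\n// comment\n\$a = 1;   /* block */\n\$b = \$a + 1;\nreturn \$b;\n";
$packed = demo_compress_source($src);
```

The sketch does not attempt the second transformation mentioned above, rewriting return statements into assignments to the load status array, which needs knowledge of the loader's internals.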
On initialisation in common.php, if the globbed cache-file exists then it is included, and doing so will both compile in all included code within it and initialise the corresponding return status array entries in a single file access, so that any subsequent phpbb_load_module calls for the same module will bypass the inclusion and return the status from the array. If it doesn’t exist then the global prime-cache flag is set to force the phpbb_load_module calls to build the cache-file. A hook into the exit handler rounds this process off.
Hence for example, on the first invocation of viewforum, say, the code follows the normal code path, including the required modules and building up the glob; on subsequent requests for viewforum, the code glob will be loaded and the following phpbb_load_module calls for pre-loaded modules will return the required value without doing any further file I/O. Now this doesn’t preclude a code path which requires further modules and so generates more include file accesses, but at least the core set are loaded in one go.
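To make the load-once behaviour concrete, here is a minimal sketch of a memoising loader. It is a hypothetical stand-in (demo_load_module, with assumed parameter names) and deliberately leaves out the prime-cache and glob-building logic; it is not the pilot's phpbb_load_module:

```php
<?php
// Hypothetical stand-in for the loader described above; not the pilot's code.
function demo_load_module($file, $allow_missing = false, $reload = false)
{
    static $status = array();    // module path => value returned by its include

    if (!$reload && array_key_exists($file, $status)) {
        return $status[$file];   // already loaded: nothing is included again
    }
    if (!is_readable($file)) {
        if ($allow_missing) {
            return false;        // caller asked to carry on regardless
        }
        throw new RuntimeException("Cannot load module $file");
    }
    // Plain variables set by the module land in this function's scope, so
    // only its global definitions and its return value reach the caller.
    $status[$file] = include $file;
    return $status[$file];
}
```

Because the status array is keyed by module name, a globbed cache file could pre-populate it with each module's return value, after which calls like the above would return without touching the file system.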
In order to keep the change impact as small as possible:
- I do a 1-1 swap-out of all includes for a phpbb_load_module call. However I decided to exclude one module, includes/template, from this swap-out for this demonstration phase. This is because the mechanisms for including templates within the template engine are convoluted, and integrating them into this preload concept would require some major rework of the templating engine itself. A job for another day maybe.
- To avoid having to rework all of the includes which generate assignments to the arrays $lang / $help / $words / $synonyms, I wrap the actual include statement in a private function which checks for the creation of any of these arrays and treats this as equivalent to a return value. Yes, this is a bit of a botch but it saves having to change and retest a lot of include files.
- I have added the phpbb_load_module definition and initialisation to common.php. This means that I have to do one more module load than the absolute minimum, but this removes the need to make material mods to the index.php and the other request entry points.
- I still have a small number of other changes to make, e.g. to the routines which include language and help files, since these includes now appear to return the values rather than set $lang, etc. An example of this rework: each function calls $user->setup() with a list of the language files needed; this in turn needs to include these files for the language selected, and each ultimately includes a language file whose tables are merged into $lang as a side-effect. My generator moves this into a return value, so I can get rid of these side-effects, and the include fragment now looks like:
$include_result = phpbb_load_module($language_filename);
..
$this->lang = array_merge( $this->lang, $include_result['_RESULT_LANG'] );
- The only other change is a hook into the ACP Purge Cache function which deletes all glob modules so that these can be refreshed after configuration change.
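The private wrapper mentioned in the list above, which treats the creation of $lang, $help, $words or $synonyms as equivalent to a return value, might look something like the sketch below. The function name demo_include_capture and the '_RESULT' key are assumptions; only the '_RESULT_LANG' key appears in the fragment above.

```php
<?php
// Hypothetical wrapper: runs the include in its own function scope and
// reports any of the well-known arrays that the file created.
function demo_include_capture($file)
{
    $result = array('_RESULT' => (include $file));
    foreach (array('lang', 'help', 'words', 'synonyms') as $name) {
        if (isset($$name) && is_array($$name)) {
            $result['_RESULT_' . strtoupper($name)] = $$name;
        }
    }
    return $result;
}
```

A weakness of this approach is that the wrapper's own variables ($file, $result and $name) share a scope with the included file, so a language file that happened to assign to one of those names would break it.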
So how does this version perform?
I reran a system trace of this new version a few times and got very consistent results. The output of one is shown at Table 1 (which you might compare with the original Table 1 in my earlier article). Instead of loading 18 application script files and 14 data cache files, I am now loading 6 script files, 1 globbed script file and one data cache file. The runtime on this fully cached case is still roughly 10% shorter, despite the additional processing overhead of reading the two compressed files. By hoisting the load of phpbb_load_module.php into each top-level request script, I could also cache common.php. In principle, I could also include the templates in the globbed script file, but the templating engine is a bit of a mess and could do with a rewrite. This would not only shorten the generated templates but would also allow me to include them in the cache. A job for another day.
This 10% figure is with files fully cached. The real saving is on that first request when they aren’t, and this new version will typically save perhaps 25-50 physical I/Os in this case. Given that most LAMP servers end up disk I/O bound, this avoids significant aggregated I/O queuing delays as well as reducing the load on the server. I am collecting some real performance stats on a test instance on my Webfusion service, and this fix combined with the data cache discussed in my previous article approximately halves the overall response time on first request. The set of mods is basically working fine for the main path code, but I am triggering some of these “side-effect” artefacts, which cause some gremlins that I still need to shake down before I could regard this as anything more than alpha code suitable for proof of principle.
I have one last performance area that I want to look at: on my LAMP platform, as is the case on many such offerings, the MySQL engine is on a different server. This in turn means that the SQL calls are effectively RPCs (remote procedure calls) across the internal network infrastructure in the data centre. I just want to switch on SQL logging on my test rig to see if it would be worthwhile making any optimisations in this area, but this analysis is the subject of a different article.
Table 1 – Summary of Strace on viewforum.php&f=2
Time (ms)  System Call
0          execve("/usr/bin/php", ["php", "-r", "$_GET[\"f\"]=\"2\"; include( \"viewfo"...], [/* 17 vars */]) = 0
...
94         open("/var/www/forum/viewforum.php", O_RDONLY) = 3
97         open("/var/www/forum/common.php", O_RDONLY) = 3
99         open("/var/www/forum/includes/phpbb_load_module.php", O_RDONLY) = 3
101        open("/var/www/forum/cache/viewforum.php.gz", O_RDONLY) = 3
144        open("/var/www/forum/cache/cache_data.gz", O_RDONLY) = 3
150        open("/usr/share/mysql/charsets/Index.xml", O_RDONLY) = 4
162        open("/var/www/forum/cache/tpl_prosilver_message_body.html.php", O_RDONLY) = 4
163        open("/var/www/forum/cache/tpl_prosilver_overall_header.html.php", O_RDONLY) = 4
165        open("/var/www/forum/cache/tpl_prosilver_overall_footer.html.php", O_RDONLY) = 4
180        exit_group(0) = ?
Listing 1 – Example of using the PHP token parser to validate coding
<?php
/*
 * Usage:
 *   cd /var/www/forum/includes; rm /tmp/yy.log;
 *   for f in $(find * -name \*.php); do sudo php ~/work/forum/checkTopLevelAssigns.php $f >> /tmp/yy.log; done;
 *   vi /tmp/yy.log
 */
$src    = file_get_contents( $argv[1] );
$tokens = token_get_all( $src );
$count  = count( $tokens );
$output = "";
define( 'T_DEFINE', 999 );   // pseudo token ID used to count define() calls

// Counters, echoed in this order on the summary line below
$modType = array( T_REQUIRE_ONCE=>0, T_REQUIRE=>0, T_INCLUDE_ONCE=>0, T_INCLUDE=>0,
                  T_FUNCTION=>0, T_RETURN=>0, T_CLASS=>0, T_DEFINE=>0 );

// Indexed loop rather than each(), which was removed in PHP 8
for( $i = 0; $i < $count; $i++ ) {
    $token = $tokens[$i];
    if( is_string( $token ) ) {          // simple 1-character token
        $output .= $token;
        continue;
    }
    list( $id, $text ) = $token;         // token array
    switch( $id ) {
        case T_FUNCTION:
        case T_CLASS:
            $modType[$id]++;
            // Skip to the opening brace of the body ...
            while( ++$i < $count && $tokens[$i] !== '{' ) {}
            // ... then on to its matching close. "{$" and "${" inside strings are
            // array tokens that are closed by a plain '}', so count them as opens.
            $nBrace = 1;
            while( $nBrace > 0 && ++$i < $count ) {
                $t = $tokens[$i];
                if( $t === '{' || ( is_array( $t ) &&
                    ( $t[0] == T_CURLY_OPEN || $t[0] == T_DOLLAR_OPEN_CURLY_BRACES ) ) ) {
                    $nBrace++;
                } elseif( $t === '}' ) {
                    $nBrace--;
                }
            }
            break;
        case T_STRING:
            // Is this "define" followed (after optional whitespace) by '('?
            $j = $i + 1;
            if( isset( $tokens[$j] ) && is_array( $tokens[$j] ) && $tokens[$j][0] == T_WHITESPACE ) $j++;
            if( $text == 'define' && isset( $tokens[$j] ) && $tokens[$j] === '(' ) {
                $id = T_DEFINE;          // *** and fall through
            } else {
                $output .= $text;        // any other strings -> output "as is"
                break;
            }
        case T_INCLUDE:
        case T_INCLUDE_ONCE:
        case T_REQUIRE:
        case T_REQUIRE_ONCE:
        case T_DEFINE:
        case T_RETURN:
            $modType[$id]++;
            while( ++$i < $count && $tokens[$i] !== ';' ) {}   // skip the whole statement
            break;
        case T_COMMENT:
        case T_DOC_COMMENT:
        case T_OPEN_TAG:
        case T_CLOSE_TAG:
            break;
        case T_WHITESPACE:
            $output .= ( strpos( $text, "\n" ) === false ) ? ' ' : "\n";
            break;
        default:
            $output .= $text;            // anything else -> output "as is"
            break;
    }
}
echo "** $argv[1]\t" . implode( "\t", $modType ) . "\n";
// The last pattern is single-quoted so that \$ reaches the regexp engine as a literal dollar
$from = array( "/\s*\n[\s\n]*/s", "/\s{2,}/s", "/^\s+/m", "/global\s.*/m",
               "/if \(!defined\('IN_PHPBB'\)\)\n\{\nexit;\n\}\n/s", '/^\$GLOBALS\[.*/m' );
$to   = array( "\n", " ", "", "", "", "" );
echo preg_replace( $from, $to, $output ), "\n";
?>
Listing 2 – Using the PHP token parser to analyse include patterns
<?php
$file   = $argv[1];
$src    = file_get_contents( $file );
$lines  = explode( "\n", $src );
$tokens = token_get_all( $src );
$count  = count( $tokens );

// Indexed loop rather than each(), which was removed in PHP 8
for( $i = 0; $i < $count; $i++ ) {
    $token = $tokens[$i];
    if( is_string( $token ) ) continue;
    list( $id, $text, $lineNo ) = $token;
    switch( $id ) {
        case T_INCLUDE:
        case T_INCLUDE_ONCE:
        case T_REQUIRE:
        case T_REQUIRE_ONCE:
            // Collect the argument up to the terminating ';', dropping whitespace
            $arg = "";
            while( ++$i < $count && $tokens[$i] !== ';' ) {
                $t = $tokens[$i];
                if( is_array( $t ) && $t[0] == T_WHITESPACE ) continue;
                $arg .= is_array( $t ) ? $t[1] : $t;
            }
            // Strip the optional enclosing parentheses
            if( $arg !== "" && $arg[0] == '(' ) $arg = substr( $arg, 1, -1 );
            // The three standard layouts of the include argument
            $std1 = preg_match( '/^ \\$ phpbb_root_path \. .*? \. \\$ phpEx $ /x', $arg );
            $std2 = preg_match( '/^ "\{ \\$ phpbb_root_path \} .*? \. \\$ phpEx "$ /x', $arg );
            $std3 = preg_match( '/^ " \\$ phpbb_root_path .*? \. \\$ phpEx "$ /x', $arg );
            $std  = $std1 + $std2 + $std3;
            // Space-compress source line so that it can be loaded into a TSV file
            $srcLine = preg_replace( array( '/^\s+/m', '/\s+/' ), array( '', ' ' ), $lines[$lineNo-1] );
            // Output as TSV file for loading into a spreadsheet
            echo "$file\t$lineNo\t$std\t$std1\t$std2\t$std3\t$text\t$arg\t$srcLine\n";
            break;
    }
}
?>