The trigger for this article was a conversation with Mark Fisher, whom I mentored through his early career and who has remained a close friend over the years, though he has since gone on to set up his own IT company, Jemmac. One of his leisure interests is running a forum, thezone:mk, with the byline “all things football in Milton Keynes”; it uses the phpBB forum engine and runs on a LAMP-based ISP shared service (Fused Network). I had a look at it in passing and felt that the installation could do with some tuning to improve general responsiveness. I finished an earlier article, Performance in a Webfusion shared service and guidelines for optimising PHP applications, with the comment “I also have a phpBB application running on my shared service. I will also document how I have tweaked this to run more efficiently on the shared service. So watch this space.” So I feel that it is now time to use this site and my own private phpBB instance as vehicles to see whether I can achieve any material performance gains by tuning a leading open-source engine.
I used the Google Chrome developer tools to instrument the response of three separate applications: my blog, my phpBB forum and thezone:mk. All three run on suPHP-based LAMP shared services, and their overall request rates are quite low, so the file data relating to these applications is rapidly flushed from the web server's filesystem cache. This means that a visitor will get very different response times for the first and subsequent page views, since a subsequent view made in close succession to the first will still find the content in the server’s filesystem cache. (Another tool which I use here is the Google Page Speed plugin for Firefox.)
I ran a quick network response test twice for each application: once having cleared the local browser cache and after some time since the last access to the site, and once 30 seconds later. The aim was to load the page once with an empty cache and once with a full file cache. I did these timings early in the morning when the server loads are quite light, and in summary the timings that I got were:
| Application | Onload Time | DOM Content Loaded Time | # Requests | Total Transfer Size (Kbytes) |
|---|---|---|---|---|
| Blog Home Page #1 | 0.84s | 0.57s | 7 | 14 |
| Blog Home Page #2 | 0.64s | 0.38s | 7 | 5 |
| My forum #1 | 4.08s | 3.25s | 26 | 41 |
| My forum #2 | 1.09s | 1.11s | 26(2) | 11 |
| TheZone:MK #1 | 2.20s | 1.15s | 24 | 99 |
| TheZone:MK #2 | 2.36s | 1.03s | 24 | 78 |
Unfortunately the Googlebot had been indexing TheZone:MK, so its file caches were probably primed. My phpBB installation is pretty much out of the box, as is Mark’s. I haven’t done any tuning other than adding .htaccess lines to optimise caching, but you can see the immediate benefit of doing this: avoid transferring quasi-static data from the server to the client whenever possible, and always specify compression for text-type content such as HTML, CSS and XML streams.
- I use the ExpiresByType directives to ensure that the browser uses local caching when appropriate, and in this case 24 out of the 26 requests are served from the browser cache. The default results in conditional requests with a 304 response from the server instead. This saves on network load and latency, but the server still needs to process each request to determine that no transfer is needed.
- I also force compression of the PHP generated CSS stylesheet using an AddOutputFilter directive. This drops the size of the transferred stylesheet from some 69 Kbytes to just over 13 Kbytes.
- My server’s VirtualHost includes KeepAlive On and KeepAliveTimeout 5 directives, which means that successive requests to the server can share a single TCP/IP connection. This isn’t a parameter that I can tune, but it’s one that helps with general performance. I suspect that on Mark’s server KeepAlive is set to Off, because all transfers have Connection: close responses, so a separate TCP/IP connection is needed for each request.
So these are all factors which impact on server load and perceived user responsiveness. I know enough about Apache tuning to get these right. However, an average forum administrator will not, and unfortunately the phpBB documentation gives little user-friendly help here. So fixing these will help.
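The first two points can be sketched as an .htaccess fragment along the following lines. This is a minimal illustration rather than the exact lines used on either forum: the MIME types, the cache lifetimes and the list of file extensions are assumptions, and it only works if the host has mod_expires and mod_deflate loaded and allows these directives to be overridden in .htaccess files.

```apache
# Illustrative sketch only: lifetimes, MIME types and extensions are assumed values.

<IfModule mod_expires.c>
    # Let the browser reuse quasi-static content from its local cache
    # instead of issuing a conditional request (and getting a 304) each time.
    ExpiresActive On
    ExpiresByType image/gif  "access plus 1 month"
    ExpiresByType image/png  "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css   "access plus 1 week"
</IfModule>

<IfModule mod_deflate.c>
    # Compress text-type responses, including output generated by PHP
    # scripts such as a PHP-generated stylesheet.
    AddOutputFilter DEFLATE php css html xml
</IfModule>
```

KeepAlive is deliberately absent from this sketch: Apache only accepts it in the server or virtual host configuration, so it stays in the hands of the hosting provider.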
Even so, just why is the responsiveness of my forum a factor of 5 down on first page-load compared to my blog, and a factor of 2 for a repeated request? This is the nub of my 3x challenge. The key issue here is that the two use-case scenarios, running phpBB on a dedicated server (or VM) and running it on a shared service, give rise to different non-functional requirements which should influence the application’s design. In the first case the system admin has root access and full control over both the Apache and PHP runtime configurations; thus techniques such as using PHP opcode caching and persistent PHP processes dominate performance considerations. In the second case the admin has little or no control over the Apache and PHP configurations. In such circumstances, the application developer must be sensitive to the factors which kill performance.
As I have demonstrated in my blog architecture, such sensitivity can significantly enhance application performance on shared services. To put it bluntly, the phpBB development team haven’t shown it (and, to be fair, neither has the WordPress team). So what I want to attempt in my next couple of articles is to show how, with some simple architectural changes, phpBB could be made to perform significantly better, not so much for the top 5% of forums which run on dedicated infrastructure, but for the other 95% of smaller forums which run on shared infrastructure.