I discussed the common types of ISP hosting services in a previous article, Use cases for phpBB. I use one of these hosting services myself, a shared web service provided by the ISP Webfusion, to host my ellisons.org.uk domains. I now want to describe some details of the configuration of this service (as Webfusion doesn’t provide this information on its website) and how I can use this knowledge to tailor and optimise my applications to run on it. This information might help others who host domains using this service. I have split the following off into related articles:
- Performance in a Webfusion shared service and guidelines for optimising PHP applications
- Using .htaccess files on a Webfusion shared service
The Fusion Professional (Linux) offering
Webfusion provides an overview of this service on its website, and some FAQs that might be best described as superficial. Nonetheless, an experienced developer / admin can examine the configuration through normal user access and standard unprivileged system commands (and without any “hacking” or breach of security policies).
This shared service infrastructure runs on a farm of dual-processor quad-core servers (totalling 8 Intel E5410 Xeon cores per server, each typically with 8 GB RAM). These servers each run the standard LAMP stack that you might expect to see in this sort of service offering. Any hosted domain is served by a Webfusion DNS nameserver (in my case ns.hosteurope.com) that resolves any *.ellisons.org.uk name-service requests to a fixed registered IP address (18.104.22.168) which is the public side of a firewall into the Webfusion datacentre. This seems to be a standard set-up where a load-balancing firewall passes any web requests for this IP address to a Linux server farm located in a 10.* private address space within the datacentre. This arrangement helps with both scaling and security. The firewall also maps any Webfusion web connections coming into the public 22.214.171.124 onto HTTP port 8002 (123-reg to 8003, etc.) on the 10…. IP addresses that the server farm listens on.
The Apache configuration
The servers share a common Apache configuration, which includes a complex set of rewrite rules to map the individual HTTP and HTTPS requests onto the correct virtual host. Skipping some of the gritty details:
- The configuration includes a virtual host definition for each ISP brand that it hosts (Webfusion being one of them), so that the VirtualHost directive can be used to define Webfusion-specific rewrite rules etc.
- The configuration defines two script-based rewrite maps, wfvhost and wfgetbase. The first does an intelligent map of the site name (that Apache determines), matching its right-hand elements against the list of domains hosted by the service to return the domain name (so in my case *.ellisons.org.uk returns ellisons.org.uk). The second converts a domain name into the corresponding base directory for the service. For all Fusion Professional (Linux) services this is a subdirectory of /websites/LinuxPackage02, under a directory hierarchy based on the first 6 letters of the domain name, which is /websites/LinuxPackage02/el/li/so/ellisons.org.uk in my case. Using a directory hierarchy like el/li/so is a standard Linux technique to improve filesystem performance in directory scanning and to facilitate the load-balancing of services across multiple servers (which is important when the ISP hosts thousands of shared services, as is the case here).
- A rewrite rule to map the Shared SSL certificate address onto the correct virtual directory. (In my case https://fusion.webfusion-secure.co.uk/~ellisons.org.uk/ onto my base directory).
- A rewrite rule to set the environment variable VHOST to the domain name
- A rewrite rule to set the environment variable DOCUMENT_ROOT_REAL to the document root (the public_html subdirectory of the base directory) which is then inserted into the real script path. Note that this is really a temporary variable as mod_rewrite then sets DOCUMENT_ROOT to this same directory.
- As a result, if you check the phpinfo for my site, you will see that various server / environment variables are available for use in .htaccess files and within scripts, including the following: DOCUMENT_ROOT, GATEWAY_INTERFACE, HTTP_ACCEPT, HTTP_ACCEPT_CHARSET, HTTP_ACCEPT_ENCODING, HTTP_ACCEPT_LANGUAGE, HTTP_COOKIE, HTTP_HOST, HTTP_KEEP_ALIVE, HTTP_USER_AGENT, PWD, QUERY_STRING, REMOTE_ADDR, REQUEST_METHOD, REQUEST_URI, SCRIPT_FILENAME, SCRIPT_NAME, SCRIPT_URI, SCRIPT_URL, SERVER_NAME, VHOST. (The server variables are always populated from the environment variables for any CGI interface, so these two lists are the same.)
- The configuration also contains rules to enable FollowSymLinks and to deny web access to any .htaccess files. It also enables a rich range of languages and sets up various BrowserMatch filters to handle the quirks of non-W3C-compliant browsers (such as older versions of MSIE).
- suPHP is used as the CGI execution framework. Whilst this is faster than using bare CGI, it can still be up to an order of magnitude slower than FastCGI or PHP invoked directly in-process from the Apache mod_php module. (The phpinfo() output confusingly references FastCGI, but ignore this because suPHP also uses the same API to invoke scripting languages such as PHP.) suPHP is used to ensure that any interpreters, such as PHP, execute any script request within the user’s UID, and this means that the operating system can enforce access separation for the different users and applications hosted by the ISP (so that no other ordinary application or hostile co-hosted user can access my files on the server). This separation incurs a performance hit because suPHP uses a new process to run each web request, and this carries a significant overhead of process creation and termination.
- The following Apache modules are loaded and can be additionally configured through individual .htaccess files. For further details see the Apache documentation:
- Authorization. These can be used to enable directory level access control: auth_basic, auth_digest, authn_anon, authn_dbm, authn_file, authz_host, authz_groupfile, authz_user, authz_dbm.
- Environment setting. These allow you to tailor your scripts’ environment variables on a per-directory basis: env, setenvif.
- Directory Listing. This generates directory indexes, automatically, similar to the Unix ls command or the Win32 dir shell command: autoindex.
- Header-control. Modify HTTP headers according to directory and file-type-specified criteria: expires (generation of Expires and Cache-Control); headers (customization of other request and response headers).
- Apache-parsed html. Also known as Server Side Includes: include.
- Rewriting URIs. A per-directory rule-based rewriting engine to rewrite requested URLs on the fly: rewrite, dir.
- Content compression. Apache compresses content before it is delivered to the client: deflate.
- Miscellaneous. mime, negotiation, actions, unique_id.
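The behaviour of the two rewrite maps described above (wfvhost and wfgetbase) can be sketched in Python. This is only an illustration of what the maps appear to do, not Webfusion's actual scripts: the function names `vhost_for` and `base_dir_for` and the sample `HOSTED_DOMAINS` set are hypothetical, and the rule of taking three two-letter chunks from the first six characters of the domain name is inferred from the single el/li/so example.

```python
# Hypothetical stand-ins for the wfvhost and wfgetbase rewrite maps.
HOSTED_DOMAINS = {"ellisons.org.uk", "example.com"}  # assumed sample list
PACKAGE_ROOT = "/websites/LinuxPackage02"


def vhost_for(site_name, hosted=HOSTED_DOMAINS):
    """Return the hosted domain matching the right-hand labels of site_name."""
    labels = site_name.lower().rstrip(".").split(".")
    # Try the longest suffix first: a.b.c.d, then b.c.d, then c.d, ...
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in hosted:
            return candidate
    return None  # the site name does not belong to any hosted domain


def base_dir_for(domain, root=PACKAGE_ROOT):
    """Build the base directory from three two-letter chunks of the domain.

    How the real map treats short names or a dot within the first six
    characters is unknown; this sketch simply slices the string.
    """
    chunks = [domain[i:i + 2] for i in (0, 2, 4)]
    return "/".join([root, *chunks, domain])


domain = vhost_for("www.ellisons.org.uk")
print(domain)
print(base_dir_for(domain))
```

With the sample data above, the two calls print the domain name ellisons.org.uk and the base directory quoted earlier for that domain.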
The PHP configuration
- Another consequence of using suPHP is that individual users are unable to change the php.ini configuration on a per-directory basis through .htaccess files, as any attempt to include a php_flag in this file will result in a runtime error. However, users can specify their own php.ini file in the directory that the script executes in. This will be either their own public_html directory or one of its subdirectories. I suggest that you do so in most circumstances (though you can use the Webfusion defaults for development). I use a common php.ini in my public_html directory, and symlink to this from any subdirectories where I execute scripts. You will need to enable log_errors and point error_log at a log file in your own file hierarchy, as you don’t have access to the Apache error log. These are the settings in my php.ini:
    magic_quotes_gpc = Off
    register_argc_argv = Off
    register_long_arrays = Off
    session.gc_divisor = 1000
    session.hash_bits_per_character = 5
    session.gc_maxlifetime = 7200
    short_open_tag = Off
    track_errors = Off
    url_rewriter.tags = Off
    output_buffering = On
    variables_order = "GPCS"
    log_errors = On
    error_log = "/websites/LinuxPackage02/el/li/so/ellisons.org.uk/public_html/_private/PHPerror.log"
    upload_max_filesize = 8M
    date.timezone = "UTC"
- You are free to change nearly all the PHP directives, but since PHP loads the ini files in /etc/php.d after it has loaded your php.ini, the following items will override any corresponding entries in your directory-specific php.ini file:
    memory_limit = 64M
    error_reporting = E_ALL & ~E_NOTICE
    display_errors = On
    register_argc_argv = On
    default_socket_timeout = 15
    session.save_path = "/tmp"
    mysql.allow_persistent = Off
- However, excepting register_argc_argv and mysql.allow_persistent (which you shouldn’t want to reset anyway), these directives can be overridden within a PHP application using the ini_set call. Remember, though, that the PHP interpreter does a basic syntax scan on loading the script and before these lines are executed, so compile errors on the initial script load could still be displayed. Here is the standard preamble of calls that I include in my scripts:
    ini_set( 'error_reporting', E_ALL | E_STRICT );
    ini_set( 'display_errors', False );
- As I said, the default Webfusion PHP configuration values are suitable for development configurations, but not for production use. (You can review these through the phpinfo query.) If you want to configure your service with a more typical production configuration, then you might want to do as I do: enable output buffering, and configure error reporting so that script errors are not displayed to the user. Also remember that you will need a strategy for collecting PHP runtime errors, as you do not have access to the default (Apache) error logs.
- One setting of note in the default php.ini file is magic_quotes_gpc = On. The use of magic quotes is now deprecated and highly discouraged, as it encourages sloppy coding. It will be removed in the next version of PHP. If your fields include quotes, then the escaping they need is context sensitive and is best done in the appropriate context (e.g. using the function mysqli_real_escape_string in the case of MySQL database insertions).
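The order of precedence described above (the directory-specific php.ini first, then the files in /etc/php.d, then any ini_set calls at run time) can be modelled with a short Python sketch. This is a simplified illustration rather than how PHP implements it internally: the function `effective_settings` and the dictionary names are hypothetical, and only a few of the directives listed earlier are included.

```python
# Simplified model of the precedence described in the text; names are hypothetical.
USER_PHP_INI = {                 # from the directory-specific php.ini
    "register_argc_argv": "Off",
    "output_buffering": "On",
    "log_errors": "On",
}
ETC_PHP_D = {                    # loaded afterwards, so these values win
    "memory_limit": "64M",
    "display_errors": "On",
    "register_argc_argv": "On",
    "mysql.allow_persistent": "Off",
}
# Directives that the text says cannot be reset from within a script.
NOT_RUNTIME_SETTABLE = {"register_argc_argv", "mysql.allow_persistent"}
# Values a script tries to apply through ini_set.
RUNTIME_CALLS = {"display_errors": "Off", "register_argc_argv": "Off"}


def effective_settings(user_ini, system_ini, runtime_calls, locked):
    """Merge the three layers in load order and return the final values."""
    settings = dict(user_ini)
    settings.update(system_ini)      # later ini files override earlier ones
    for name, value in runtime_calls.items():
        if name not in locked:       # ini_set cannot change these directives
            settings[name] = value
    return settings


result = effective_settings(
    USER_PHP_INI, ETC_PHP_D, RUNTIME_CALLS, NOT_RUNTIME_SETTABLE
)
for name in sorted(result):
    print(f"{name} = {result[name]}")
```

In this model display_errors ends up Off because the run-time call is honoured, whereas register_argc_argv stays On: the /etc/php.d value overrides the user's php.ini and cannot be changed from the script.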