If you haven’t noticed yet, this website is powered by WordPress, and its speed is optimized using W3 Total Cache. Previously I was using WP Super Cache, but I recently decided to try W3 Total Cache because its feature set (minify, CDN, etc.) looked better. It did work better for my setup, so I kept it as my caching plugin.
But it took me a while to figure out that its “Cache Preload” option was not working well, or at least not as I expected. “Automatically prime the page cache” was checked, but the pages were not in the cache…
After searching a bit I found the culprit. W3 Total Cache relies on the internal cron of WordPress, which is only triggered by site activity. If your site has no visitors for a while, the next page load will be slower because W3TC has to:
1. Check whether the current page is cached and whether that copy has expired.
2. Clean up the expired page and regenerate a current copy.
3. Finally serve that file.
My first attempt to fix that was to set up a real cron job for wp-cron.php, but that didn’t work well. Cron was running, but W3TC still failed to prime the cache. There are a couple of threads about that on the WordPress support forums.
My next attempt was Optimus Cache Prime, a Python script written by Patrick Mylund Nielsen. Optimus Cache Prime (OCP) is a smart cache preloader for websites with XML sitemaps. It crawls all URLs in a given sitemap so the web server builds cached versions of the pages before visitors or search engine spiders arrive.
That was exactly what I needed… But…
My host had Python 2.4 installed, and ocp.py required 2.5+.
I asked my hosting provider (Inmotion), but they said they cannot change that on shared hosting and that I should move to a virtual private server (VPS) just for that… Yeah, sure!
Finally, I decided to stop being lazy and created my own solution in PHP, which I am sharing here with you.
Idea
This PHP script has the same basic idea as Optimus Cache Prime and uses sitemap.xml as its source.
It reads the sitemap.xml file and parses the local URLs listed in it. Then it checks whether the cache file for each URL exists in W3TC. If the cache exists, it skips that URL. If not, it visits the link using minimal resources, causing W3TC to re-create the cache for that page.
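The mapping from a URL to its cache file can be sketched as a small function. This is only an illustration: the names `cache_path_for_url` and `needs_warming` are hypothetical, and the sketch assumes the W3TC page cache layout from the script's defaults (`wp-content/w3tc/pgcache` with `_index.html` files), which may differ between W3TC versions. Unlike warm.php, which looks at up to three path segments, this sketch uses every path segment of the URL.

```php
<?php
// Hypothetical helper: maps a sitemap URL to the page cache file that
// W3TC is assumed to write for it (layout taken from the script defaults).
function cache_path_for_url($url, $root = 'wp-content/w3tc/pgcache', $index = '_index.html')
{
    // Take only the path part of the URL, e.g. "/2011/08/my-post/".
    $path = parse_url($url, PHP_URL_PATH);
    $segments = array();
    if (is_string($path)) {
        foreach (explode('/', $path) as $segment) {
            if ($segment !== '') {
                $segments[] = urldecode($segment);
            }
        }
    }
    // Join the cache root, the path segments and the index file name.
    return implode('/', array_merge(array($root), $segments, array($index)));
}

// Hypothetical helper: a page needs warming only when its cache file is missing.
function needs_warming($url)
{
    return !file_exists(cache_path_for_url($url));
}

// Example with a made-up URL.
echo cache_path_for_url('http://www.example.com/2011/08/my-post/'), "\n";
```

Because the check is a plain `file_exists()` call on the local disk, pages that are already cached cost no HTTP request at all.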
System Requirements
- PHP 5 (required due to SimpleXML; you may change that to use it with earlier versions of PHP)
- W3 Total Cache
- WP Super Cache (Not tested! But it should work)
Usage
- Copy the code into a PHP file (warm.php) and place it in the site root where sitemap.xml exists.
- Review/edit the configuration options inside the script.
- Set a cron job to run that script every 5 minutes (or even every minute, as the code is very easy on system resources).
- Sample: */5 * * * * php -q /home/youraccount/public_html/warm.php
- That’s it!
Features
- Reads sitemap.xml as a file, saving a web server call.
- Checks for the local cache file before trying to re-cache, saving resources.
- Optionally uses priority tags in sitemap.xml
- Configurable page limit per session, useful for larger sites.
- Frees memory and stops executing as soon as possible to save further resources.
- Failsafe to stop executing in case of a URL or network problem.
- New: Option to fix trailing slash cache creation problem.
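The priority ordering and the per-run page limit from the list above can be sketched roughly as follows. The function names `urls_by_priority` and `pick_uncached`, the example URLs and the list of "already cached" URLs are all hypothetical. The sketch also assumes a default priority of 0.5 for entries without a priority tag, as the sitemap protocol defines; warm.php itself does not apply that default.

```php
<?php
// Hypothetical helper: returns the sitemap URLs ordered by <priority>, highest first.
function urls_by_priority($sitemapXml)
{
    $xml = simplexml_load_string($sitemapXml);
    if ($xml === false) {
        return array();
    }
    $urls = array();
    $priorities = array();
    foreach ($xml->url as $entry) {
        $urls[] = (string) $entry->loc;
        // Assumption: fall back to the protocol default of 0.5.
        $priorities[] = isset($entry->priority) ? (float) $entry->priority : 0.5;
    }
    // arsort() keeps the keys, so they still index into $urls.
    arsort($priorities, SORT_NUMERIC);
    $ordered = array();
    foreach ($priorities as $key => $value) {
        $ordered[] = $urls[$key];
    }
    return $ordered;
}

// Hypothetical helper: picks at most $limit URLs that are not cached yet.
// $cachedUrls stands in for the file_exists() check of the real script.
function pick_uncached(array $orderedUrls, array $cachedUrls, $limit)
{
    $picked = array();
    foreach ($orderedUrls as $url) {
        if (in_array($url, $cachedUrls, true)) {
            continue; // already cached, does not count against the limit
        }
        $picked[] = $url;
        if (count($picked) >= $limit) {
            break;
        }
    }
    return $picked;
}

$sitemap = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/about/</loc><priority>0.3</priority></url>
  <url><loc>http://www.example.com/</loc><priority>1.0</priority></url>
  <url><loc>http://www.example.com/blog/</loc><priority>0.8</priority></url>
</urlset>
XML;

$ordered = urls_by_priority($sitemap);
echo implode("\n", pick_uncached($ordered, array('http://www.example.com/'), 2)), "\n";
```

As in warm.php, skipped pages do not count against the limit, so every run spends its budget on pages that are actually missing from the cache.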
Code & Download
- Current version: Version 2.1 – 21 August 2011
- Download warm.php as ZIP file.
License
This code is free to use, distribute, modify and study. If you modify it, please keep my copyright intact. When referencing it, please link back to this website or post in some way, e.g. a direct link or credits. If you find this useful, please leave a comment and share using the buttons below!
    <?php
    // W3 TOTAL CACHE WARMER (PRELOADER) by Pixel Envision (E.Gonenc)
    // Version 2.1 - 21 August 2011

    // Configuration options
    $priority = true;         // Use priorities defined in sitemap.xml (true/false)
    $ppi = 10;                // Pages to be cached per interval
    $delay = 0.5;             // Delay in seconds between page checks, default is half a second
    $quiet = true;            // Do not output process log (true/false)
    $trailing_slash = false;  // Add trailing slash to URLs, that might fix cache creation problems (true/false)
    $sitemap = "sitemap.xml"; // Path to sitemap file relative to the warm.php

    // Defaults for W3TC
    $index = "_index.html";             // Cache file to check
    $rootp = "wp-content/w3tc/pgcache"; // Root of cache

    // Do not change anything below this line unless you know what you are doing
    ignore_user_abort(true);
    set_time_limit(600);

    // Resolve the relative paths above against the directory of warm.php,
    // even when cron starts the script from another working directory
    chdir(dirname(__FILE__));

    $xml = simplexml_load_file($sitemap);
    if ($xml === false) {
        echo "Unable to read $sitemap, exiting...\n";
        exit(1);
    }

    // Collect URLs and priorities as plain values
    $UL = $UP = array();
    foreach ($xml->url as $url_list) {
        $UL[] = (string) $url_list->loc;
        $UP[] = (float) $url_list->priority;
    }
    unset($xml);

    // Highest priority first; arsort() keeps the keys, so they still match $UL
    if ($priority == true) {
        arsort($UP, SORT_NUMERIC);
    }

    $i = 0;
    foreach ($UP as $key => $val) {
        // Build the cache path from up to three URL path segments
        $path = $rootp;
        $url = $UL[$key];
        $sub = explode("/", $url);
        if (!empty($sub[3])) { $path .= "/" . urldecode($sub[3]); }
        if (!empty($sub[4])) { $path .= "/" . urldecode($sub[4]); }
        if (!empty($sub[5])) { $path .= "/" . urldecode($sub[5]); }
        $path .= "/" . $index;

        if (file_exists($path)) {
            if ($quiet != true) { echo "Priority: " . $val . " => Skipped: " . $path . "\n"; }
        } else {
            if ($trailing_slash == true) { $url = rtrim($url, "/") . "/"; }

            // Request headers only, so the page is generated without downloading the body
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
            curl_setopt($ch, CURLOPT_HEADER, true);
            curl_setopt($ch, CURLOPT_NOBODY, true);
            $ret = curl_exec($ch);
            curl_close($ch);

            // Failsafe: stop on the first URL or network problem
            if ($ret) { $i++; } else { echo "Unable to connect $url, exiting...\n"; break; }

            usleep($delay * 1000000);
            if ($quiet != true) { echo "Priority: " . $val . " => Warmed: " . $path . " by visiting " . $url . "\n"; }
        }

        // Stop once the per-interval page limit is reached
        if ($i < $ppi) { flush(); } else { break; }
    }
    exit;
    ?>