Notes on a highly scalable WordPress Delivery Platform - Over 10K live requests / second, 20K concurrent connections
By Angsuman Chakraborty, Gaea News Network
Saturday, January 31, 2009
As you may be aware if you follow my tweets, we are testing a highly scalable WordPress delivery platform that can serve over 10000 requests per second from a single server and handle over 20000 concurrent connections without failure. Sounds amazing? Then read on…
WordPress, as you may well be aware, is resource-hungry (but well-featured) blogging software. How resource-hungry is WordPress?
WordPress, a Performance Nightmare & Resource Godzilla
We tested WordPress with a copy of the Simple Thoughts (this blog) database on a dual-processor, quad-core 2 GHz Xeon machine (i.e. 8 cores at 2 GHz each) with 4 GB RAM and SATA-2 hard disks in RAID-1 (2 x 7.2K RPM disks, which we count as an effective 14400 RPM for reads, since RAID-1 can read from both disks), a pretty high-end machine if I may say so. All standard MySQL optimizations were done, and PHP had eAccelerator enabled.
WordPress, running on nginx (which is far better than Apache), saturated the server while serving only 50-60 requests per second, and that was without any updates (new posts, comments, pingbacks, trackbacks etc.). At 100 requests per second, the server froze. It had to be cold-rebooted (pressing the power button to switch it off, then on) at our data center to bring it back up.
Note: WordPress executes lots of MySQL queries just to render a single page. Many plugins also execute SQL queries, thereby adding to the load. More often than not, you will find MySQL to be the single biggest bottleneck in WordPress performance.
You can read my Top 5 tips to improve WordPress performance for easy optimizations you can do.
WordPress sites traditionally used the wp-cache 2 plugin, and now use WP Super Cache, to cache pages for faster delivery. WordPress.com, the hosted platform for WordPress blogs run by the company behind WordPress, also uses one of these caching plugins. It works to some extent. We tested it with a maximum of 400 concurrent requests on a single medium-range server for a single URL with Apache Bench. The requests per second were much lower than the figures we report further down.
The bad news is that in real-life scenarios both plugins perform much worse, because of several factors:
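To make the kind of measurement described above concrete, here is a minimal sketch of a concurrent load test in Python. It is not Apache Bench and not our Site Load Tester; the function name run_load and its parameters are made up for illustration, and the request itself is passed in as a callable so the sketch runs without a network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(fetch, total_requests, concurrency):
    """Call fetch() total_requests times using `concurrency` worker threads.

    Returns (completed, failed, requests_per_second).
    `fetch` is any zero-argument callable; an exception counts as a failure.
    """
    def one_call(_):
        try:
            fetch()
            return True
        except Exception:
            return False

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one_call, range(total_requests)))
    elapsed = time.perf_counter() - start

    completed = sum(results)
    failed = total_requests - completed
    rate = (total_requests / elapsed) if elapsed > 0 else float("inf")
    return completed, failed, rate


if __name__ == "__main__":
    # Stand-in for a real HTTP request: sleeps 1 ms per "request".
    def fake_fetch():
        time.sleep(0.001)

    print(run_load(fake_fetch, total_requests=200, concurrency=20))
```

To point it at a real page, pass something like a function that calls urllib.request.urlopen on your own URL and reads the response. A thread-based client like this will saturate long before a fast server does, so treat its numbers as a rough lower bound.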
- Every comment on a page causes the page to be regenerated, thereby increasing the load. Read the solution for increasing WordPress performance on heavily commented blogs.
- The cache management of both these plugins is dumb, to put it mildly. Lots of unnecessary regeneration is done which increases load on the system.
- They include complicated logic for handling plugins which may change page content.
- The cache is served by PHP, which is not a speed demon even on its best days.
The bottom line is that if your blog continues to get popular, neither of these cache plugins will suffice.
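The regeneration cost in the list above can be illustrated with a toy page cache. This is only a sketch of the general idea, not how wp-cache or WP Super Cache are implemented; the class PageCache and its method names are invented for this example. It drops only the page that received the comment, so other cached pages are not rebuilt.

```python
class PageCache:
    """Toy in-memory page cache keyed by URL (illustrative only)."""

    def __init__(self, render):
        self._render = render    # callable: url -> HTML string (the expensive part)
        self._pages = {}         # url -> cached HTML
        self.render_count = 0    # how many times we had to regenerate a page

    def get(self, url):
        # Serve from cache when possible; render and store on a miss.
        if url not in self._pages:
            self._pages[url] = self._render(url)
            self.render_count += 1
        return self._pages[url]

    def invalidate(self, url):
        # Forget one page; the next request for it will regenerate it.
        self._pages.pop(url, None)

    def on_new_comment(self, post_url):
        # A comment changes only the post it belongs to,
        # so only that page is dropped from the cache.
        self.invalidate(post_url)


if __name__ == "__main__":
    cache = PageCache(lambda url: "<html>" + url + "</html>")
    cache.get("/post-1")
    cache.get("/post-2")
    cache.on_new_comment("/post-1")
    cache.get("/post-1")    # regenerated
    cache.get("/post-2")    # still served from cache
    print(cache.render_count)
```

On a heavily commented post, even this narrow invalidation means one regeneration per comment, which is why comment-heavy blogs need more than a page cache.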
But there is a bigger problem.
How to ensure stability of WordPress sites?
Traditionally WordPress is run on a LAMP stack, where the Apache web server and the MySQL database are part of the equation. One of the biggest problems is that at high loads a WordPress site will frequently freeze for extended periods of time and sometimes go completely dead. This is clearly unacceptable.
While there are ways to scale WordPress, how can we ensure stability even at unexpectedly high load?
Approaches to scaling WordPress
First, you should run MySQL on a separate server to distribute the load better. You can even have multiple Apache HTTP servers pointing to the same MySQL instance. The load is then distributed by a frontend proxy such as haproxy, pound or nginx.
BTW: We are always talking about dedicated servers here, guys. If you are on a shared server, this article is not for you (except the wp-cache or super cache part), at least not yet.
If you are courageous, you can try using nginx instead of Apache as your HTTP server. nginx is less resource hungry and performs better than Apache. nginx uses a different configuration file format. Expect to make lots of changes to make nginx behave exactly as Apache does. Don’t believe what you read on the net about running WordPress on nginx. While most of those guides are partially true, none of them covers all the bases. Their authors clearly haven’t tested the WordPress installation thoroughly. You cannot just cut-n-paste their configuration to run the Taragana network of sites, for example.
Soon MySQL will again become the bottleneck, irrespective of how powerful a server you use for your database. There are five distinct roads in front of you:
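As a rough picture of what the frontend proxy does, here is a minimal round-robin sketch in Python. Real proxies such as haproxy, pound or nginx do much more (health checks, weighting, connection handling); the class name and the backend names below are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order (illustrative only)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical Apache servers that all point to the same MySQL instance.
    balancer = RoundRobinBalancer(["web1:80", "web2:80", "web3:80"])
    for _ in range(5):
        print(balancer.next_backend())
```

Note that this only spreads the web server load. Every backend still talks to the same MySQL server.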
- MySQL clustering - high cost, high RAM requirement. Last I checked, it had the requirement of having the whole database resident in memory.
- Master-slave replication - Have one MySQL server for inserts & updates, with multiple slave servers to serve the pages. This requires changing one of the core WordPress files, wp-db, to re-direct the requests appropriately. WordPress.com uses this approach.
- Master-master replication - This allows all the MySQL databases to function as equals and do reads as well as inserts & updates. We have tested this option and it works well.
- Sharding - Breaking up the database across different machines. This could be very useful for WordPress MU, where you can move the non-shared tables of a user to their own database. However, this will have to be combined with other options when some users outgrow their servers. It is complicated to set up and will require lots of testing. It will most likely require you to change core WordPress code.
- Move different blogs to different servers and pray that none of them outgrows its server. This, like sharding, will need to be supplemented with the other methods discussed above.
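Two of the roads above, master-slave replication and sharding, come down to deciding which database server a query should go to. The sketch below shows that decision in Python under simplifying assumptions. It is not the WordPress wp-db code, and QueryRouter, shard_for_blog and the server names are hypothetical.

```python
import itertools


class QueryRouter:
    """Sends SELECTs to replicas in rotation and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        replicas = list(replicas)
        # With no replicas configured, every query goes to the master.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        # Only plain reads are safe to send to a replica; inserts, updates
        # and anything we do not recognize go to the master.
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


def shard_for_blog(blog_id, shards):
    """Pick a shard for a blog by simple modulo on its numeric id."""
    return shards[blog_id % len(shards)]


if __name__ == "__main__":
    router = QueryRouter("db-master", ["db-slave-1", "db-slave-2"])
    print(router.route("SELECT * FROM wp_posts LIMIT 10"))
    print(router.route("INSERT INTO wp_comments (comment_content) VALUES ('hi')"))
    print(shard_for_blog(7, ["shard-0", "shard-1", "shard-2"]))
```

Two caveats apply even to this toy version. A read sent to a slave right after a write may return stale data because of replication lag, and modulo sharding moves most blogs to a different shard whenever the number of shards changes.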
None of these will ensure that your server will not crash under very high loads. You just have to hope (and test) that your real load doesn’t exceed your capacity.
Simple solution for the complex WordPress problem…
I started looking for a simpler solution: an architecture that ensures the site never crashes, even at ridiculously high load, and that delivers tremendously high throughput (crossing the 10K requests per second barrier) on a single dedicated server. After some research we came up with a simple solution which solves both problems.
Let’s first look at the stats.
We served a copy of the Simple Thoughts blog (~3000 posts and 10000 comments) from a single server (dual-processor, quad-core 2 GHz machine with 4 GB RAM and a 2 x 250 GB RAID-1 array with an effective read speed of 14.4K RPM). We tested it with a simulation of live load, created from our log files, using httperf. We also used Apache Bench and our own Site Load Tester to test the setup comprehensively and to ensure that there wasn’t any mistake in the results.
We were able to serve over 10000 requests per second with 1.6Gbps throughput and without failures! We also handled over 20000 concurrent connections on this server without failures.
Never did the CPU usage go above 10%. We have room to grow.
Note: I didn’t test it beyond 20K concurrent connections because that will not be required in real life, as we will hit the bandwidth limit sooner.
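The bandwidth point can be checked with simple arithmetic. The sketch below derives the average response size implied by the reported figures (10000 requests per second at 1.6 Gbps) and then shows how many such responses a given link could carry. The 1 Gbps link is purely an assumption for illustration; the uplink speed is not stated here. Function names are invented for this example.

```python
def avg_response_bytes(throughput_bits_per_s, requests_per_s):
    """Average bytes per response implied by a throughput and a request rate."""
    return throughput_bits_per_s / 8 / requests_per_s


def max_requests_per_s(link_bits_per_s, response_bytes):
    """Upper bound on requests per second a link can carry for a given response size."""
    return link_bits_per_s / (response_bytes * 8)


if __name__ == "__main__":
    # Figures reported above: 1.6 Gbps at 10000 requests per second.
    size = avg_response_bytes(1.6e9, 10000)
    print(size)                              # 20000.0 bytes, about 20 KB per response

    # Assumption for illustration only: a 1 Gbps uplink.
    print(max_requests_per_s(1e9, size))     # 6250.0 requests per second
```

Under that assumed link speed, the network would cap out well below the request rate the server itself sustained, which is why bandwidth is expected to be the first limit hit.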
Java serves WordPress?
Yes, you heard it right. We are extensively using Java technology and the Grizzly server in our architecture to serve WordPress pages. It is an integral part of our architecture. What we are doing with Java cannot be accomplished with PHP. We are using nginx as well as FastCGI (for PHP).
So where is the dream WordPress setup now?
We are still rigorously testing the new setup on our new dedicated server in Chicago. We hope to go live next week on Simple Thoughts and other blogs.
BTW: You can get live updates about our progress and more if you follow me on twitter.