How to Stop Crashing / Hanging of php-cgi / spawn-fcgi with nginx / lighttpd

By Angsuman Chakraborty, Gaea News Network
Tuesday, March 24, 2009

One of the most frequent complaints about running PHP (through the FastCGI interface) on nginx or lighttpd, two popular and rapidly growing web servers that are faster and lighter alternatives to big daddy Apache, is that the pool of php-cgi processes that ultimately serves the PHP pages seems to hang frequently and for no apparent reason. This is why the most common error you will see on nginx websites is the dreaded 504 - Gateway Time out. These errors seem to happen without much of a pattern, and it took us some time to find the cause and fix it. In this article we will show you not only how to get rid of this error but also how to provide pre-defined QoS for different PHP based services, so that even if one pool of FastCGI servers goes down, the others continue to operate unaffected. In short, if you have a popular website or blog running on nginx or lighttpd, you should read it. Please mention it in your favorite social bookmarking service and digg it so others can see it too.

How to run Nginx with PHP - An Overview

nginx (and lighttpd) supports PHP using the FastCGI protocol. PHP supports FastCGI through the php-cgi binary. However, for ease of configuration, almost everywhere you will find the recommendation to use spawn-fcgi or a similar wrapper. spawn-fcgi wraps the php-cgi command line in an easy-to-use interface. After starting php-cgi, spawn-fcgi exits and has no further control over it.
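As a rough sketch of what the wrapper saves you, the two commands below start the same kind of pool, first with php-cgi directly and then through spawn-fcgi. The address, port, pool size, and file paths are assumptions for illustration, and the script only prints the commands so that nothing is started by accident.

```shell
#!/bin/sh
# Illustrative values only; adjust them for your own system.
PHP_CGI=/usr/bin/php-cgi   # assumed install path
ADDR=127.0.0.1
PORT=9000
CHILDREN=4

# Direct form: php-cgi reads the pool size from PHP_FCGI_CHILDREN and
# binds to the address given with -b.
direct_cmd() {
    echo "PHP_FCGI_CHILDREN=$CHILDREN $PHP_CGI -b $ADDR:$PORT"
}

# Wrapper form: spawn-fcgi binds the socket (-a/-p), sets the child
# count (-C), starts php-cgi (-f), writes a pid file (-P), then exits.
wrapped_cmd() {
    echo "spawn-fcgi -a $ADDR -p $PORT -C $CHILDREN -f $PHP_CGI -P /var/run/php-cgi.pid"
}

direct_cmd
wrapped_cmd
```

Either way, the processes that end up talking to the web server are php-cgi processes.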

A newer option is php-fpm, a PHP FastCGI process manager. However, it requires custom compilation of PHP from source code, which may not be everyone’s cup of tea. Also, when we started migrating our system from Apache to nginx, that was one risk we didn’t want to take. We went with the simpler and safer option of using spawn-fcgi, which internally uses php-cgi as described above.

The little hanging problem with spawn-fcgi / php-cgi

The biggest problem with using php-cgi through spawn-fcgi is that the php-cgi pool often hangs for no apparent reason. It is not a temporary problem and it doesn’t auto-recover. This leads to the famous 504 - Gateway Time out error on nginx servers. In this article we will present a two-pronged approach to solve this problem and additionally give more reliability to critical services.

Why spawn-fcgi is not responsible for hanging

All reports of spawn-fcgi hanging or crashing (that you will find through Dr. Google) are incorrect. spawn-fcgi is just a wrapper. Any problems are caused by, and happen in, php-cgi. php-cgi is responsible for starting a bunch of php-cgi processes as workers and managing them. php-cgi implements the FastCGI protocol and communicates directly with the nginx / lighttpd web server.

The single biggest lacuna of spawn-fcgi is that it doesn’t directly provide a way (through options) to specify the maximum number of connections to be served per process, before renewal.

How to protect critical PHP services and guarantee QoS

On our web server, we run several services. Some of them, like serving web pages, are very critical and have to be running all the time.

There are others, like the commenting function, which are important but not as critical. Comment spamming is one of the greatest pains of running a blog. Evil spammers hammer the servers, often dozens at a time, and can bring down even a high-end server. While we want to allow everyone to submit comments, we don’t want commenting to take over the server and prevent others from viewing the web pages. So we created a separate pool for it.

Similarly, we identified three categories of services with different priorities and created three php-cgi pools with a different number of processes in each. We wanted to ensure that each of them ran independently of the load generated by the others, and that if one of the services became unavailable due to the hanging issue mentioned above, the others would still continue to function.
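One way to lay out such pools is sketched below. The pool names, ports, process counts, and paths are hypothetical (the article does not list the real ones). The script prints the spawn-fcgi commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
PHP_CGI=/usr/bin/php-cgi        # assumed install path
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands

# start_pool NAME PORT CHILDREN
# Starts (or prints) one independent php-cgi pool on its own port.
start_pool() {
    name=$1; port=$2; children=$3
    set -- spawn-fcgi -a 127.0.0.1 -p "$port" -C "$children" \
        -f "$PHP_CGI" -P "/var/run/php-cgi-$name.pid"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical split: most workers go to the most critical service.
start_pool pages    9000 16
start_pool comments 9001 4
start_pool misc     9002 2
```

On the web server side, each group of URLs would then be routed to the matching port (in nginx, through a separate fastcgi_pass target per location), so a hung comments pool cannot take the pages pool down with it.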

With this setup we can guarantee QoS to certain critical PHP based web services. However that doesn’t solve the core problem of hanging, does it?

Why php-cgi hangs and how to solve it

While researching the topic across the web, it soon became apparent that php-cgi has nagging stability issues under prolonged use. It appears the simplest way to ensure stability while using php-cgi is to recycle processes frequently. Unfortunately, spawn-fcgi doesn’t provide a simple way to do it. We found an important but less advertised environment variable that php-cgi recognizes: PHP_FCGI_MAX_REQUESTS

This variable, when set, specifies the maximum number of requests served by a php-cgi process before it is killed and a new process is spawned. We have set it to 1000 and it seems to be working fine with this setting, even under high loads and prolonged use. You can just set (export in bash) this environment variable before calling spawn-fcgi and that’s it. We have simplified the setup using a script and so can you.
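A minimal sketch of such a script follows. PHP_FCGI_MAX_REQUESTS and the value 1000 come from the text above; the address, port, child count, and paths are assumptions. As written it only prints the command; set DRY_RUN=0 to launch the pool.

```shell
#!/bin/sh
# Recycle each php-cgi worker after it has served this many requests.
PHP_FCGI_MAX_REQUESTS=1000
export PHP_FCGI_MAX_REQUESTS    # must be exported so php-cgi inherits it

DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the command

launch() {
    # Assumed address, port, pool size, and paths.
    set -- spawn-fcgi -a 127.0.0.1 -p 9000 -C 8 \
        -f /usr/bin/php-cgi -P /var/run/php-cgi.pid
    if [ "$DRY_RUN" = 1 ]; then
        echo "PHP_FCGI_MAX_REQUESTS=$PHP_FCGI_MAX_REQUESTS $*"
    else
        "$@"
    fi
}

launch
```

If you start php-cgi from an init script instead, the same export line would go before the point where that script calls spawn-fcgi.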

To summarize…

We recommend a two-pronged strategy for ensuring stability and guaranteed QoS for PHP applications running on Nginx and Lighttpd:

  1. Create a separate pool, with a different number of processes in each, for each major group of services / applications.
  2. Set the maximum number of requests to execute per php-cgi process.

Cheap Gold
August 10, 2010: 2:43 pm

After reading it again, I understand finally. Thanks

August 10, 2010: 2:40 pm

I still don’t understand why spawn-fcgi is not responsible for hanging?

August 10, 2010: 2:06 pm

Creating a separate pool with differing number of processes should work.

August 10, 2010: 2:03 pm

I agree with your two pronged strategy

August 6, 2010: 7:28 pm

I like your website

June 7, 2010: 1:54 am

excellent insight…let people know what happened and how to prevent!:)

June 10, 2009: 7:15 pm

This is the exact problem I have had in the last 4 months with no help from forums. Glad you explained it so well. I have a question: I am using /etc/init.d/php-fastcgi. Just where do I set this maximum value? I just don’t know where.
