Apache Worker and PHP

Wed, Feb 13, 2008 01:09 AM

The PHP manual basically tells you not to use Apache 2 with a threaded MPM and PHP as an Apache module. In general, it may be good advice. But, at dealnews.com, we have found it very valuable.

Apache threaded MPMs

Well, first, what is an MPM? It stands for Multi-Processing Module. It is the process model that Apache uses for its child processes. Each request that comes in is handed to a child. Apache 1 used only one model for this, the prefork model, which uses one process per Apache child. The most commonly used threaded MPM is the Worker MPM. In this MPM, you have several processes that each run multiple threads. This is the one I will be talking about. You can read more on Apache MPMs at the Apache web site.

Huge memory savings

With the Apache prefork or even FastCGI, each apache/php process allocates its own memory. Most healthy sites I have worked on use about 15MB of memory per apache process. Code that has problems will use even more than this. I have seen some use as much as 50MB of RAM. But, let's stick with healthy. So, a server with 1GB of RAM will only realistically be able to run 50 Apache processes or 50 PHP children for FastCGI if each uses 15MB of RAM. That is 750MB total. That leaves only about 274MB for the OS and other applications. Now, if you are Yahoo! or someone else with lots of money and lots of equipment, you can just keep adding hardware. But, most of us can't do that.
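
The arithmetic in that paragraph can be sketched in a few lines of Python. The function name is made up for illustration, and treating 1GB as 1024MB is an assumption; the 15MB and 50-process figures are the ones quoted above.

```python
def prefork_memory_budget(total_ram_mb, mb_per_process, processes):
    """Return (MB used by the Apache/PHP processes, MB left over)."""
    used_mb = mb_per_process * processes
    return used_mb, total_ram_mb - used_mb


# 1GB server (assumed to mean 1024MB), 15MB per process, 50 processes.
used, left = prefork_memory_budget(1024, 15, 50)
print(f"used: {used}MB, left for the OS and everything else: {left}MB")
```

Plugging in 50MB per process for unhealthy code shows how quickly the same box runs out of room.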

As I wrote above, Apache's worker MPM uses children (processes) and threads. If you configure it to use 10 child processes, each with 10 threads, you would have 100 total threads or clients to answer requests. The good news is, because 10 threads are in one process, they can reuse memory that is allocated by other threads in the same process. At dealnews, our application servers use 25 threads per child. In our experience, each child process uses about 35MB of RAM. So, that works out to about 1.4MB per thread. That is less than 10% of the per-client memory usage of a prefork server.
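
A rough way to compare the two models is to divide the memory of one process by the number of clients it serves. The sketch below does that in Python with the figures from this post; the helper name is hypothetical, and real per-process memory will vary with the application.

```python
def mb_per_client(mb_per_process, clients_per_process=1):
    """Memory cost of one client slot: a process under prefork, a thread under worker."""
    return mb_per_process / clients_per_process


prefork = mb_per_client(15)      # one client per 15MB process
worker = mb_per_client(35, 25)   # 25 threads share one 35MB process

print(f"prefork: {prefork:.1f}MB per client")
print(f"worker:  {worker:.1f}MB per client")
print(f"worker needs {worker / prefork:.0%} of the prefork memory per client")
```

The comparison only holds if the threads really do share most of the process memory, which is something to measure on your own servers rather than assume.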

Some say that you will run out of CPU way before RAM. That was not what we experienced before switching to worker. Machines with 2GB of RAM were running out of memory before we hit CPU as a bottleneck due to having just 100 Apache clients running. Now, with worker, I am happy to say that we don't have that problem.

Building PHP for best success with Worker

This is an important part. You can't use radical extensions in PHP when you are using worker. I don't have a list of extensions that will and won't work. We stick with the ones we need to do our core job. Mainly, most pages use the mysql and memcached extensions. I would not do any fancy stuff in a worker-based server. Keep a prefork server around for that. Or better yet, do funky memory-sucking stuff in a cron job and push that data somewhere your web servers can get to it.

Other benefits like static content

Another big issue you hear about with Apache and PHP is running some other server for serving static content to save resources. Worker allows you to do this without running two servers. Having a prefork Apache/PHP process that has 15MB of RAM allocated serve a 10k JPEG image or some CSS file is a waste of resources. With worker, like I wrote above, the memory savings negate this issue. And, from my benchmarks (someone prove me wrong), Apache 2 can keep up with the lighttpds and litespeeds of the world in terms of requests per second for this type of content. This was actually the first place we used the worker MPM. It may still be a good idea to have dedicated Apache daemons running just for that content if you have lots of requests for it. That will keep your static content requests from overrunning your dynamic content requests.

Some issues we have seen

Ok, it is not without problems (but, neither was prefork). There are some unknown (meaning undiagnosed by us) things that will occasionally cause CPU spikes on the servers running worker. For example, we took two memcached nodes offline and the servers that were connected to them spiked their CPU. We restarted Apache and all was fine. It was odd. We had another issue where a bug in my PHP code, which was calling fsockopen() without a valid host name and with a long timeout, would cause a CPU spike that would not seem to let go. So, it does seem that a worker server is more sensitive to bad PHP code. Your mileage may vary.

As with any new technology, you need to test a lot before you jump in with both feet. Anyone else have experience with worker and want to share?

One last tip

We have adopted a technique that Rasmus Lerdorf had mentioned. We decide how many MaxClients a server can run and we configure that number to always run. We set the min and max settings of the Apache configuration the same. Of course, we are running service-specific servers. If you only have one or two servers and they run Apache and MySQL and mail and DNS and... etc. you probably don't want to do that. But, then again, you need to make sure MaxClients will not kill your RAM/CPU as well. I see lots of servers that, if MaxClients was actually reached, would be using 20GB of RAM. And, these servers only have 2GB of RAM. So, check those settings. If you can, configure it to start up more (all if you can) Apache processes rather than a few and make sure you won't blow out your RAM.
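
A minimal sketch of that check, in Python. The function name and the 80% headroom figure are assumptions for illustration, not settings from any real server. The directive names are Apache 2.2 worker MPM directives, and the sample numbers (25 threads per child, about 35MB per process, an illustrative 10 processes on a 2GB box) reuse figures from this post.

```python
def fixed_worker_pool(processes, threads_per_child, mb_per_process, ram_mb,
                      headroom=0.8):
    """Build worker MPM settings for a pool that is always fully started.

    headroom is the (assumed) fraction of RAM Apache is allowed to use.
    Raises ValueError if the full pool would not fit in that budget.
    """
    max_clients = processes * threads_per_child
    worst_case_mb = processes * mb_per_process
    if worst_case_mb > ram_mb * headroom:
        raise ValueError(
            f"{max_clients} clients need about {worst_case_mb}MB, "
            f"but only {ram_mb * headroom:.0f}MB is budgeted"
        )
    # Min and max are set to the same value so the pool never grows or shrinks.
    return {
        "StartServers": processes,
        "ThreadsPerChild": threads_per_child,
        "MaxClients": max_clients,
        "MinSpareThreads": max_clients,
        "MaxSpareThreads": max_clients,
    }


settings = fixed_worker_pool(processes=10, threads_per_child=25,
                             mb_per_process=35, ram_mb=2048)
for directive, value in settings.items():
    print(directive, value)
```

A prefork-style server can be checked the same way by passing threads_per_child=1. With 1,365 processes at 15MB each on a 2GB box, which is roughly the 20GB case described above, the function raises an error instead of returning settings.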

25 comments

PhYrE Says:

How very misguided.

> Most healthy sites I have
> worked on use about 15MB of
> memory per apache process.
Most UNIX variants, and actually most operating systems these days, don't actually use 15MB of memory (as you say) even though they say that they do. The applications themselves are shared across the operating system. The actual program is loaded only once into memory, and then mapped into each process space. Should a program change the code in memory, it creates a copy of the program and then goes from there. Not much should change the program code itself. So apache and PHP are loaded only once.

The variable data that is process specific is not significantly larger than in the case of a threaded API. The rest is all mapped by the operating system without using any more memory.

> Savings
The main savings is reducing the connection pool to databases, and using a lighter weight context switch (that has to swap out less) in switching threads instead of switching programs.

That said, good use of resources in a program fixes the first. A modern operating system on modern processors is actually very fast in its context switches. With multi-core systems and processors that are built with large-scale multitasking in mind, as well as efficient operating system scheduling, you'll find this is less of an issue.

> Pitfalls
When something fails, your whole program goes down. I'd rather a process crash and be done with it. It makes more sense from an isolation perspective and protecting a large-scale site.

> Threading libraries
Personally, I trust the operating system's scheduler a little more than anything built into a library or program. The OS by design needs to be robust and a nice balance of performance versus safety. Threading modules just emulate a scheduler in software, but without the advanced knowledge of how the kernel is performing and what it can schedule around or with.

> FastCGI
The end effect is minimal compared to a mod_php system. PHP is pre-launched by FastCGI, but the context switches needed to pipe back and forth between Apache and PHP take much more time than just sending it straight out to the client from the Apache process. On the other hand, it's not loaded when sending an image using sendfile, which can reduce memory usage. So yes, if you have a mix of PHP and static files, FastCGI can improve memory consumption by removing PHP. If your traffic isn't consistent, or if you have a lot of PHP, then you risk having lots of PHP processes, in which case you might as well have just kept PHP in Apache for simplicity.

-M

doughboy Says:

@PhYrE:

The bulk of memory used by PHP is process specific. At start up, a single Apache/PHP process is only about 4MB in size. And yes, that can be shared on some OSes. However, when a PHP process needs an additional 20MB of memory (due to weird internals of PHP sometimes), that memory will not be shared. However, the threaded Apache APIs will allow PHP to use one pool of memory for several threads. So, when one thread needs 100MB for some odd reason, it is not locked to that one process forever (in Linux here. BSD will release it back to the OS). It is still locked to that Apache process, but multiple PHP threads can utilize that memory space. In theory, if every request needed that much memory, you could see a process get very large. However, theory and reality differ in this case.

Olaf van der Spek Says:

> With the Apache prefork or even FastCGI, each apache/php process allocates its own memory.

But with FastCGI, your Apache processes will be a lot smaller.

> You can’t use radical extensions in PHP when you are using worker.

You can if you use FastCGI and worker.

> The PHP manual basically tells you not to use Apache 2 with a threaded MPM and PHP as an Apache module.

FastCGI. ;)
The added advantage is that if PHP crashes for whatever reason, your web server isn't affected.

Stuart Herbert Says:

Thanks for the tip. I think the point about having to fit onto more modest amounts of hardware is particularly important - especially over here in the UK, where hosting is still extremely expensive.

Harry Roberts Says:

If you're going to use FastCGI, why not use another server with a much smaller core that's pretty much optimised for handling static content and marshalling requests between the browser & your FastCGI stuff?

Re: PHP crashing Apache, you're only going to take (in Brian's case) a maximum of 25 other clients out at once. It's not ideal, but if you look at it from another perspective you could limit yourself to 3 threads per process and benefit from less memory usage and handle slightly more clients than prefork.

Roland Bouman Says:

Hi! Interesting information! Couple of typos:

"So, a server with 1GB of RAM will only realistically be able to run 50 Apache processes or 50 PHP children for FastCGI if each uses 15GB or RAM."

-> 15MB

"And, these servers only have 2B of RAM."

-> 2GB

Laph Says:

There's another advantage of the FastCGI approach: You can separate the I/O-bound load (serving files) from the CPU/RAM-bound load (PHP). This kind of setup is very easy and effective with lighty's FastCGI implementation.

till Says:

How many requests per second does a single server handle on your setup?

doughboy Says:

On the above quoted settings, 4 to 5 thousand. We have only seen those numbers by using services like Keynote to stress test our servers. We actually ran out of bandwidth before those servers hit any noticeable limit. Those are dual core opteron boxes with 4GB RAM.

Michael Moody Says:

Could you perhaps post your worker-mpm config? I've had some trouble tuning the variables for threads/clients, etc, coming from a prefork background. My prefork is working fine, but we want to move to worker in some locations, mostly starting with our static content. (If you'd like, you can simply email me the relevant apache configuration directives).

Very much appreciated reading.

Thanks,
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com

doughboy Says:

Our "lightweight" apache servers run this config:

    StartServers 16
    MaxClients 1024
    MinSpareThreads 1024
    MaxSpareThreads 1024
    ThreadsPerChild 64
    MaxRequestsPerChild 0

Each process uses about 90MB of RAM. This serves static content and simple PHP code for redirecting and for proxying to the heavier application servers.

rebsue1 Says:

Love the Pillsbury Doughboy, it's a marketing image that never gets old

doughboy Says:

Thanks Roland. That is what happens when I blog at 1AM.

doughboy Says:

@Olaf:

Yes, but your PHP memory usage is the problem. Apache prefork without PHP in it is only about 1.1MB per process.

Game Start Says:

Why don't you start a discussion with this guy:
http://neosmart.net/blog/2008/dont-believe-the-lies-php-isnt-thread-safe-yet/

PhYrE Says:

@doughboy:
It is indeed correct that a free does not release the memory to the operating system on Linux, but keeps it within the process. This impact is often mitigated by using the Apache setting that caps the number of requests served per child, where an apache process will die and be recreated after 1000, 10000, etc. requests. That said, it isn't perfect, as you are still wasting memory. You are correct here. The memory is only reused for the same application's requests for memory. On a system that serves largely the same content, the memory usage should not fluctuate dramatically. Each request should require similar amounts of memory, served by the portion allocated to the process. The trouble really only comes in when you have a big one-off script that uses plenty of memory. MaxRequestsPerChild helps there, as does the apache_child_terminate() function in PHP (I don't know why this hasn't been implemented for Apache2), which lets a developer knowingly flag when a big script has run.

You're right on the memory release front, but in a real-world Web situation, most scripts on a given site use very similar memory requirements. Additionally the odds of getting the same type of request again very soon and needing that same memory is high.

-M

Apache Worker et PHP Says:

[...]  Apache Worker and PHP (0 visite) [...]

Installing Apache2 and PHP5 using mod_fcgid | Ivan Says:

[...] now there is FCGI that much better handling CGI applications. Also by using this version of PHP, Apache will be less loaded because it will only handles static elements (html pages, images, css, etc). All PHP requests will [...]

doughboy Says:

@PhYrE:
In my real world, Apache+PHP in prefork mode would use about 14MB of memory per process. 4MB of that was daemon overhead.

With worker, I see a process using 36MB of memory for 25 threads. So, there is a big savings for me and the applications I run.

As I said, you need to test a lot before taking this on. I am confident there are some applications out there that would not work well with Apache+PHP and worker. But, ours run fine.

What can do a Threaded server for a memory hungry Says:

[...] http://doughboy.wordpress.com/2008/02/13/apache-worker-and-php/ Brian Moon explain how switching to a Worker based MPM helped to run dealnews.com smoothly. [...]

Brian Moon Says:

@jmccarrell I just run a ps and look. It is observed behavior. I will do it now..

On my proxy servers, I have 16 processes with 64 threads each. That gives me 1024 children on that server. Each process uses ~85MB of RAM. They have been running for months. So, that works out to about 1.3MB per child. These boxes are typically dual core / dual processor Opterons with 4GB of RAM.

On our app servers, we have 10 processes with 25 children each. Each process consumes about 42MB of RAM. Again, running a long time. So, for that server, each child costs us about 1.7MB of RAM. These boxes are typically dual core / dual processor Opterons with 4GB of RAM.

This is the only real way to do it as it is the application that will use all the memory on a Linux system. The processes start at only about 6MB each on a restart.

This was all done on Linux systems. It is true that there is nothing fancy about threading on Linux. But, this is about memory savings for us. We have plenty of CPU to spare. Especially at the proxy. That application is very low CPU.

FWIW, that is one place I am very impressed with Nginx. In my test of MemProxy (our proxy application, http://code.google.com/p/memproxy/) nginx showed much less CPU usage than Apache with worker. If CPU ever becomes a bottleneck, that may be a move we make.

Brian Moon Says:

What FastCGI does is good and bad. The bad is that you still have per process PHP engines. So, for an app server that does all PHP, you need 100 or so PHP processes. Those take up the same basic amount of memory that the Apache+PHP setup does. If you can get by with only 10 PHP processes with FastCGI, then you likely had your Apache configured wrong. It is not Apache's fault that you had it configured to have 256 MinSpare. That is your fault. The other big thing I hear about is that lighttpd and nginx serve the images straight up. That is true and fine. I ran both on one server and had the single process lightweight web server (mathopd actually) serve my images and static content while I had Apache serve my dynamic content.

Of course, FWIW, once you are really high traffic enough to matter, you should be using a CDN. The bandwidth is likely cheaper and you don't have to worry about what process is being wasted serving an image.

IT_Architect Says:

Today, people don't think twice about throwing 8 cores and 4 gigs of RAM at a web server. PHP and MySQL dominate the dynamic web space with no challengers in sight. The recent Apache 2.2 and PHP 5.2 builds are very good. Running Apache with the throughput of a static page server is alluring. The question I don't see answered here is whether, in today's world, the advantages of moving PHP to FastCGI outweigh the drawbacks. With FastCGI you succeed in moving the processes from Apache to FastCGI. FastCGI has all of the advantages of a persistent process. However, FastCGI is itself a thread of the OS but with its own far less sophisticated threading system for connection processes. Apache and PHP now need to communicate with each other through a pipe. There is no automatic cleanup like there is with keep-alive timeouts. With FastCGI, there is risk and complexity. The advantages of FastCGI need to be pretty compelling to consider it.

jmccarrell Says:

Would you care to reveal the OS these numbers were achieved on?

I often hear the assertion that threading on *nix, especially linux, isn't worth the effort. IIRC, Solaris did a good job on their threading implementation. For linuxen that support NPTL, I believe the benefits of threading are measurable, if not significant. Would you agree?

To perhaps answer my previous question about where the memory numbers come from, on our production servers under load, top reports a RES resident set size of approx 25meg with a SHR number of 3468, which seems consistent with your report. Am I guessing right here?

jmccarrell Says:

Thanks for the info on prefork vs. worker MPMs. The real numbers are very helpful to me as I look at what it might mean for us to make a similar jump to worker, or maybe even the event MPM.
Question: I want to similarly estimate the memory use of apache + the various modules on my production servers.
How did you arrive at the prefork == 14MB total / 4MB daemon and worker == 36MB numbers?
