Re: CPU bound at 99% - Mailing list pgsql-performance

From PFC
Subject Re: CPU bound at 99%
Msg-id op.t90t5yumcigqcu@apollo13.peufeu.com
In response to CPU bound at 99%  (Bryan Buecking <buecking@gmail.com>)
List pgsql-performance
> about 2300 connections in idle
> (ps auxwww | grep postgres | idle)

    [...]

> The server that connects to the db is an apache server using persistent
> connections. MaxClients is 2048 thus the high number of connections
> needed. Application was written in PHP using the Pear DB class.

    This is pretty classical.
    When your number of threads gets out of control, everything gets slower,
so more requests pile up, spawning more threads ; this is positive
feedback, and in seconds all hell breaks loose. That's why I call it
imploding, as if the server were collapsing under its own weight. There is
a threshold effect : once you pass the threshold, it goes from working
well to a crawl rather quickly, as you experienced.

    Note that the same applies to Apache, PHP as well as Postgres : there is
a "sweet spot" in the number of threads, for optimum efficiency, depending
on how many cores you have. Too few threads, and it will be waiting for IO
or waiting for the database. Too many threads, and CPU cache utilization
becomes suboptimal and context switches eat your performance.

    This sweet spot is certainly not at 500 connections per core, either for
Postgres or for PHP. It is much lower, about 5-20 depending on your load.

    I will copypaste here an email I wrote to another person with the exact
same problem, and the exact same solution.
    Please read this carefully :

*********************************************************************

Basically there are three classes of websites in my book.
1- Low traffic (ie a few hits/s on dynamic pages), when performance
doesn't matter
2- High traffic (ie 10-100 hits/s on dynamic pages), when you must read
the rest of this email
3- Monster traffic (lots more than that) when you need to give some of
your cash to Akamai, get some load balancers, replicate your databases,
use lots of caching, etc. This is yahoo, flickr, meetic, etc.

Usually people whose web sites are imploding under load think they are in
class 3, but really most of them are in class 2, merely using inadequate
technical solutions like MySQL, etc. I had a website with 200K members
that ran perfectly fine on a Celeron 1200 with 512 MB RAM, and lighttpd
wasn't even visible in top.

Good news for you is that the solution to your problem is pretty easy. You
should be able to solve that in about 4 hours.

Suppose you have some web servers for static content ; obviously you are
using lighttpd for that, since it can service an "unlimited" number of
concurrent connections (up to the OS limit, something like 64K sockets).
You could also use nginx or Zeus. I think Akamai uses Zeus. But lighttpd
is perfectly fine (and free). For your static content servers you will
want lots of RAM for caching. If you serve images, put the small files
like thumbnails, css, javascript and html pages on a separate server so
that they are all served from RAM ; a cheap CPU is enough there, since a
Pentium-M with lighttpd will happily push 10K http hits/s if it doesn't
wait for IO. Large files should go on a second static server, to avoid
cache thrashing on the server which holds all the frequently accessed
small files.

Then you have some web servers for generating your dynamic content. Let's
suppose you have N CPU cores total.
With your N cores, the ideal number of threads would be N. However those
will also wait for IO and database operations, so you want to fill those
wait times with useful work, so maybe you will use something like 2...10
threads per core. This can only be determined by experimentation, it
depends on the type and length of your SQL queries so there is no "one
size fits all" answer.

Example. You have pages that take 20 ms to generate, and you have 100
requests for those coming up. Let's suppose you have one CPU core.

(Note : if your pages take longer than 10 ms, you have a problem. On the
previously mentioned website, now running on the cheapest Core 2 we could
find since the torrent tracker eats lots of CPU, pages take about 2-5 ms
to generate, even the forum pages with 30 posts on them. We use PHP with
compiled code caching and the SQL is properly optimized. And, yes, it
uses MySQL.) Once I wrote (as an experiment) an extremely simple forum
which did 1400 pages/second (which is huge) with a desktop Core2 as the
Postgres 8.2 server.

- You could use Apache in the old-fashioned way, with 100 threads : all
your pages will then take 20 ms x 100 = 2 seconds.
But the CPU cache utilisation will suck because of all those context
switches, you'll have 100 processes eating your RAM (count 8 MB for a PHP
process), 100 database connections, 100 postgres processes ; the locks
will stay held longer, transactions will last longer, you'll get more
dead rows to vacuum, etc.
And actually, since Apache does not buffer the output of your scripts,
the PHP or Perl interpreter will stay in memory (and hog a database
connection) until the client at the other end of the internets has loaded
all the data. If the guy has DSL, this can take 0.5 seconds ; if he has
56K, much longer. So you are likely to get much more than 100 processes
in your Apache, perhaps 150, or perhaps even 1000 if you are out of luck.
In this case the site usually implodes.

- You could have a lighttpd or squid proxy handling the client
connections, then funnelling that to a few threads generating the
webpages. Then, you don't care anymore about the slowness of the clients
because they are not hogging threads anymore. If you have 4 threads, your
requests will be processed in order, first come first served, 20 ms x 4 =
80 ms each average, the CPU cache will work better since you'll get much
less context switching, RAM will not be filled, postgres will be happy.
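The arithmetic above can be sketched quickly. This is a toy back-of-the-envelope calculation, not a benchmark ; the 20 ms / 100 requests / 1 core numbers are the ones from the example :

```python
# Toy calculation for the example above: 100 queued requests of
# 20 ms CPU each, on a single core.
service_ms = 20.0
n_requests = 100

# Apache-style: 100 threads time-slice the one core, so every
# request finishes only when all the work is done, ~2000 ms each.
apache_latency_ms = n_requests * service_ms

# Proxy + small FIFO worker pool: request i completes after
# (i + 1) * 20 ms, so the first replies leave after only 20 ms and
# the average is about half the total, with far less context
# switching and RAM use.
fifo_latencies = [(i + 1) * service_ms for i in range(n_requests)]
fifo_avg_ms = sum(fifo_latencies) / n_requests

print(apache_latency_ms, fifo_avg_ms)  # 2000.0 1010.0
```

Either way the core does the same 2 seconds of work ; the FIFO just stops every request from paying for all the others.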

> So, the front-end proxy would have a number of max connections, say 200,

Number of connections to clients => don't set any values, sockets are free
in lighttpd.
Number of connections to PHP/fastcgi or apache/mod_perl backends => number
of cores x 2 to 5, adjust to taste
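For reference, a hypothetical lighttpd.conf fragment for the backend side might look like this (paths and sizes are illustrative ; check the mod_fastcgi documentation for your version) :

```
# spawn a small, fixed pool of PHP workers ; client sockets are
# handled separately by lighttpd and cost almost nothing
fastcgi.server = ( ".php" => ((
    "bin-path"        => "/usr/bin/php-cgi",
    "socket"          => "/tmp/php.socket",
    "max-procs"       => 2,    # spawned processes, roughly per core
    "bin-environment" => ( "PHP_FCGI_CHILDREN" => "4" )  # workers each
)))
```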

>   and it would connect to another httpd/mod_perl server behind with a
> lower number of connections, say 20. If the backend httpd server was
> busy, the proxy connection to it would just wait in a queue until it was
> available.

     Yes, it waits in a queue.

> Is that the kind of design you had in mind?

     Yes.
     The two key points are that :
     * Perl/PHP processes and their heavy resources (database connections,
RAM) are used only when they have work to do and not waiting for the
client.
     * The proxy must work this way :
         1- get and buffer request data from client (slow, up to 500 ms, up
to 2000 ms if user has emule or other crap hogging his upload)
         2- send request to backend (fast, on your LAN, < 1 ms)
         3- backend generates HTML and sends it to proxy (fast, LAN), proxy
buffers data
         4- backend is now free to process another request
         5- proxy sends buffered data to client (slow, up to 100-3000 ms)
     The slow parts (points 1 and 5) do not hog a perl/PHP backend.
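     The five steps above can be sketched as follows (a toy illustration with made-up names, not a real proxy) :

```python
import io

def proxy_handle(client_request_chunks, backend):
    # 1- buffer the complete request from the (slow) client first
    request = b"".join(client_request_chunks)
    # 2- only then send it to a backend, over the fast LAN
    # 3- the backend generates the HTML and hands it back at once
    response = backend(request)
    # 4- the backend is already free to take another request ;
    # 5- the proxy drips the buffered bytes to the client at the
    #    client's own speed, holding no backend resources meanwhile
    return io.BytesIO(response)

page = proxy_handle([b"GET / HT", b"TP/1.0\r\n\r\n"],
                    lambda req: b"<html>hello</html>")
print(page.read())  # b'<html>hello</html>'
```

The backend's busy time is bounded by step 3, not by the client's link speed.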

     Do not use a transparent proxy ! The proxy must buffer requests and
data for this to work. Backends must never wait for the client. Lighttpd
will buffer everything, I believe Apache can be configured to do so. But I
prefer to use lighttpd for proxying, it is faster and the queuing works
better.

     Also, if you can use FastCGI, use it. I have never used mod_perl, but
with mod_php, you have a fixed startup cost every time a PHP interpreter
starts. With fastcgi, a number of PHP interpreter threads are spawned at
startup, so they are always ready, the startup cost is much smaller. You
can serve a small AJAX request with 1-2 database queries in less than 1 ms
if you are careful with your code (like, no heavyweight session
initialization on each page, using mmcache to avoid reparsing the PHP
every time, etc).

     If you have several backend servers generating webpages, use sticky
sessions and put the session storage on the backends themselves. If you
use file-based sessions, use ReiserFS rather than ext3, which sucks when
you have a large number of session files in the same directory. Or use
memcached, whatever, but don't put sessions in the database : that gives
you a nice tight bottleneck when adding servers. If each and every one of
your pages runs an UPDATE query on the sessions table, you have a problem.

     As for why I like lighttpd, I am fond of the asynchronous select/poll
model for a webserver which needs to handle lots of concurrent
connections. When you have 50 open sockets, threads are perfectly fine ;
when you have 1000, a threaded server will implode. I wrote a bittorrent
tracker in Python using an asynchronous select/poll model ; it has been
handling about 150-400 HTTP hits per second for two years now, it has
about 100-200 concurrent open sockets 24 hours a day, and the average
lifetime of a socket connection is 600 ms. There are 3 threads (webserver,
backend, deferred database operations) with some queues in between for the
plumbing. Serving an /announce HTTP request takes 350 microseconds of CPU
time. All using a purely interpreted language, lol. It uses half a core on
the Core 2 and about 40 MB of RAM.
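    The asynchronous model boils down to one loop asking the OS which sockets are ready, instead of parking a thread on each one. A minimal sketch using Python's standard selectors module, with a socketpair standing in for a real accepted connection :

```python
import selectors
import socket

sel = selectors.DefaultSelector()          # epoll/kqueue/poll/select
client, server_side = socket.socketpair()  # stand-in for a client socket
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ)

client.sendall(b"GET /announce")           # a "client" writes a request

# the single server thread sleeps here until *some* socket is ready,
# no matter how many thousands are registered
events = sel.select(timeout=1)
for key, _mask in events:
    data = key.fileobj.recv(4096)          # never blocks: it is ready

client.close(); server_side.close()
```

One such loop replaces the thousand threads, which is why the per-connection cost stays near zero.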

    When lighttpd is overloaded (well, it's impossible to kill it with static
files unless it waits for disk IO, but if you overload the fastcgi
processes), requests are kicked out of the queue, so for instance it will
only serve 50% of the requests. But an overloaded apache will serve 0%
since you'll get 1000 threads, it'll swap, and everything will timeout and
crash.

********************************************************

End of copypaste.

    So :

    - You need to get less Postgres connections to let Postgres breathe and
use your CPU power to perform queries and not context switches and cache
management.
    - You need to get less PHP threads which will have the same effect on
your webserver.

    The way to do this is actually pretty simple.

    - Frontend proxy (lighttpd), load balancer, whatever, sending static
requests to static servers, and dynamic requests to dynamic servers. If
the total size of your static files fits in the RAM of this server, make
the static server and the proxy the same lighttpd instance.

    - Backends for PHP : a number of servers running PHP/fastcgi, no web
servers at all, the lighttpd frontend can hit several PHP/fastcgi backends.

    - Use PHP persistent connections (which now appear to work in the latest
version, in fastcgi mode ; I don't know about mod_php's persistent
connections though).
    - Or use pgpool or pgbouncer or another connection pooler, but only if
PHP's persistent connections do not work for you.
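    If you go the pooler route, the idea is the same : thousands of client connections share a few dozen real backends. A hypothetical pgbouncer.ini sketch (names and sizes are illustrative) :

```
[databases]
mydb = host=127.0.0.1 dbname=mydb

[pgbouncer]
listen_port = 6432
; a client holds a real backend only while it needs one
pool_mode = transaction
; a few real Postgres connections per core, not 2300
default_pool_size = 20
; Apache's MaxClients can stay high ; these sockets are cheap
max_client_conn = 2048
```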

> 1:  Each apache / php process maintains its own connections, not
> sharing with others.  So it's NOT connection pooling, but people tend
> to think it is.

    True with mod_php (and sad). With fastcgi, you don't really care, since
the PHP processes are few and are active most of the time, no connection
hogging takes place unless you use many different users to connect to
postgres, in which case you should switch to pgpool.

> 2:  Each unique connection creates another persistent connection for
> an apache/php child process.  If you routinely connect to multiple
> servers / databases or as > 1 user, then each one of those
> combinations that is unique makes another persistent connection.

    True also for fastcgi, but if you don't do that, no problem.

> 3:  There's no facility in PHP to clean an old connection out and make
> sure it's in some kind of consistent state when you get it.  It's in
> exactly the same state it was when the previous php script finished
> with it.  Half completed transactions, partial sql statements,
> sequence functions like currval() may have values that don't apply to
> you.

    Apparently now fixed.

> 4:  pg_close can't close a persistent connection.  Once it's open, it
> stays open until the child process is harvested.

    Don't know about that.

> 5:  Apache, by default, is configured for 150 child processes.
> Postgresql, and many other databases for that matter, are configured
> for 100 or less.

    (and for good reason)

> Even if apache only opens one connection to one
> database with one user account, it will eventually try to open the
> 101st connection to postgresql and fail.  So, the default
> configuration of apache / postgresql for number of connections is
> unsafe for pconnect.

    fastcgi makes this problem disappear by separating the concept of "client
connection" from the concept of "web server thread". Not only will it make
Postgres happier, your PHP processing will be much faster too.

> 6:  The reason for connection pooling is primarily to twofold.  One is
> to allow very fast connections to your database when doing lots of
> small things where connection time will cost too much.  The other is
> to prevent your database from having lots of stale / idle connections
> that cause it to waste memory and to be slower since each backend
> needs to communicate with every other backend some amount of data some
> times.  pconnect takes care of the first problem, but exacerbates the
> second.

    Moot point with fastcgi.
    Unused PHP processes are removed in times of low traffic, along with
their connections.

> P.s. dont' think I'm dogging PHP, cause I'm not.  I use it all the
> time, and it's really great for simple small scripts that need to be
> done NOW and need to be lightweight.  I even use pconnect a bit.  But
> my machine is set for 50 or fewer apache children and 150 postgresql
> connects, and I only use pconnect on small, lightweight things that
> need to zoom.  Everything else gets regular old connect.

    Very true for mod_php, wrong for fastcgi : you can get extreme
performance with pconnect and a PHP code cache like turck/mm or
eaccelerator, down to 1 ms per page.

    Especially if you use PEAR, which is very bloated, you need a code cache
to avoid parsing it on every page.
    On the previously mentioned website it cut the page time from 50 ms to
2 ms on some pages, because there were a lot of includes.








