Re: Performance Bottleneck - Mailing list pgsql-performance

From: Alex Hayward
Subject: Re: Performance Bottleneck
Date:
Msg-id: Pine.LNX.4.58.0408101521200.423@sphinx.mythic-beasts.com
In response to: Re: Performance Bottleneck ("Matt Clark" <matt@ymogen.net>)
Responses: Re: Performance Bottleneck
List: pgsql-performance
On Sun, 8 Aug 2004, Matt Clark wrote:

> > And this is exactly where the pgpool advantage lies. Especially with the
> > TPC-W, the Apache is serving a mix of PHP (or whatever CGI technique is
> > used) and static content like images. Since the 200+ Apache kids serve
> > any of that content at random, and the emulated browsers very much
> > encourage it to ramp up MaxClients children by using up to 4 concurrent
> > image connections, one does end up with MaxClients DB connections that
> > are all used relatively infrequently. In contrast to that, the real
> > pgpool causes fewer, more active DB connections, which is better for
> > performance.
>
> There are two well-worn and very mature techniques for dealing with the
> issue of web apps using one DB connection per apache process, both of
> which work extremely well and attack the issue at its source.
>
> 1) Use a front-end caching proxy like Squid as an accelerator. Static
> content will be served by the accelerator 99% of the time. Additionally,
> large pages can be served immediately to the accelerator by Apache, which
> can then go on to serve another request without waiting for the end
> user's dial-up connection to pull the data down. Massive speedup, fewer
> apache processes needed.

Squid also takes away the work of doing SSL (presuming you're running it
on a different machine). Unfortunately it doesn't support HTTP/1.1, which
means that most generated pages (those that don't set Content-Length) end
up forcing Squid to close and then reopen the connection to the web
server.

Because you no longer need to worry about keeping Apache processes around
to dribble data to people on the wrong end of modems, you can reduce
MaxClients quite a bit (to, say, 10 or 20 per web server). This keeps the
number of PostgreSQL connections down.
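For what it's worth, the accelerator setup described above might look
roughly like this (a sketch only, assuming Squid 2.x-era accelerator
directives and the Apache prefork MPM; hostnames and values are
illustrative, not from this thread):

```
# squid.conf (Squid 2.x accelerator mode) - Squid listens on port 80
# and forwards cache misses to the Apache backend.
http_port 80
httpd_accel_host backend.example.com   # illustrative backend hostname
httpd_accel_port 8080
httpd_accel_single_host on

# httpd.conf on the backend - with Squid absorbing static hits and
# slow clients, MaxClients (and hence DB connections) can stay small.
<IfModule prefork.c>
    MaxClients       20
    StartServers      5
    MinSpareServers   5
    MaxSpareServers  10
</IfModule>
```

The exact directive names vary between Squid versions (Squid 3 replaced
the httpd_accel_* family), so treat this as the shape of the config
rather than something to paste in.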
I'd guess that above some point you're going to reduce performance by
increasing MaxClients and running queries in parallel, rather than
queueing the requests and running them serially.

I've also had some problems when Squid had a large number of connections
open (several thousand), though that may have been because of my
half_closed_clients setting. Squid 3 coped a lot better when I tried it
(quite a few months ago now, using FreeBSD and the special kqueue system
call) but crashed under some (admittedly synthetic) conditions.

> I'm sure pgpool and the like have their place, but being band-aids for
> poorly configured websites probably isn't the best use for them.

You still have periods of time when the web servers are busy using their
CPUs to generate HTML rather than waiting for database queries. This is
especially true if you cache a lot of data somewhere on the web servers
themselves (which, in my experience, reduces the database load a great
deal). If you REALLY need to reduce the number of connections (because
you have a large number of web servers doing a lot of computation, say)
then pgpool might still be useful.
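The core idea behind pooling here - many web-server workers sharing a few
busy database connections instead of each holding an idle one - can be
sketched in a few lines. This is a toy illustration (not pgpool's actual
implementation; FakeConnection and the sizes are made up) using a blocking
queue as the pool:

```python
import queue
import threading

POOL_SIZE = 3        # few, busy "DB connections"
NUM_WORKERS = 20     # many Apache-like workers

class FakeConnection:
    """Stand-in for a real DB connection (hypothetical, for illustration)."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

    def query(self, sql):
        return f"conn{self.conn_id}: {sql}"

# The pool: workers block on get() until a connection is free, so at most
# POOL_SIZE queries run "against the database" at once.
pool = queue.Queue()
for i in range(POOL_SIZE):
    pool.put(FakeConnection(i))

results = []
results_lock = threading.Lock()

def worker(n):
    conn = pool.get()          # wait for a pooled connection
    try:
        row = conn.query(f"SELECT {n}")
        with results_lock:
            results.append(row)
    finally:
        pool.put(conn)         # return it for reuse

threads = [threading.Thread(target=worker, args=(n,)) for n in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))                                   # 20 requests served
print(len({r.split(":")[0] for r in results}))        # ...by at most 3 connections
```

All twenty workers complete, but the backend only ever sees three
connections - which is the "fewer, more active DB connections" effect
described above.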