Re: How to handle missing pgsql.connect_timeout - Mailing list pgsql-php
From: Marco Colombo
Subject: Re: How to handle missing pgsql.connect_timeout
Msg-id: Pine.LNX.4.61.0502081224510.5203@Megathlon.ESI
In response to: Re: How to handle missing pgsql.connect_timeout (Christopher Murtagh <christopher.murtagh@mcgill.ca>)
List: pgsql-php
On Wed, 2 Feb 2005, Christopher Murtagh wrote:

> On Wed, 2005-02-02 at 09:49 +0100, Janning Vygen wrote:
>> It's not broken anymore! And it speeds up your website.
>
> Well it might not be broken anymore, but I haven't seen any evidence of
> a speed increase (quite the opposite in fact). What we did instead was
> write a very small abstraction layer so that any page request would only
> use 1 database connection. We're running on dual 2.8 GHz Xeons for our
> web server and db machine and taking on over 500,000 hits per day. Some
> of our pages are very data intensive (http://www.mcgill.ca/calendar/ for
> example).
>
>> At the moment it waits "forever" (until max_execution_time I guess) and
>> you have no chance to show an error message like: DB is overloaded,
>> please try again.
>
> Maybe pg_pconnect() isn't all it's cracked up to be? Anyone else here
> have good experiences with it?

You have to evaluate your setup carefully. I think there's little hope
of planning this in advance; you'll have to monitor your system at
runtime. I've been using persistent connections a lot.

First, think about what kind of problem you're trying to solve:
decreasing the connection overhead, so that only one connection is made
to the DB per page. Now consider the following:

- PostgreSQL forks a backend process for each connection: this is almost
  universally considered a huge overhead, but on modern operating
  systems that is no longer the case.

- Network latency: if the PG server runs on a different host, it may be
  an issue. If it runs on the same host and is accessed via Unix
  sockets, latency is negligible.

- In order to implement connect semantics, PHP has to 'reset' the
  session, which is a kind of overhead per se. Consider the same Apache
  backend serving two different PG users: the connection may be the
  same, but all local and remote state has to be removed.

- If your pages use multiple db connections, you'll get many PG backends
  per Apache backend.
- Each PG backend uses up some resources (RAM mostly): there are a
  number of 'knobs' you can turn to control that, but of course, unless
  your RAM is infinite, you may have to _reduce_ them in order to
  increase the number of concurrent backends.

To put it very simply (maybe too simply), it's a CPU vs. RAM tradeoff:
using persistent connections turns a little (very OS- and
setup-dependent) CPU overhead into a not-so-little RAM overhead on the
PG server side.

I wasn't able to find a definitive answer. I have many concurrent
accesses (say 200-300), with simple (read and write) queries, but on a
large data set. I have two Apache frontends and one PG server (same
LAN). I wasn't able to choose between pconnect or connect.

At times, some PHP programmer kills the DB, and I have to teach him not
to perform 300 different queries _per page_ with a PHP for loop, and to
learn how to use table joins instead. I can tell you that when that
happens, you'll forget about pconnect vs. connect, and thank the PG
developers for writing such a robust application (I've seen load
averages up to 100, and still PG was doing its job, even if very
slowly). :-)

.TM.
--
 ____/ ____/ /  / /  /    Marco Colombo
  ___/  ___   / /         Technical Manager
   /   /   /              ESI s.r.l.
_____/ _____/ _/          Colombo@ESI.it
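P.S. A minimal sketch of the "one connection per page" abstraction layer Christopher describes, combined with a workaround for the thread's subject: libpq parses connect_timeout directly from the conninfo string, so you can bound the connection attempt even where the pgsql.connect_timeout ini setting is missing. Function name and the host/dbname/user values are made up for illustration.

```php
<?php
// Hypothetical per-request connection cache: every caller shares one
// handle, so a page opens at most one connection no matter how many
// queries it runs.
function db_handle()
{
    static $conn = null;
    if ($conn === null) {
        // connect_timeout is handled by libpq itself, so the attempt
        // gives up after 2 seconds even without pgsql.connect_timeout.
        $conn = @pg_connect('host=db dbname=app user=www connect_timeout=2');
        if (!$conn) {
            // Finally a chance to show the error Janning wanted:
            die('DB is overloaded, please try again.');
        }
    }
    return $conn;
}

// Usage: every query in the page goes through the same handle, e.g.
// $res = pg_query(db_handle(), 'SELECT 1');
```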
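P.P.S. To make the CPU vs. RAM tradeoff concrete, here is a hedged postgresql.conf fragment (knob names as of PostgreSQL 8.0; 7.x called work_mem sort_mem). The values are purely illustrative, not recommendations; the point is that with persistent connections the backend count is bounded by the sum of your Apache MaxClients, so per-backend memory is the knob you may have to _reduce_.

```
# Illustrative values only -- size these from runtime monitoring.
max_connections = 300    # >= sum of MaxClients across all web frontends
shared_buffers  = 8192   # shared pool, paid once, not per backend
work_mem        = 4096   # kB per sort/hash, *per backend*: the one to cut
```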