Thread: how does pg handle concurrent queries and same queries
Hi All,
I have taken over the maintenance of a server farm , recently. 2 webserver on db server. They are quite powerful 2 processor xeon w/ 6Gig of ram .
Couple of days ago we had a serious performance hit and the db server (pg. v7.4) was overloaded w/ something in a way that operating system was almost not able to respond or in cases it did not.
After some analysis i suspect that there is a query that takes up to 1 second and that is the cause. Upon each page loading this query fires and takes the one second and blocks the page to load completly . The load was roughly ~300 connections in one minute .
So here are my questions :
· Why does the second and the later queries take the whole on second if the dataset is the same . Shouldn’t PG realise that the query is the same so i give the user the same resultset ?
· How do I know if one query blocks the other ?
· Is there a way to log the long running queries in 7.4 ? If not is it available in any newer version ?
thanks for your help !
Üdvözlettel/kind regards,
Faludi, Gábor
FITS Magyarország Kft.
Tel.:+36 30 4945862
Email: falu@fits.hu
Ingyenes videó tanfolyamok(Excel,Access,Word) : http://www.fits.hu/trainings
> I have taken over the maintenance of a server farm , recently. 2 webserver > on db server. They are quite powerful 2 processor xeon w/ 6Gig of ram . > > Couple of days ago we had a serious performance hit and the db server (pg. > v7.4) was overloaded w/ something in a way that operating system was almost > not able to respond or in cases it did not. > > After some analysis i suspect that there is a query that takes up to 1 > second and that is the cause. Upon each page loading this query fires and > takes the one second and blocks the page to load completly . The load was > roughly ~300 connections in one minute . > > So here are my questions : > > · Why does the second and the later queries take the whole on second > if the dataset is the same . Shouldn't PG realise that the query is the same > so i give the user the same resultset ? > > · How do I know if one query blocks the other ? > > · Is there a way to log the long running queries in 7.4 ? If not is > it available in any newer version ? Can you post the queries? Can you provide an 'analyze explain'? Do you perform a 'vacuum analyze' on a regular basis? -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare
Hi, here is what the original query was which was obviously nonsense : EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, s.elofordulas FROM letoltes INNER JOIN (select letoltes.cid, count(letoltes.cid) AS elofordulas FROM letoltes GROUP BY cid) s ON s.cid=letoltes.cid ORDER BY s.elofordulas DESC LIMIT 5; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------------ Limit (cost=73945.35..73945.65 rows=5 width=12) (actual time=4191.396..4351.966 rows=5 loops=1) -> Unique (cost=73945.35..77427.99 rows=58800 width=12) (actual time=4191.390..4351.956 rows=5 loops=1) -> Sort (cost=73945.35..75106.23 rows=464351 width=12) (actual time=4191.386..4283.545 rows=175944 loops=1) Sort Key: s.elofordulas, letoltes.cid -> Merge Join (cost=9257.99..30238.65 rows=464351 width=12) (actual time=652.535..2920.304 rows=464351 loops=1) Merge Cond: ("outer".cid = "inner".cid) -> Index Scan using idx_letoltes_cid on letoltes (cost=0.00..12854.51 rows=464351 width=4) (actual time=0.084..1270.588 rows=464351 loops=1) -> Sort (cost=9257.99..9258.73 rows=294 width=12) (actual time=652.434..810.941 rows=464176 loops=1) Sort Key: s.cid -> Subquery Scan s (cost=9242.26..9245.94 rows=294 width=12) (actual time=651.343..652.028 rows=373 loops=1) -> HashAggregate (cost=9242.26..9243.00 rows=294 width=4) (actual time=651.339..651.661 rows=373 loops=1) -> Seq Scan on letoltes (cost=0.00..6920.51 rows=464351 width=4) (actual time=0.014..307.469 rows=464351 loops=1) Total runtime: 4708.434 ms (13 sor) However after fixing the query this is 1/4 th of the time but still blocks the site : EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, count(letoltes.cid) AS elofordulas FROM letoltes GROUP BY cid ORDER BY elofordulas DESC LIMIT 5; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------------- Limit (cost=9255.05..9255.09 rows=5 width=4) (actual time=604.734..604.743 rows=5 loops=1) -> Unique (cost=9255.05..9257.26 rows=294 width=4) (actual time=604.732..604.737 rows=5 loops=1) -> Sort (cost=9255.05..9255.79 rows=294 width=4) (actual time=604.730..604.732 rows=5 loops=1) Sort Key: count(cid), cid -> HashAggregate (cost=9242.26..9243.00 rows=294 width=4) (actual time=604.109..604.417 rows=373 loops=1) -> Seq Scan on letoltes (cost=0.00..6920.51 rows=464351 width=4) (actual time=0.022..281.413 rows=464351 loops=1) Total runtime: 604.811 ms here is the table : \d letoltes TĂĄbla "public.letoltes" Oszlop | TĂpus | MĂłdosĂtĂł --------+---------+------------------------------------------------ id | integer | not null default nextval('letoltes_seq'::text) cid | integer | Indexes: "idx_letoltes_cid" btree (cid) "idx_letoltes_id" btree (id) select count(1) from letoltes; count -------- 464351 VACUM ANALYZE runs overnight every day. thanks, Gabor -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Claus Guttesen Sent: Monday, July 28, 2008 8:56 AM To: Faludi Gábor Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] how does pg handle concurrent queries and same queries > I have taken over the maintenance of a server farm , recently. 2 webserver > on db server. They are quite powerful 2 processor xeon w/ 6Gig of ram . > > Couple of days ago we had a serious performance hit and the db server (pg. > v7.4) was overloaded w/ something in a way that operating system was almost > not able to respond or in cases it did not. > > After some analysis i suspect that there is a query that takes up to 1 > second and that is the cause. Upon each page loading this query fires and > takes the one second and blocks the page to load completly . The load was > roughly ~300 connections in one minute . > > So here are my questions : > > . Why does the second and the later queries take the whole on second > if the dataset is the same . Shouldn't PG realise that the query is the same > so i give the user the same resultset ? > > . How do I know if one query blocks the other ? > > . Is there a way to log the long running queries in 7.4 ? If not is > it available in any newer version ? Can you post the queries? Can you provide an 'analyze explain'? Do you perform a 'vacuum analyze' on a regular basis? -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance No virus found in this incoming message. Checked by AVG - http://www.avg.com Version: 8.0.138 / Virus Database: 270.5.6/1576 - Release Date: 2008.07.27. 16:16
Faludi Gábor wrote: > . Why does the second and the later queries take the whole on second > if the dataset is the same . Shouldn't PG realise that the query is the same > so i give the user the same resultset ? That would require a result cache. I don't know if Pg even has a query result cache - I don't think so, but I'm not sure. Even if it does, it'd still only be useful if the queries were issued under *exactly* the same conditions - in other words, no writes had been made to the database since the cached query was issued, and the first query had committed before the second began (or was read-only). Additionally, no volatile functions could be called in the query, because their values/effects might be different when the query is executed a second time. That includes triggers, etc. Since 7.4 doesn't do lazy xid allocation it can't really tell that nothing has been changed since the previous query was cached. So, if I'm not missing something here, a query result cache would be useless anyway. > . How do I know if one query blocks the other ? Examination of pg_catalog.pg_locks is certainly a start. It's trickier with lots of short-running queries, though. > . Is there a way to log the long running queries in 7.4 ? If not is > it available in any newer version ? It's certainly available in 8.3, as log_min_duration_statement in postgresql.conf . You can find out if it's in 7.4, and if not what version it was introduced in, by looking through the documentation for versions 7.4 and up. -- Craig Ringer
On Mon, 28 Jul 2008, Faludi Gábor wrote: > EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, count(letoltes.cid) AS > elofordulas FROM letoltes GROUP BY cid ORDER BY elofordulas DESC LIMIT 5; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=9255.05..9255.09 rows=5 width=4) (actual time=604.734..604.743 > rows=5 loops=1) > -> Unique (cost=9255.05..9257.26 rows=294 width=4) (actual time=604.732..604.737 rows=5 loops=1) > -> Sort (cost=9255.05..9255.79 rows=294 width=4) (actual time=604.730..604.732 rows=5 loops=1) > Sort Key: count(cid), cid > -> HashAggregate (cost=9242.26..9243.00 rows=294 width=4) (actual time=604.109..604.417 rows=373 loops=1) > -> Seq Scan on letoltes (cost=0.00..6920.51 rows=464351 width=4) (actual time=0.022..281.413 rows=464351loops=1) > Total runtime: 604.811 ms So this query is doing a sequential scan of the letoltes table for each query. You may get some improvement by creating an index on cid and clustering on that index, but probably not much. Moving to Postgres 8.3 will probably help a lot, as it will allow multiple queries to use the same sequential scan in parallel. That's assuming the entire table isn't in cache. Another solution would be to create an additional table that contains the results of this query, and keep it up to date using triggers on the original table. Then query that table instead. However, probably the best solution is to examine the problem and work out if you can alter the application to make it avoid doing such an expensive query so often. Perhaps it could cache the results. Matthew -- Psychotics are consistently inconsistent. The essence of sanity is to be inconsistently inconsistent.
Craig Ringer wrote: > Faludi Gábor wrote: > > > . Why does the second and the later queries take the whole on second > > if the dataset is the same . Shouldn't PG realise that the query is the same > > so i give the user the same resultset ? > > That would require a result cache. I don't know if Pg even has a query > result cache - I don't think so, but I'm not sure. It doesn't. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Slightly off-topic, but judging from the fact that you were able to "fix" the query, it seems you have some way to modify the application code itself. In that case, I'd try to implement caching (at least for this statement) on the application side, for example with memcached.