Re: how does pg handle concurrent queries and same queries - Mailing list pgsql-performance

From Faludi Gábor
Subject Re: how does pg handle concurrent queries and same queries
Date
Msg-id 001101c8f087$037f1350$0a7d39f0$@hu
Whole thread Raw
In response to Re: how does pg handle concurrent queries and same queries  ("Claus Guttesen" <kometen@gmail.com>)
Responses Re: how does pg handle concurrent queries and same queries
List pgsql-performance
Hi,

here is what the original query was which was obviously nonsense :
EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, s.elofordulas FROM letoltes
INNER JOIN (select letoltes.cid, count(letoltes.cid) AS elofordulas FROM
letoltes GROUP BY cid) s ON s.cid=letoltes.cid ORDER BY s.elofordulas DESC
LIMIT 5;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
------------
 Limit  (cost=73945.35..73945.65 rows=5 width=12) (actual
time=4191.396..4351.966 rows=5 loops=1)
   ->  Unique  (cost=73945.35..77427.99 rows=58800 width=12) (actual
time=4191.390..4351.956 rows=5 loops=1)
         ->  Sort  (cost=73945.35..75106.23 rows=464351 width=12) (actual
time=4191.386..4283.545 rows=175944 loops=1)
               Sort Key: s.elofordulas, letoltes.cid
               ->  Merge Join  (cost=9257.99..30238.65 rows=464351 width=12)
(actual time=652.535..2920.304 rows=464351 loops=1)
                     Merge Cond: ("outer".cid = "inner".cid)
                     ->  Index Scan using idx_letoltes_cid on letoltes
(cost=0.00..12854.51 rows=464351 width=4) (actual time=0.084..1270.588
rows=464351 loops=1)
                     ->  Sort  (cost=9257.99..9258.73 rows=294 width=12)
(actual time=652.434..810.941 rows=464176 loops=1)
                           Sort Key: s.cid
                           ->  Subquery Scan s  (cost=9242.26..9245.94
rows=294 width=12) (actual time=651.343..652.028 rows=373 loops=1)
                                 ->  HashAggregate  (cost=9242.26..9243.00
rows=294 width=4) (actual time=651.339..651.661 rows=373 loops=1)
                                       ->  Seq Scan on letoltes
(cost=0.00..6920.51 rows=464351 width=4) (actual time=0.014..307.469
rows=464351 loops=1)
 Total runtime: 4708.434 ms
(13 sor)

However after fixing the query this is 1/4 th of the time  but still blocks
the site :


EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, count(letoltes.cid)  AS
elofordulas FROM letoltes GROUP BY cid ORDER BY elofordulas DESC LIMIT 5;
                                                               QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------------------------
 Limit  (cost=9255.05..9255.09 rows=5 width=4) (actual time=604.734..604.743
rows=5 loops=1)
   ->  Unique  (cost=9255.05..9257.26 rows=294 width=4) (actual
time=604.732..604.737 rows=5 loops=1)
         ->  Sort  (cost=9255.05..9255.79 rows=294 width=4) (actual
time=604.730..604.732 rows=5 loops=1)
               Sort Key: count(cid), cid
               ->  HashAggregate  (cost=9242.26..9243.00 rows=294 width=4)
(actual time=604.109..604.417 rows=373 loops=1)
                     ->  Seq Scan on letoltes  (cost=0.00..6920.51
rows=464351 width=4) (actual time=0.022..281.413 rows=464351 loops=1)
 Total runtime: 604.811 ms


here is the table :
\d letoltes
                     TĂĄbla "public.letoltes"
 Oszlop | TĂ­pus  |                  MĂłdosĂ­tĂł
--------+---------+------------------------------------------------
 id     | integer | not null default nextval('letoltes_seq'::text)
 cid    | integer |
Indexes:
    "idx_letoltes_cid" btree (cid)
    "idx_letoltes_id" btree (id)

select count(1) from letoltes;
 count
--------
 464351


VACUM ANALYZE runs overnight every day.

thanks,
Gabor

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Claus Guttesen
Sent: Monday, July 28, 2008 8:56 AM
To: Faludi Gábor
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] how does pg handle concurrent queries and same
queries

> I have taken over the maintenance of a server farm , recently. 2 webserver
> on  db server. They are quite powerful 2 processor xeon w/ 6Gig of ram .
>
> Couple of days ago we had a serious performance hit and the db server (pg.
> v7.4) was overloaded w/ something in a way that operating system was
almost
> not able to respond or in cases it did not.
>
> After some analysis i suspect that there is a query that takes up to 1
> second and that is the cause. Upon  each page loading this query fires and
> takes the one second and blocks the page to load completly . The load was
> roughly ~300 connections in one minute .
>
> So here are my questions :
>
> .         Why does the second and the later queries take the whole on
second
> if the dataset is the same . Shouldn't PG realise that the query is the
same
> so i give the user the same resultset ?
>
> .         How do I know if one query blocks the other ?
>
> .         Is there a way to log the long running queries in 7.4 ? If not
is
> it available in any newer version ?

Can you post the queries? Can you provide an 'analyze explain'? Do you
perform a 'vacuum analyze' on a regular basis?

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

No virus found in this incoming message.
Checked by AVG - http://www.avg.com
Version: 8.0.138 / Virus Database: 270.5.6/1576 - Release Date: 2008.07.27.
16:16


pgsql-performance by date:

Previous
From: "Claus Guttesen"
Date:
Subject: Re: how does pg handle concurrent queries and same queries
Next
From: Craig Ringer
Date:
Subject: Re: how does pg handle concurrent queries and same queries