Re: how does pg handle concurrent queries and same queries - Mailing list pgsql-performance

From Matthew Wakeling
Subject Re: how does pg handle concurrent queries and same queries
Date
Msg-id Pine.LNX.4.64.0807281219300.5954@aragorn.flymine.org
In response to Re: how does pg handle concurrent queries and same queries  (Faludi Gábor <gfaludi@fits.hu>)
List pgsql-performance
On Mon, 28 Jul 2008, Faludi Gábor wrote:
> EXPLAIN ANALYZE SELECT DISTINCT letoltes.cid, count(letoltes.cid)  AS
> elofordulas FROM letoltes GROUP BY cid ORDER BY elofordulas DESC LIMIT 5;
>                               QUERY PLAN
> ------------------------------------------------------------------------------
> Limit  (cost=9255.05..9255.09 rows=5 width=4) (actual time=604.734..604.743 rows=5 loops=1)
>   ->  Unique  (cost=9255.05..9257.26 rows=294 width=4) (actual time=604.732..604.737 rows=5 loops=1)
>         ->  Sort  (cost=9255.05..9255.79 rows=294 width=4) (actual time=604.730..604.732 rows=5 loops=1)
>               Sort Key: count(cid), cid
>               ->  HashAggregate  (cost=9242.26..9243.00 rows=294 width=4) (actual time=604.109..604.417 rows=373 loops=1)
>                     ->  Seq Scan on letoltes  (cost=0.00..6920.51 rows=464351 width=4) (actual time=0.022..281.413 rows=464351 loops=1)
> Total runtime: 604.811 ms

So this query does a sequential scan of the whole letoltes table every
time it runs. You may get some improvement by creating an index on cid and
clustering the table on that index, but probably not much.
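
For reference, the index-and-cluster step could look roughly like the
sketch below. The index name letoltes_cid_idx is made up for illustration.
The CLUSTER spelling shown is the pre-8.3 one, which later releases still
accept for compatibility; 8.3 and later prefer the form in the comment.

```sql
-- Hypothetical index name; any name will do.
CREATE INDEX letoltes_cid_idx ON letoltes (cid);

-- Physically reorder the table by cid. This takes an exclusive lock on
-- the table while it runs, and the ordering is not maintained for rows
-- added afterwards, so it has to be repeated from time to time.
CLUSTER letoltes_cid_idx ON letoltes;
-- 8.3 and later: CLUSTER letoltes USING letoltes_cid_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE letoltes;
```

Since the query still has to count every row in the table, the gain from
this is likely to be small, as noted above.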

Moving to Postgres 8.3 will probably help a lot, as it will allow multiple
queries to use the same sequential scan in parallel. That's assuming the
entire table isn't in cache.
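
The 8.3 feature referred to here is synchronized sequential scans. A quick
way to check whether it could apply after an upgrade is sketched below,
assuming the default configuration: the feature is governed by the
synchronize_seqscans setting (on by default) and, as far as I know, only
kicks in for tables larger than a quarter of shared_buffers.

```sql
-- Which server version is this?
SHOW server_version;

-- 8.3 and later only: is scan sharing enabled?
SHOW synchronize_seqscans;

-- Compare the table's on-disk size against shared_buffers.
SHOW shared_buffers;
SELECT pg_size_pretty(pg_relation_size('letoltes'));
```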

Another solution would be to create an additional table that contains the
results of this query, and keep it up to date using triggers on the
original table. Then query that table instead.
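
A minimal sketch of such a trigger-maintained table follows. The names
letoltes_counts, letoltes_counts_maintain and letoltes_counts_trg are
invented for this example, and cid is assumed to be an integer (the plan
shows width=4). It is written to avoid syntax newer than 8.x, so it uses
UPDATE-then-INSERT rather than an upsert statement.

```sql
-- Summary table: one row per cid with its running count.
CREATE TABLE letoltes_counts (
    cid         integer PRIMARY KEY,
    elofordulas bigint  NOT NULL
);

-- Seed it from the existing data. count(cid) ignores NULLs, so rows
-- with a NULL cid are skipped here as well.
INSERT INTO letoltes_counts (cid, elofordulas)
SELECT cid, count(*) FROM letoltes WHERE cid IS NOT NULL GROUP BY cid;

CREATE OR REPLACE FUNCTION letoltes_counts_maintain() RETURNS trigger AS $$
BEGIN
    -- A row is leaving its old cid (deleted or updated).
    IF TG_OP = 'DELETE' OR TG_OP = 'UPDATE' THEN
        UPDATE letoltes_counts
           SET elofordulas = elofordulas - 1
         WHERE cid = OLD.cid;
    END IF;

    -- A row is arriving at its new cid (inserted or updated).
    IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
        IF NEW.cid IS NOT NULL THEN
            UPDATE letoltes_counts
               SET elofordulas = elofordulas + 1
             WHERE cid = NEW.cid;
            IF NOT FOUND THEN
                INSERT INTO letoltes_counts (cid, elofordulas)
                VALUES (NEW.cid, 1);
            END IF;
        END IF;
    END IF;

    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER letoltes_counts_trg
AFTER INSERT OR UPDATE OR DELETE ON letoltes
FOR EACH ROW EXECUTE PROCEDURE letoltes_counts_maintain();

-- The expensive query then becomes a lookup on a few hundred rows.
SELECT cid, elofordulas
  FROM letoltes_counts
 ORDER BY elofordulas DESC
 LIMIT 5;
```

Two caveats with this sketch: two sessions inserting the first row for the
same new cid at the same moment can collide on the primary key, and
TRUNCATE on letoltes is not tracked. Every write to letoltes also pays for
an extra update, so this trades write speed for read speed.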

However, probably the best solution is to examine the problem and work out
if you can alter the application to make it avoid doing such an expensive
query so often. Perhaps it could cache the results.

Matthew

--
Psychotics are consistently inconsistent. The essence of sanity is to
be inconsistently inconsistent.
