Re: How to optimize select count(*)..group by? - Mailing list pgsql-general

From Greg Stark
Subject Re: How to optimize select count(*)..group by?
Date
Msg-id 87fyty977n.fsf@stark.xeocode.com
Whole thread Raw
In response to How to optimize select count(*)..group by?  (Bryan Field-Elliot <bryan_lists@netmeme.org>)
List pgsql-general
Bryan Field-Elliot <bryan_lists@netmeme.org> writes:

> We have this simple query:
>
> select status, count(*) from customer group by status;
>
> There is already a btree index on status, but, the customer table is
> huge, and this query must be executed very frequently... an "explain" on
> this query shows that it is quite costly (and we notice it runs
> slowly)...
>
> Can someone recommend the best technique to optimize this? We can create
> new indices, we can re-write this query.. But we'd rather not add new
> tables or columns if possible (not just to solve this problem).

Without creating a new table you have three options, none of which will be
very fast:

1) If the entire table fits in ram then you could try setting random_page_cost
   close enough to 1 to cause the index to be used.

2) If it doesn't but there are a reasonably small number of distinct values of
   status you would hope to see a sequential scan and a hash aggregate being
   used. I would expect this to be what you would see with the default
   configuration. If not you might have to increase work_mem (or sort_mem
   depending on your version I think).

3) If you have a large table with a large set of status values then Postgres
   may be compelled to resort the entire table. In which case you should
   experiment with work_mem/sort_mem to get the largest value you can without
   inducing swapping. You could also experiment with putting pgsql_tmp on
   separate spindles.

None of these will be fast enough for OLTP access like a web page. You would
be best off performing this query periodically and stuffing the results in a
table for quick access.



--
greg

pgsql-general by date:

Previous
From: Jaime Casanova
Date:
Subject: Fwd: Trigger ad mutli database
Next
From: Scott Marlowe
Date:
Subject: Re: How to optimize select count(*)..group by?