Bryan Field-Elliot <bryan_lists@netmeme.org> writes:
> We have this simple query:
>
> select status, count(*) from customer group by status;
>
> There is already a btree index on status, but, the customer table is
> huge, and this query must be executed very frequently... an "explain" on
> this query shows that it is quite costly (and we notice it runs
> slowly)...
>
> Can someone recommend the best technique to optimize this? We can create
> new indices, we can rewrite this query... But we'd rather not add new
> tables or columns if possible (not just to solve this problem).
Without creating a new table, you have three options, none of which will be
very fast:
1) If the entire table fits in RAM then you could try setting random_page_cost
close enough to 1 to cause the index to be used (see the sketch after this
list for the sort of tweaking I mean).
2) If it doesn't, but there is a reasonably small number of distinct status
values, then you would hope to see a sequential scan and a hash aggregate
being used. I would expect that to be the plan you get with the default
configuration. If not, you might have to increase work_mem (or sort_mem,
depending on your version, I think).
3) If you have a large table with a large set of status values then Postgres
may be forced to sort the entire table. In that case you should experiment
with work_mem/sort_mem to get the largest value you can without inducing
swapping. You could also experiment with putting pgsql_tmp on separate
spindles.
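
For what it's worth, here is a rough sketch of the kind of experimenting I
mean, all in one psql session. The values are purely illustrative, and on
versions before 8.0 the parameter is sort_mem rather than work_mem:

    -- option 1: nudge the planner toward the index if the table is cached
    set random_page_cost = 1.5;

    -- options 2 and 3: give the hash aggregate or sort more memory (value in kB)
    set work_mem = 65536;

    -- then check which plan you actually get
    explain analyze
    select status, count(*) from customer group by status;

These settings only affect the current session, so you can experiment freely
before touching postgresql.conf.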
None of these will be fast enough for OLTP access like a web page. You would
be best off performing this query periodically and stuffing the results in a
table for quick access.
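
Something along these lines, refreshed from cron or after your batch loads
(the summary table name here is just a placeholder):

    -- one-time setup
    create table customer_status_counts as
    select status, count(*) as cnt from customer group by status;

    -- periodic refresh
    begin;
    delete from customer_status_counts;
    insert into customer_status_counts
    select status, count(*) from customer group by status;
    commit;

The web page then reads customer_status_counts, which stays tiny no matter
how big customer gets.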
--
greg