Home > mailing lists

Re: Performance Issues with count() - Mailing list pgsql-general

From	Mark kirkwood
Subject	Re: Performance Issues with count()
Date	April 26, 2002 04:44:02
Msg-id	1019810377.1267.89.camel@spikey.slithery.org Whole thread Raw
In response to	Performance Issues with count() ("asdf asdasfa" <sjg@email.com>)
List	pgsql-general

Tree view

I wonder if using a summary table could help you here...

for example :

suppose you have a table - big_table(id,...,interesting_value)

and you want to provide counts of interesting_value quickly.

The plan is to create another table -
summ_table(interesting_value,...,its_count)

and keep it up to date with big_table via triggers.

Then any query of the form :

SELECT interesting_value, count(*)
FROM big_table
WHERE ...
GROUP BY interesting value;

can be answered by :

SELECT interesting_value,its_count
FROM summ_table
WHERE ...;

which is generally *much* faster.
(clearly a complete count is easy to answer quickly too...)

I must say that I have not used this technique in Postgresql (I have
used it in DB2 and Oracle). However this issue is typical for large
databases of all flavours (i.e its too slow to scan and count a lot of
values for each query.... so you do it once and save the results for
future reference).

In addition the rule system in Postgresql *might* be ameniable to
providing a sort of 'query rewrite' to automatically make (some) queries
on big_table go to summ_table instead...(have not tried this... but you
never know until you try)

best wishes

Mark

P.s : the trigger code to keep big_table and summ_table in sync is the
hard bit... but there are no doubt many folks on this list ...incl even
me.. who will happily help you out here.

pgsql-general by date:

From: Francois Suter
Date: 26 April 2002, 02:55:04
Subject: Re: pid gets overwritten in OSX

From: "Joseph Artsimovich"
Date: 26 April 2002, 05:50:16
Subject: bit datatype in 7.2.1

Re: Performance Issues with count() - Mailing list pgsql-general

Previous

Next