Home > mailing lists

Re: Much Ado About COUNT(*) - Mailing list pgsql-hackers

From	Merlin Moncure
Subject	Re: Much Ado About COUNT(*)
Date	January 12, 2005 18:54:52
Msg-id	6EE64EF3AB31D5448D0007DD34EEB3412A75AA@Herge.rcsinc.local Whole thread Raw
In response to	Much Ado About COUNT(*) ("Jonah H. Harris" <jharris@tvi.edu>)
List	pgsql-hackers

Tree view

> Tom, Bruce, and others involved in this recurring TODO discussion...
>
> First, let me start by saying that I understand this has been
discussed
> many times before; however, I'd like to see what the current state of
> affairs is regarding the possibility of using a unique index scan to
> speed up the COUNT aggregate.

To sum up:
1.  There are good technical reasons why not to do this.  The pg
aggregate system is very elegant...not worth compromising it for a
specific case.
2.  postgresql can do many things faster than oracle.  If you prefer the
way oracle behaves, use oracle.
3.  workaround #1: just run analyze once in a while (you should do that
anyways) and query pg_Class for the #tuples in a relation.
4.  workaround #2: rig up a materialized view and query that.  This will
be faster than what oracle does, btw, at the price of some coherency.
5.  understand that count(*) from t, although frequently used, is of
dubious value in the general sense.  Sooner or later someone will
optimize this, but in light of the currently available workarounds it
doesn't seem that important.
6.  for large tables, you can get a pretty accurate count by doing:
select count(*) * 10 from t where random() > .9;
on my setup, this shaved about 15% off of the counting time...YMMV.

Merlin

pgsql-hackers by date:

From: Reinhard Max
Date: 12 January 2005, 18:37:52
Subject: segfault caused by heimdal (was: SUSE port)

From: "Jonah H. Harris"
Date: 12 January 2005, 18:56:03
Subject: Re: Much Ado About COUNT(*)

Re: Much Ado About COUNT(*) - Mailing list pgsql-hackers

Previous

Next