Re: Much Ado About COUNT(*) - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Much Ado About COUNT(*)
Date
Msg-id 6EE64EF3AB31D5448D0007DD34EEB3412A75AA@Herge.rcsinc.local
Whole thread Raw
In response to Much Ado About COUNT(*)  ("Jonah H. Harris" <jharris@tvi.edu>)
List pgsql-hackers
> Tom, Bruce, and others involved in this recurring TODO discussion...
>
> First, let me start by saying that I understand this has been
discussed
> many times before; however, I'd like to see what the current state of
> affairs is regarding the possibility of using a unique index scan to
> speed up the COUNT aggregate.

To sum up:
1.  There are good technical reasons why not to do this.  The pg
aggregate system is very elegant...not worth compromising it for a
specific case.
2.  postgresql can do many things faster than oracle.  If you prefer the
way oracle behaves, use oracle.
3.  workaround #1: just run analyze once in a while (you should do that
anyways) and query pg_Class for the #tuples in a relation.
4.  workaround #2: rig up a materialized view and query that.  This will
be faster than what oracle does, btw, at the price of some coherency.
5.  understand that count(*) from t, although frequently used, is of
dubious value in the general sense.  Sooner or later someone will
optimize this, but in light of the currently available workarounds it
doesn't seem that important.
6.  for large tables, you can get a pretty accurate count by doing:
select count(*) * 10 from t where random() > .9;
on my setup, this shaved about 15% off of the counting time...YMMV.

Merlin



pgsql-hackers by date:

Previous
From: Reinhard Max
Date:
Subject: segfault caused by heimdal (was: SUSE port)
Next
From: "Jonah H. Harris"
Date:
Subject: Re: Much Ado About COUNT(*)