Re: Strange count(*) implementation? - Mailing list pgsql-general

From Richard Huxton
Subject Re: Strange count(*) implementation?
Date
Msg-id 417E1093.1080401@archonet.com
Whole thread Raw
In response to Strange count(*) implementation?  (Henk Ernst Blok <h.e.blok@utwente.nl>)
List pgsql-general
Henk Ernst Blok wrote:
> Hi Posgres users/developers,
>
> Can anyone explain why PosgreSQL (version 7.4.5 on Linux) does a full
> table scan to compute a count(*) on a base table after a vacuum analyze
> has been done with no following updates that might have outdated any
> statistics. Strangly the explain command does give the correct number of
> tuples instantaniously from the catalog, as one would expect. Still the
> optimizer thinks it needs a full table scan to do count.

> The consequence of this seemingly odd count implementation is a very
> very slow count.

To put it simply, count() doesn't look at the statistics because most of
the time they are out of date. In any case, they aren't useful for any
query with a WHERE clause.

The next most obvious choice is to use the primary-key index rather than
scanning the table. However, MVCC means that we can have multiple views
of the table, with some backends seeing a different number of rows than
others. So - either we need to store multiple index-entry versions as
well as multiple row versions or you need to check the actual row in
these cases. PostgreSQL does the second, which results in the full scan
which you see.

There is plenty of discussion of this (and also max()/min() aggregate
functions) in the mailing list archives.

HTH
--
   Richard Huxton
   Archonet Ltd

pgsql-general by date:

Previous
From: Richard Huxton
Date:
Subject: Re: what could cause inserts getting queued up and db locking??
Next
From: Joel
Date:
Subject: compatibilityissues from 7.1 to 7.4