Re: COUNT and Performance ... - Mailing list pgsql-hackers

From Tom Lane
Subject Re: COUNT and Performance ...
Date
Msg-id 12357.1044209054@sss.pgh.pa.us
In response to COUNT and Performance ...  (Hans-Jürgen Schönig <postgres@cybertec.at>)
Responses Re: COUNT and Performance ...  (Neil Conway <neilc@samurai.com>)
List pgsql-hackers
Hans-Jürgen Schönig <postgres@cybertec.at> writes:
> In special cases there can be another way to avoid seq scans:
> [ use pgstattuple() ]

But pgstattuple does do a sequential scan of the table.  You avoid a lot
of the executor's tuple-pushing and plan-node-traversing machinery that
way, but the I/O requirement is going to be exactly the same.

> If people want to count ALL rows of a table, the contrib stuff is pretty
> useful. It seems to be transaction safe.

Not entirely.  pgstattuple uses HeapTupleSatisfiesNow(), which means you
get a count of tuples that are committed good in terms of the effects of
transactions committed up to the instant each tuple is examined.  This
is in general different from what count(*) would tell you, because
pgstattuple ignores snapshotting, whereas count(*) checks every tuple
against a single snapshot.  It'd be quite unrepeatable too, in the face of
active concurrent changes --- it's very possible for pgstattuple to
count a single row twice or not at all, if it's being concurrently
updated and the other transaction commits between the times pgstattuple
sees the old and new versions of the row.
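
The double-count and missed-row cases can be sketched with a toy model.
This is not PostgreSQL code: TupleVersion, visible(), scan_now() and
scan_snapshot() are hypothetical names, and the visibility rule is a
drastic simplification of HeapTupleSatisfiesNow() (no hint bits, no
own-transaction handling).  It assumes only that each tuple version
records a creating xid (xmin) and an optional deleting xid (xmax), and
that an updating transaction commits partway through the scan.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class TupleVersion:
    row: str                    # logical row this version belongs to
    xmin: int                   # xid that created this version
    xmax: Optional[int] = None  # xid that replaced/deleted it, if any


def visible(t: TupleVersion, committed: Set[int]) -> bool:
    """Toy rule: the creator committed and no committed xid removed it."""
    if t.xmin not in committed:
        return False
    return t.xmax is None or t.xmax not in committed


def scan_now(heap: List[TupleVersion], committed: FrozenSet[int],
             commit_xid: int, commit_before_index: int) -> int:
    """Check each tuple against whatever has committed at that instant.

    commit_xid commits just before the tuple at commit_before_index
    is examined, i.e. in the middle of the scan.
    """
    live = set(committed)
    count = 0
    for i, t in enumerate(heap):
        if i == commit_before_index:
            live.add(commit_xid)  # concurrent transaction commits here
        if visible(t, live):
            count += 1
    return count


def scan_snapshot(heap: List[TupleVersion], committed: FrozenSet[int]) -> int:
    """Check every tuple against the commit state frozen at scan start."""
    snapshot = set(committed)
    return sum(1 for t in heap if visible(t, snapshot))


UPDATER = 100            # xid of the concurrent UPDATE on row "A"
BASE = frozenset({1})    # xids already committed when the scan starts

# Physical order: old version of A, then B, then the new version of A.
HEAP_OLD_FIRST = [TupleVersion("A", 1, UPDATER),
                  TupleVersion("B", 1),
                  TupleVersion("A", UPDATER)]

# Physical order: new version of A landed in earlier free space.
HEAP_NEW_FIRST = [TupleVersion("A", UPDATER),
                  TupleVersion("B", 1),
                  TupleVersion("A", 1, UPDATER)]

if __name__ == "__main__":
    for name, heap in (("old first", HEAP_OLD_FIRST),
                       ("new first", HEAP_NEW_FIRST)):
        print(name,
              "per-tuple check:", scan_now(heap, BASE, UPDATER, 1),
              "snapshot:", scan_snapshot(heap, BASE))
```

In this model the table holds two logical rows.  The per-tuple check
reports 3 when the old version is scanned before the commit and the new
one after it, and 1 when the physical order is reversed; the snapshot
scan reports 2 in both layouts.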

> The performance boost is great (PostgreSQL 7.3, RedHat, 166MHz):

I think your test case is small enough that the whole table is resident
in memory, so this measurement only accounts for CPU time per tuple and
not any I/O.  Given the small size of pgstattuple's per-tuple loop, the
speed differential is not too surprising --- but it won't scale up to
larger tables.
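
A back-of-the-envelope cost model makes the same point.  Everything
here is an assumption for illustration: the function names are made up,
the linear model (CPU cost per tuple plus I/O cost per tuple when the
table is not cached) is a simplification, and the cost figures are
arbitrary units, not measurements of count(*) or pgstattuple.

```python
def scan_time(n_tuples: int, cpu_per_tuple: float,
              io_per_tuple: float, cached: bool) -> float:
    """Total scan cost: per-tuple CPU, plus I/O unless the table is cached."""
    io = 0.0 if cached else n_tuples * io_per_tuple
    return n_tuples * cpu_per_tuple + io


def speedup(n_tuples: int, cpu_slow: float, cpu_fast: float,
            io_per_tuple: float, cached: bool) -> float:
    """Ratio of the slow path's cost to the fast path's cost."""
    slow = scan_time(n_tuples, cpu_slow, io_per_tuple, cached)
    fast = scan_time(n_tuples, cpu_fast, io_per_tuple, cached)
    return slow / fast


# Illustrative units only (assumed, not measured): the executor path
# spends 1.0 CPU units per tuple, the tight loop 0.2, and fetching a
# tuple from disk costs 10.0.
CPU_SLOW, CPU_FAST, IO = 1.0, 0.2, 10.0

if __name__ == "__main__":
    n = 1000
    print("table in memory:", speedup(n, CPU_SLOW, CPU_FAST, IO, cached=True))
    print("table on disk:  ", speedup(n, CPU_SLOW, CPU_FAST, IO, cached=False))
```

With these made-up numbers the tight loop is about five times faster
while the table sits in memory, but under 10% faster once every tuple
has to be read from disk, because both paths pay the same I/O cost.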

Sometime it would be interesting to profile count(*) on large tables
and see exactly where the CPU time goes.  It might be possible to shave
off some of the executor overhead ...
        regards, tom lane

