Re: Performance of count(*) on large tables vs SQL Server - Mailing list pgsql-performance

From Shridhar Daithankar
Subject Re: Performance of count(*) on large tables vs SQL Server
Date
Msg-id 200502011832.56333.ghodechhap@ghodechhap.net
Whole thread Raw
In response to Performance of count(*) on large tables vs SQL Server  (Andrew Mayo <ajmayo@kohuconsulting.com>)
Responses Re: Performance of count(*) on large tables vs SQL Server
List pgsql-performance
On Tuesday 01 Feb 2005 6:11 pm, Andrew Mayo wrote:
> PG, on the other hand, appears to do a full table scan
> to answer this question, taking nearly 4 seconds to
> process the query.
>
> Doing an ANALYZE on the table and also VACUUM did not
> seem to affect this.
>
> Can PG find a table's row count more efficiently?.
> This is not an unusual practice in commercial
> applications which assume that count(*) with no WHERE
> clause will be a cheap query  - and use it to test if
> a table is empty, for instance. (because for
> Oracle/Sybase/SQL Server, count(*) is cheap).

First of all, such an assumption is no good. It should hit concurrency under
heavy load but I know people do use it.

For the specific question, after a vacuum analyze, you can use

select reltuples from pg_class where relname='Foo';

Remember, you will get different results between 'analyze' and 'vacuum
analyze', since later actually visit every page in the table and hence is
expected to be more accurate.

> (sure, I appreciate there are other ways of doing
> this, but I am curious about the way PG works here).

Answer is MVCC and PG's inability use index alone. This has been a FAQ for a
loong time.. Furthermore PG has custom aggregates to complicate the matter..

Most of the pg developers/users think that unqualified select count(*) is of
no use. You can search the archives for more details..

 HTH

 Shridhar

pgsql-performance by date:

Previous
From: Andrew Mayo
Date:
Subject: Performance of count(*) on large tables vs SQL Server
Next
From: Stef
Date:
Subject: Re: Performance of count(*) on large tables vs SQL Server