Re: count(*) slow on large tables - Mailing list pgsql-performance

From Bruce Momjian
Subject Re: count(*) slow on large tables
Date
Msg-id 200310041556.h94Fuek24423@candle.pha.pa.us
Whole thread Raw
In response to Re: count(*) slow on large tables  (Christopher Browne <cbbrowne@libertyrms.info>)
Responses Re: count(*) slow on large tables
List pgsql-performance
Christopher Browne wrote:
> jllachan@nsd.ca (Jean-Luc Lachance) writes:
> > That's one of the draw back of MVCC.
> > I once suggested that the transaction number and other house keeping
> > info be included in the index, but was told to forget it...
> > It would solve once and for all the issue of seq_scan vs index_scan.
> > It would simplify the aggregate problem.
>
> It would only simplify _one_ case, namely the case where someone cares
> about the cardinality of a relation, and it would do that at
> _considerable_ cost.
>
> A while back I outlined how this would have to be done, and for it to
> be done efficiently, it would be anything BUT simple.
>
> It would be very hairy to implement it correctly, and all this would
> cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;"

We do have a TODO item:

    * Consider using MVCC to cache count(*) queries with no WHERE clause

The idea is to cache a recent count of the table, then have
insert/delete add +/- records to the count.  A COUNT(*) would get the
main cached record plus any visible +/- records.  This would allow the
count to return the proper value depending on the visibility of the
requesting transaction, and it would require _no_ heap or index scan.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

pgsql-performance by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: reindex/vacuum locking/performance?
Next
From: Tom Lane
Date:
Subject: COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)