Re: count(*) using index scan in "query often, update rarely" environment - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: count(*) using index scan in "query often, update rarely" environment
Date
Msg-id 6EE64EF3AB31D5448D0007DD34EEB3417DD541@Herge.rcsinc.local
Whole thread Raw
In response to count(*) using index scan in "query often, update rarely" environment  ("Cestmir Hybl" <cestmirl@freeside.sk>)
List pgsql-performance
On 10/7/05, Cestmir Hybl <cestmirl@freeside.sk> wrote:
Isn't it possible (and reasonable) for these environments to keep track
of whether there is a transaction in progress with update to given table
and if not, use an index scan (count(*) where) or cached value
(count(*)) to perform this kind of query?
________________________________________

The answer to the first question is subtle.  Basically, the PostgreSQL
engine is designed for high concurrency.  We are definitely on the right
side of the cost/benefit tradeoff here.  SQL server does not have MVCC
(or at least until 2005 appears) so they are on the other side of the
tradeoff.

You can of course serialize the access yourself by materializing the
count in a small table and use triggers or cleverly designed
transactions.  This is trickier than it might look however so check the
archives for a thorough treatment of the topic.

One interesting thing is that making count(*) over large swaths of data
is frequently an indicator of a poorly normalized database.  Is it
possible to optimize the counting by laying out your data in a different
way?

Merlin



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: count(*) using index scan in "query often, update rarely" environment
Next
From: Richard Huxton
Date:
Subject: Re: count(*) using index scan in "query often, update rarely"