Re: Much Ado About COUNT(*) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Much Ado About COUNT(*)
Date
Msg-id 5528.1105594796@sss.pgh.pa.us
In response to Re: Much Ado About COUNT(*)  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Much Ado About COUNT(*)  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Much Ado About COUNT(*)  (Manfred Koizar <mkoi-pg@aon.at>)
Re: Much Ado About COUNT(*)  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Ah, right, I missed the connection.  Hmm ... that's sort of the inverse
>> of the "killed tuple" optimization we put in a release or two back,
>> where an index tuple is marked as definitely dead once it's committed
>> dead and the deletion is older than all active transactions.

> Yes, it is sort of the reverse, but how do you get around the delete
> case?

A would-be deleter of a tuple would have to go and clear the "known
good" bits on all the tuple's index entries before it could commit.
This would bring the tuple back into the "uncertain status" condition
where backends would have to visit the heap to find out what's up.
Eventually the state would become certain again (either dead to
everyone or live to everyone) and one or the other hint bit could be
set again.
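
The life cycle described above can be sketched as a toy model. This is only an illustration under loud assumptions: the names (`Table`, `IndexEntry`, `known_good`, `mark_known_good`) are hypothetical and are not PostgreSQL code, and visibility is reduced to a single `live` flag with no MVCC snapshots.

```python
from dataclasses import dataclass, field


@dataclass
class HeapTuple:
    live: bool = True  # toy stand-in for "visible to every transaction"


@dataclass
class IndexEntry:
    tid: int                  # identifies the heap tuple this entry points at
    known_good: bool = False  # hypothetical "known good" bit on the index entry


@dataclass
class Table:
    heap: dict = field(default_factory=dict)     # tid -> HeapTuple
    indexes: list = field(default_factory=list)  # each index is a list of IndexEntry
    heap_visits: int = 0                         # how often a scan had to check the heap

    def insert(self, tid):
        self.heap[tid] = HeapTuple()
        for index in self.indexes:
            index.append(IndexEntry(tid))  # status starts out uncertain

    def mark_known_good(self, tid):
        # Only legal once the tuple's status is certain (live to everyone).
        if not self.heap[tid].live:
            return
        for index in self.indexes:
            for entry in index:
                if entry.tid == tid:
                    entry.known_good = True

    def delete(self, tid):
        # The would-be deleter clears the bit on ALL index entries first ...
        for index in self.indexes:
            for entry in index:
                if entry.tid == tid:
                    entry.known_good = False
        # ... and only then "commits" the deletion.
        self.heap[tid].live = False

    def count(self, index_no=0):
        total = 0
        for entry in self.indexes[index_no]:
            if entry.known_good:
                total += 1             # answered from the index alone
            else:
                self.heap_visits += 1  # uncertain: go look at the heap
                if self.heap[entry.tid].live:
                    total += 1
        return total
```

With two indexes and three known-good rows, `count()` needs no heap visits; after `delete(2)` the entries for that row fall back to "uncertain" in both indexes, so a later count must visit the heap for that one entry.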

The ugly part of this is that clearing the bit is not like setting a
hint bit, i.e., it's not okay if we lose that change.  Therefore, each
bit-clearing would have to be WAL-logged.  This is a big part of my
concern about the cost.
        regards, tom lane
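
The asymmetry between setting and clearing the bit can be illustrated with a small crash simulation. This is a sketch under stated assumptions, not a description of PostgreSQL's WAL: `recover` and `scan_reports_live` are hypothetical helpers, the "WAL" is just a list of logged bit values, and a crash is modeled as falling back to the stale on-disk bit plus whatever was logged.

```python
def recover(on_disk_bit, wal):
    """Return the bit after a crash: the stale on-disk value, with any
    logged changes replayed in order (hypothetical, simplified replay)."""
    bit = on_disk_bit
    for logged_value in wal:
        bit = logged_value
    return bit


def scan_reports_live(known_good, heap_live):
    """A scan trusts a set bit without checking; otherwise the heap decides."""
    return True if known_good else heap_live


def after_crash(on_disk_bit, wal, heap_live):
    """What a scan reports for one index entry once recovery has run."""
    return scan_reports_live(recover(on_disk_bit, wal), heap_live)


# Lost SET (unlogged): the tuple is live and the bit reverts to "uncertain".
# The scan pays for a heap visit but still gets the right answer.
lost_set = after_crash(on_disk_bit=False, wal=[], heap_live=True)

# Lost CLEAR (unlogged): the tuple was deleted, but the bit on disk still
# says "known good". The scan trusts it and reports a dead tuple as live.
lost_clear = after_crash(on_disk_bit=True, wal=[], heap_live=False)

# Logged CLEAR: replay restores the cleared bit, so the heap is consulted.
logged_clear = after_crash(on_disk_bit=True, wal=[False], heap_live=False)
```

In this model only the unlogged clear produces a wrong answer, which is why the clear, unlike an ordinary hint-bit set, cannot be allowed to get lost.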

