Re: Much Ado About COUNT(*) - Mailing list pgsql-hackers

From Manfred Koizar
Subject Re: Much Ado About COUNT(*)
Msg-id j9hlu0t1h3mnruj2uhlge0fn0qmsha1utr@4ax.com
In response to Re: Much Ado About COUNT(*)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 13 Jan 2005 00:39:56 -0500, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>A would-be deleter of a tuple would have to go and clear the "known
>good" bits on all the tuple's index entries before it could commit.
>This would bring the tuple back into the "uncertain status" condition
>where backends would have to visit the heap to find out what's up.
>Eventually the state would become certain again (either dead to
>everyone or live to everyone) and one or the other hint bit could be
>set again.

Last time we discussed this, didn't we come to the conclusion that
resetting status bits is not a good idea because of possible race
conditions?

In a previous post you wrote:
| I think we still have one free bit in index tuple headers...

AFAICS we'd need two new bits: "visible to all" and "maybe dead".

Writing the three status bits in the order "visible to all", "maybe
dead", "known dead", a normal index tuple lifecycle would be
 000 -> 100 -> 110 -> 111

In states 000 and 110 the heap tuple has to be read to determine
visibility.  In states 100 and 111 the index tuple alone is enough.

The transitions 000 -> 100 and 110 -> 111 happen as side effects of
index scans.  100 -> 110 has to be done by the deleting transaction.
This is the operation where the additional run time cost lies.
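
The lifecycle above can be sketched as a small state machine.  This is
only an illustration, not PostgreSQL code: the constant names, the
TRANSITIONS table, and the functions needs_heap_visit() and advance()
are hypothetical, and the rule that states 100 and 111 can skip the
heap is my reading of the proposal.

```python
# Illustrative sketch only; all names are hypothetical, not PostgreSQL identifiers.
# Bit order follows the text: "visible to all", "maybe dead", "known dead".
VISIBLE_TO_ALL = 0b100
MAYBE_DEAD = 0b010
KNOWN_DEAD = 0b001

# Allowed transitions of the normal lifecycle and who performs each one.
TRANSITIONS = {
    (0b000, 0b100): "index scan (side effect)",
    (0b100, 0b110): "deleting transaction",
    (0b110, 0b111): "index scan (side effect)",
}


def needs_heap_visit(state):
    """Return True in the uncertain states 000 and 110."""
    if state & KNOWN_DEAD:
        return False  # 111: dead to everyone
    if state & MAYBE_DEAD:
        return True   # 110: a deleter touched it, outcome unknown
    return not (state & VISIBLE_TO_ALL)  # 000 uncertain, 100 certain


def advance(state, new_state):
    """Move to new_state if the lifecycle allows it, else raise ValueError."""
    if (state, new_state) not in TRANSITIONS:
        raise ValueError(f"no transition {state:03b} -> {new_state:03b}")
    return new_state


if __name__ == "__main__":
    state = 0b000
    for nxt in (0b100, 0b110, 0b111):
        actor = TRANSITIONS[(state, nxt)]
        state = advance(state, nxt)
        print(f"{state:03b} via {actor}, heap visit needed: {needs_heap_visit(state)}")
```

Note that the table has no entry leading back from 110 to 100, which
is exactly the "no resetting of bits" restriction.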

One weakness of this approach is that once the index tuple state is
110 and the deleting transaction then aborts, there is no easy way to
reset the "maybe dead" bit.  So we'd have to consult the heap for
the rest of the tuple's lifetime.
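
To make the cost of that weakness concrete, here is a toy model of
repeated index scans over an entry stuck in state 110.  It is a sketch
under assumptions: scan_entry() and its tuple_dead_to_all flag are
hypothetical, and I assume 110 -> 111 is only taken once the heap
tuple is found dead to everyone.

```python
# Illustrative sketch only; scan_entry() and its parameters are hypothetical.
MAYBE_DEAD_STATE = 0b110  # "visible to all" and "maybe dead" set
KNOWN_DEAD_STATE = 0b111  # all three bits set


def scan_entry(state, tuple_dead_to_all):
    """Simulate one index scan over an entry in state 110 or 111.

    Returns (new_state, visited_heap).
    """
    if state == KNOWN_DEAD_STATE:
        return state, False  # certain: skip the heap
    # State 110 is uncertain, so the heap tuple must be read.
    if tuple_dead_to_all:
        return KNOWN_DEAD_STATE, True  # deleter committed long ago
    return MAYBE_DEAD_STATE, True      # bit is never reset


def count_heap_visits(scans, tuple_dead_to_all):
    """Run `scans` index scans starting in state 110."""
    state, visits = MAYBE_DEAD_STATE, 0
    for _ in range(scans):
        state, visited = scan_entry(state, tuple_dead_to_all)
        visits += visited
    return state, visits


if __name__ == "__main__":
    # Deleter aborted: the tuple stays live, every scan goes to the heap.
    print(count_heap_visits(5, tuple_dead_to_all=False))
    # Deleter committed and is visible to all: one visit, then none.
    print(count_heap_visits(5, tuple_dead_to_all=True))
```

In the aborted case the number of heap visits grows with the number of
scans, while in the committed case it is bounded by one.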

Servus
 Manfred


