Re: heap_hot_search_buffer refactoring - Mailing list pgsql-hackers

From Robert Haas
Subject Re: heap_hot_search_buffer refactoring
Date
Msg-id BANLkTi=7dmfg049PHayKiMqJdaynKj3xDw@mail.gmail.com
In response to Re: heap_hot_search_buffer refactoring  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Jun 19, 2011 at 2:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jun 19, 2011 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Yikes.  I think you are right.  It's kind of scary that the regression
>>> tests passed with that mistake.
>>
>> Can we add a test that exposes that mistake?
>
> Not sure.  We'd have to figure out how to reliably tickle it.

*thinks a bit*

When using an MVCC snapshot, we always have first_call = true, so the
effect of this mistake was just to disable the opportunistic killing
of dead tuples, which doesn't affect correctness.

When using a non-MVCC snapshot, we call heap_hot_search_buffer()
repeatedly until it returns false.  As long as it returns true, it
does not matter how all_dead is set, because index_getnext() will
return the tuple without examining all_dead.  So only the final call
matters.  If the final call happens to also be the first call, then
all_dead might end up being false when it really ought to be true, but
that will once again just miss killing a dead tuple.  If the final
call isn't the first call, then we've got a problem, because now
all_dead will be true when it really ought to be false, and we'll nuke
an index tuple that we shouldn't nuke.
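
To make the two cases above concrete, here is a toy model in C.  It is
only a sketch: ToyTuple, toy_hot_search() and toy_final_all_dead() are
made-up names, not the real heap_hot_search_buffer() or index_getnext()
code, and it assumes the mistake amounts to initializing all_dead to
the opposite of first_call, which is what the symptoms described above
suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one member of a HOT chain. */
typedef struct ToyTuple
{
    bool visible;       /* tuple satisfies the snapshot */
    bool surely_dead;   /* tuple is dead to every transaction */
} ToyTuple;

/*
 * Toy stand-in for heap_hot_search_buffer() (not the real signature).
 * Walks the chain from *pos and returns true at the first visible tuple.
 * buggy_init selects the assumed mistake: all_dead starts out as
 * !first_call instead of first_call.
 */
bool
toy_hot_search(const ToyTuple *chain, size_t len, size_t *pos,
               bool first_call, bool *all_dead, bool buggy_init)
{
    assert(chain != NULL && pos != NULL && all_dead != NULL);

    /* Correct rule: a non-first call means an earlier call found a live tuple. */
    *all_dead = buggy_init ? !first_call : first_call;

    while (*pos < len)
    {
        const ToyTuple *tup = &chain[(*pos)++];

        if (tup->visible)
        {
            *all_dead = false;
            return true;
        }
        if (!tup->surely_dead)
            *all_dead = false;
    }
    return false;
}

/*
 * Toy stand-in for the non-MVCC loop: call repeatedly until the search
 * returns false, then report all_dead from that final call, which is the
 * only value that decides whether the index tuple gets killed.
 */
bool
toy_final_all_dead(const ToyTuple *chain, size_t len, bool buggy_init)
{
    size_t pos = 0;
    bool   first_call = true;
    bool   all_dead = false;

    while (toy_hot_search(chain, len, &pos, first_call, &all_dead, buggy_init))
        first_call = false;

    return all_dead;
}
```

In this model, a chain holding one live tuple followed by one surely
dead tuple ends with all_dead = false under the correct rule but
all_dead = true under the assumed mistake, which is the "nuke an index
tuple that we shouldn't nuke" case.  A chain holding only a surely dead
tuple goes the other way: the final call is also the first call, and
the mistake just misses a kill.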

But if this is happening in the context of CLUSTER, then there might
still be no user-visible failure, because we're going to rebuild the
indexes anyway.  There could be a problem if CLUSTER aborts part-way
through.

A system catalog might get scanned with SnapshotNow, but to exercise
the bug you'd need to HOT update a system catalog and then have the
updating transaction commit between the time it sees the first row and
the time it sees the second one.

So I don't quite see how to construct a test case, ATM.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

