Re: SSI-related code drift between index_getnext() and heap_hot_search_buffer() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: SSI-related code drift between index_getnext() and heap_hot_search_buffer()
Msg-id BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com
In response to Re: SSI-related code drift between index_getnext() and heap_hot_search_buffer()  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Fri, May 13, 2011 at 12:10 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> FWIW, so far what I know is that it will take an example something
> like the one shown here:
>
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg00325.php
>
> with the further requirements that the update in T3 must not be a
> HOT update, T1 would still need to acquire a snapshot before T2
> committed while moving its current select down past the commit of
> T3, and that select would need to be modified so that it would scan
> the visible tuple and then stop (e.g., because of a LIMIT) before
> reaching the tuple which represents the next version of the row.

I think I see another problem here.  Just before returning each tuple,
index_getnext() records in the IndexScanDesc the offset number of the
next tuple in the HOT chain, and the XMAX of the tuple being returned.
On the next call, it will go on to examine that TID, checking, among
other things, whether the XMIN of the tuple at that location matches
the previously stored XMAX.  But no buffer content lock is held
across calls.  So consider a HOT chain A -> B.  After returning A, the
IndexScanDesc will consider that we should next look at B.  Now B
rolls back, and a new transaction updates A, so we now have A -> C.
(I believe this is possible.)  When the next call to index_getnext()
occurs, it'll look at B and consider that it's reached the end of the
HOT chain - but in reality it has not, because it has never looked at
C.
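
To make the interleaving above concrete, here is a toy model of it in
Python.  It is only a sketch under stated assumptions, not PostgreSQL
code: HeapTuple, ScanState, getnext() and run_scenario() are
hypothetical names, the page is just a dict from offset number to
tuple, and visibility checks and locking are left out entirely.  The
only rule modeled is the one described above: remember the next offset
and the returned tuple's XMAX, and on the next call accept the tuple at
that offset if its XMIN matches the stored XMAX.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    """Hypothetical stand-in for a heap tuple on one page."""
    name: str
    xmin: int
    xmax: Optional[int] = None      # XID of the updater, if any
    next_off: Optional[int] = None  # offset of the next chain member


@dataclass
class ScanState:
    """Hypothetical stand-in for what the scan remembers between calls."""
    next_off: Optional[int] = None
    prev_xmax: Optional[int] = None


def getnext(page: Dict[int, HeapTuple], scan: ScanState,
            start_off: Optional[int] = None) -> Optional[str]:
    """Examine the next chain member; return its name, or None at chain end."""
    off = start_off if start_off is not None else scan.next_off
    if off is None:
        return None
    tup = page.get(off)
    # The chain is considered broken if XMIN does not match the stored XMAX.
    if tup is None or (scan.prev_xmax is not None and tup.xmin != scan.prev_xmax):
        scan.next_off = None
        return None
    # Remember where to go next; nothing protects this across calls.
    scan.next_off = tup.next_off
    scan.prev_xmax = tup.xmax
    return tup.name


def run_scenario() -> List[Optional[str]]:
    # HOT chain A -> B: transaction 100 updated A, creating B.
    page = {1: HeapTuple("A", xmin=90, xmax=100, next_off=2),
            2: HeapTuple("B", xmin=100)}
    scan = ScanState()
    examined = [getnext(page, scan, start_off=1)]  # returns A, remembers B

    # Between calls: transaction 100 aborts, transaction 101 updates A.
    page[1].xmax = 101
    page[1].next_off = 3
    page[3] = HeapTuple("C", xmin=101)

    examined.append(getnext(page, scan))  # follows the stale pointer to B
    examined.append(getnext(page, scan))  # B has no successor: chain "ends"
    return examined


if __name__ == "__main__":
    print(run_scenario())
```

In this model the stale XMAX (100) still matches B's XMIN, so the check
passes, B has no successor, and the walk stops without ever examining
C.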

Now, prior to SSI, I believe this did not matter, because the only
time we traversed the entire HOT chain rather than stopping at the
first visible tuple was when we were using a non-MVCC snapshot.
According to Heikki's submission notes for the patch I was trying to
rebase, the only time that happens is during CLUSTER, at which point
we have an AccessExclusiveLock on the table.  But SSI wants to
traverse the whole HOT chain even when using an MVCC snapshot, so now
we (maybe) have a problem.
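
The difference between the two traversal styles can be sketched as
follows.  Again this is a toy model and not PostgreSQL code: Version
and walk_chain() are hypothetical names, and visibility under the
snapshot is assumed to be precomputed as a boolean on each chain
member.

```python
from typing import List, NamedTuple


class Version(NamedTuple):
    """Hypothetical chain member; 'visible' is assumed precomputed."""
    name: str
    visible: bool


def walk_chain(chain: List[Version], whole_chain: bool) -> List[str]:
    """Return the names of the chain members that get examined."""
    examined = []
    for version in chain:
        examined.append(version.name)
        # Stop-at-first-visible walk: nothing past this point is looked at.
        if version.visible and not whole_chain:
            break
    return examined


chain = [Version("A", visible=True), Version("B", visible=False)]
stop_at_first = walk_chain(chain, whole_chain=False)  # examines only A
full_walk = walk_chain(chain, whole_chain=True)       # examines A, then B
```

Only the second style ever needs to look at what follows the visible
tuple, which is why a stale pointer to the next chain member would
matter for it and not for the first.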

I think I have an inkling of how to plug this, but first I have to go
buy groceries.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company