Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id CAH2-WznkPUmQEd-7Li35Lbram4agSZsEg_SB92j5xKvw-2QB4Q@mail.gmail.com
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
List pgsql-bugs
On Thu, Nov 11, 2021 at 4:58 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > What prevents the scenario that some other backend e.g. has a snapshot with
> > xmin=xmax=RECENTLY_DEAD-row? If the RECENTLY_DEAD row has an xid that is later
> > than the DEAD row, this afaict would make it perfectly legal to prune the DEAD
> > row, but *not* the RECENTLY_DEAD one.
>
> I'll need to think about this very carefully. I didn't think it was
> worth blocking v3 on, though naturally it's a big concern.

If we're to traverse HOT chains right to the end in
heap_prune_chain(), reading even LIVE tuples (per the approach
proposed in my bugfix patch), we probably need to be more careful
about concurrently aborted xacts -- relying on the usual
!HeapTupleHeaderIsHotUpdated(htup) test doesn't seem safe.

Imagine if we land on a concurrently-aborted DEAD tuple at the end of
a physical HOT chain -- this might not be caught before we test the
previous tuple in the chain using HeapTupleHeaderIsHotUpdated(htup) --
the abort might happen just as we land on the final/aborted tuple. We
certainly shouldn't conclude that the whole HOT chain is now DEAD,
just because that one tuple is dead.

I don't think that can happen on HEAD, because we just give up as
soon as we see anything that isn't either DEAD or RECENTLY_DEAD.
But it might be possible with the patch.
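
To make the concern concrete, here is a toy model in C. It is not
PostgreSQL code: ToyStatus and both toy_* functions are hypothetical
names invented for this sketch, and they only loosely mirror the
LIVE / RECENTLY_DEAD / DEAD outcomes discussed above. The first
function stops at the first chain member that is neither DEAD nor
RECENTLY_DEAD, as described for HEAD. The second is a deliberately
naive walk to the end of the chain (an assumption for illustration,
not the proposed patch) that lets any DEAD member extend the prunable
prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Toy per-tuple status; hypothetical, not PostgreSQL's HTSV_Result. */
typedef enum { TOY_LIVE, TOY_RECENTLY_DEAD, TOY_DEAD } ToyStatus;

/*
 * HEAD-like walk: give up at the first member that is neither DEAD nor
 * RECENTLY_DEAD. Returns how many leading chain members could be pruned,
 * i.e. everything up to and including the last DEAD member reached.
 */
size_t toy_prunable_prefix_head(const ToyStatus *chain, size_t n)
{
    size_t latest_dead = 0;

    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (chain[i] == TOY_DEAD)
            latest_dead = i + 1;
        else if (chain[i] != TOY_RECENTLY_DEAD)
            break;              /* LIVE (or anything else): stop here */
    }
    return latest_dead;
}

/*
 * Naive full walk: keep reading past LIVE members and let any DEAD
 * member, including an aborted one at the very end, move the prunable
 * prefix forward. This is the hazard, not a recommended approach.
 */
size_t toy_prunable_prefix_naive_full_walk(const ToyStatus *chain, size_t n)
{
    size_t latest_dead = 0;

    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (chain[i] == TOY_DEAD)
            latest_dead = i + 1;
    }
    return latest_dead;
}
```

For a two-member chain { TOY_LIVE, TOY_DEAD }, where the last member
stands in for a tuple whose inserting xact aborted, the HEAD-like walk
returns 0 while the naive walk returns 2. In this toy model, that means
the naive walk would count the LIVE member as prunable, which is the
wrong conclusion described above.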

-- 
Peter Geoghegan


