Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id CAH2-Wz=2wAftxnZdUjKPpnjyXESqjq90-=DOjmDZg_2HiiT4NQ@mail.gmail.com
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
On Fri, Nov 12, 2021 at 5:57 PM Peter Geoghegan <pg@bowt.ie> wrote:
> You said it yourself: who knows exactly what the justification for
> RECENTLY_DEAD->DEAD was? I have to imagine it had something to do with the
> "INSERT_IN_PROGRESS becomes DEAD due to concurrent xact abort" thing,
> but that's unclear. And even if it was clear, and even if we knew that
> it was 100% safe at one point, it still wouldn't be clear that it's
> safe today, in Postgres 14.

Another relevant factor is how we deal with already-corrupt HOT chains
affected by the bug. I would be comfortable with a full "can't happen"
error in the new code path for disconnected and aborted heap-only
tuples, provided the error only gets raised when the tuple is fully
LIVE according to HTSV (and provided we also assert that it's DEAD).
Something like my v4 plus this LIVE-should-be-DEAD defensive error
seems very likely to avoid making the corruption any worse. There is
a huge amount of redundancy in the tuple headers that we can
cross-check inexpensively.
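
To make the shape of that check concrete, here is a rough standalone
sketch. It is not the v4 patch and not server code. The HTSV_Result
names are redeclared locally so that it compiles on its own, and
CheckOutcome and check_disconnected_heap_only() are hypothetical names
made up for illustration. In the server, the LIVE case would presumably
be an elog(ERROR) and the not-DEAD case an Assert(), which is a no-op
in non-assert builds.

```c
#include <stdbool.h>

/*
 * Local copy of the HeapTupleSatisfiesVacuum() result names, redeclared
 * here only so that this sketch is self-contained.
 */
typedef enum
{
    HEAPTUPLE_DEAD,
    HEAPTUPLE_LIVE,
    HEAPTUPLE_RECENTLY_DEAD,
    HEAPTUPLE_INSERT_IN_PROGRESS,
    HEAPTUPLE_DELETE_IN_PROGRESS
} HTSV_Result;

/* Hypothetical outcome codes, standing in for elog()/Assert(). */
typedef enum
{
    CHECK_OK_TO_REMOVE,     /* tuple is DEAD, as expected */
    CHECK_RAISE_ERROR,      /* "can't happen": tuple is fully LIVE */
    CHECK_ASSERT_FAILED     /* neither LIVE nor DEAD: assertion case */
} CheckOutcome;

/*
 * Hypothetical check for a disconnected, aborted heap-only tuple.
 *
 * Only a fully LIVE tuple produces a hard error. Any other non-DEAD
 * state is treated as an assertion failure, which a production build
 * would not report.
 */
CheckOutcome
check_disconnected_heap_only(HTSV_Result htsv)
{
    if (htsv == HEAPTUPLE_LIVE)
        return CHECK_RAISE_ERROR;
    if (htsv != HEAPTUPLE_DEAD)
        return CHECK_ASSERT_FAILED;
    return CHECK_OK_TO_REMOVE;
}
```

The hard error is kept as narrow as possible here: it fires only for
the one state that must never be pruned, so it should not turn an
already-corrupt but harmless chain into a new failure.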

-- 
Peter Geoghegan


