On Fri, Nov 12, 2021 at 5:57 PM Peter Geoghegan <pg@bowt.ie> wrote:
> You said it yourself: who knows exactly what the justification for
> RECENTLY_DEAD->DEAD was? I have to imagine it had something to do with the
> "INSERT_IN_PROGRESS becomes DEAD due to concurrent xact abort" thing,
> but that's unclear. And even if it was clear, and even if we knew that
> it was 100% safe at one point, it still wouldn't be clear that it's
> safe today, in Postgres 14.

Another relevant factor is how we deal with already-corrupt HOT chains
affected by the bug. I would be comfortable with a full "can't happen"
error in the new code path for disconnected and aborted heap-only
tuples, provided the error only gets raised when the tuple is fully
LIVE according to HTSV (and also assert that it's DEAD). Something
like my v4 plus this LIVE-should-be-DEAD defensive error seems very
likely to avoid making the corruption any worse. There is a huge
amount of redundancy in the tuple headers that we can cross check
inexpensively.
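
To make the proposed rule concrete, here is a rough, standalone sketch of how the LIVE-should-be-DEAD check might be structured. It is not taken from the v4 patch. The enum below is a local stand-in that mirrors the states of PostgreSQL's HTSV_Result, and the names SketchVerdict, classify_disconnected_heap_only and check_disconnected_heap_only are hypothetical, invented for illustration. It assumes the check is reached only for a disconnected, aborted heap-only tuple.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Local stand-in for the HTSV result states. Redefined here (with a SKETCH_
 * prefix) so the example compiles on its own; this is not the real header.
 */
typedef enum
{
    SKETCH_DEAD,
    SKETCH_LIVE,
    SKETCH_RECENTLY_DEAD,
    SKETCH_INSERT_IN_PROGRESS,
    SKETCH_DELETE_IN_PROGRESS
} SketchHtsvResult;

/* Hypothetical outcome of the cross check. */
typedef enum
{
    CHECK_OK,           /* tuple is DEAD, as expected */
    CHECK_ASSERT_ONLY,  /* unexpected, but only an assertion would fire */
    CHECK_HARD_ERROR    /* fully LIVE: raise the "can't happen" error */
} SketchVerdict;

/*
 * Classify the HTSV result for a disconnected, aborted heap-only tuple.
 * Only a fully LIVE result is treated as a hard error.
 */
static SketchVerdict
classify_disconnected_heap_only(SketchHtsvResult htsv)
{
    switch (htsv)
    {
        case SKETCH_DEAD:
            return CHECK_OK;
        case SKETCH_LIVE:
            return CHECK_HARD_ERROR;
        default:
            return CHECK_ASSERT_ONLY;
    }
}

/*
 * Returns false when the caller should raise the error. In assert-enabled
 * builds, any remaining non-DEAD state trips the assertion instead.
 */
static bool
check_disconnected_heap_only(SketchHtsvResult htsv)
{
    SketchVerdict verdict = classify_disconnected_heap_only(htsv);

    if (verdict == CHECK_HARD_ERROR)
        return false;
    assert(verdict == CHECK_OK);
    return true;
}
```

The point of splitting the classification from the check is that production builds only refuse to proceed in the one case that is clearly contradictory (LIVE), while assert-enabled builds still flag every state other than DEAD.
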
--
Peter Geoghegan