datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE - Mailing list pgsql-hackers

From Noah Misch
Subject datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE
Msg-id 20240620012908.92.nmisch@google.com
List pgsql-hackers
https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> Separable, nontrivial things not fixed in the attached patch stack:

> - Trouble is possible, I bet, if the system crashes between the inplace-update
>   memcpy() and XLogInsert().  See the new XXX comment below the memcpy().

That comment:

    /*----------
     * XXX A crash here can allow datfrozenxid to get ahead of relfrozenxid:
     *
     * ["D" is a VACUUM (ONLY_DATABASE_STATS)]
     * ["R" is a VACUUM tbl]
     * D: vac_update_datfrozenxid() -> systable_beginscan(pg_class)
     * D: systable_getnext() returns pg_class tuple of tbl
     * R: memcpy() into pg_class tuple of tbl
     * D: raise pg_database.datfrozenxid, XLogInsert(), finish
     * [crash]
     * [recovery restores datfrozenxid w/o relfrozenxid]
     */

>   Might solve this by inplace update setting DELAY_CHKPT, writing WAL, and
>   finally issuing memcpy() into the buffer.

That fix worked.  Along with that, I'm attaching a not-for-commit patch with a
test case and one with the fix rebased on that test case.  Apply on top of the
v2 patch stack from https://postgr.es/m/20240617235854.f8.nmisch@google.com.
This gets key testing from 027_stream_regress.pl; when I commented out some
memcpy lines of the heapam.c change, that test caught it.
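
For readers who want to see why the ordering matters, here is a toy model of
the scenario in the XXX comment.  It is a sketch, not PostgreSQL code:
ToyState, ToyWal and crash_and_recover() are hypothetical names, XIDs are
plain integers with no wraparound, WAL is assumed durable as soon as it is
written, and the DELAY_CHKPT part of the fix is not modeled at all.  "R"
performs only the first half of its inplace update, "D" runs to completion,
and then the system crashes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: one pg_class row and one pg_database row. */
typedef struct
{
    uint32_t relfrozenxid;
    uint32_t datfrozenxid;
} ToyState;

/* Toy WAL: at most one record per field, assumed durable once written. */
typedef struct
{
    bool     has_rel_record;
    uint32_t rel_value;
    bool     has_dat_record;
    uint32_t dat_value;
} ToyWal;

/*
 * Run R's first step, all of D, then crash and replay the toy WAL.
 * wal_before_memcpy selects the ordering of R's inplace update:
 *   false: memcpy into the buffer first, WAL second (old order)
 *   true:  WAL first, memcpy second (order used by the fix)
 */
ToyState
crash_and_recover(bool wal_before_memcpy)
{
    ToyState       disk = {100, 100};   /* state as of the last checkpoint */
    ToyState       buffer = disk;       /* shared-buffer copy, lost at crash */
    ToyWal         wal = {0};
    const uint32_t new_xid = 200;

    /* In this model relfrozenxid only ever advances. */
    assert(new_xid >= disk.relfrozenxid);

    /* R: first half of the inplace update. */
    if (wal_before_memcpy)
    {
        wal.has_rel_record = true;
        wal.rel_value = new_xid;
    }
    else
        buffer.relfrozenxid = new_xid;

    /* D: reads the buffer, raises datfrozenxid, writes its own WAL record. */
    buffer.datfrozenxid = buffer.relfrozenxid;
    wal.has_dat_record = true;
    wal.dat_value = buffer.datfrozenxid;

    /* Crash: R's second half never runs.  Recover from disk plus WAL. */
    ToyState recovered = disk;
    if (wal.has_rel_record)
        recovered.relfrozenxid = wal.rel_value;
    if (wal.has_dat_record)
        recovered.datfrozenxid = wal.dat_value;
    return recovered;
}
```

In this model, crash_and_recover(false) comes back with datfrozenxid ahead of
relfrozenxid, while crash_and_recover(true) does not: D cannot observe the new
relfrozenxid in the buffer until R's WAL record already exists.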

This resolves the last inplace update defect known to me.

Thanks,
nm
