Thread: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From

Noah Misch

Date:

20 June 2024, 01:29:08

https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> Separable, nontrivial things not fixed in the attached patch stack:

> - Trouble is possible, I bet, if the system crashes between the inplace-update
>   memcpy() and XLogInsert().  See the new XXX comment below the memcpy().

That comment:

    /*----------
     * XXX A crash here can allow datfrozenxid() to get ahead of relfrozenxid:
     *
     * ["D" is a VACUUM (ONLY_DATABASE_STATS)]
     * ["R" is a VACUUM tbl]
     * D: vac_update_datfrozenid() -> systable_beginscan(pg_class)
     * D: systable_getnext() returns pg_class tuple of tbl
     * R: memcpy() into pg_class tuple of tbl
     * D: raise pg_database.datfrozenxid, XLogInsert(), finish
     * [crash]
     * [recovery restores datfrozenxid w/o relfrozenxid]
     */

>   Might solve this by inplace update setting DELAY_CHKPT, writing WAL, and
>   finally issuing memcpy() into the buffer.

That fix worked.  Along with that, I'm attaching a not-for-commit patch with a
test case and one with the fix rebased on that test case.  Apply on top of the
v2 patch stack from https://postgr.es/m/20240617235854.f8.nmisch@google.com.
This gets key testing from 027_stream_regress.pl; when I commented out some
memcpy lines of the heapam.c change, that test caught it.

This resolves the last inplace update defect known to me.

Thanks,
nm

Attachment

Re: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From

"Andrey M. Borodin"

Date:

20 June 2024, 07:17:44

> On 20 Jun 2024, at 06:29, Noah Misch <noah@leadboat.com> wrote:
>
> This resolves the last inplace update defect known to me.

That’s a huge amount of work, thank you!

Do I get it right, that inplace updates are catalog-specific and some other OOM corruptions [0] and Standby corruptions
[1]are not related to this fix. Bot cases we observed on regular tables. 
Or that might be effect of vacuum deepening corruption after observing wrong datfrozenxid?

Best regards, Andrey Borodin.

[0]
https://www.postgresql.org/message-id/flat/67EADE8F-AEA6-4B73-8E38-A69E5D48BAFE%40yandex-team.ru#1266dd8b898ba02686c2911e0a50ab47
[1] https://www.postgresql.org/message-id/flat/CAFj8pRBEFMxxFSCVOSi-4n0jHzSaxh6Ze_cZid5eG%3Dtsnn49-A%40mail.gmail.com

Re: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From

Noah Misch

Date:

20 June 2024, 15:08:43

On Thu, Jun 20, 2024 at 12:17:44PM +0500, Andrey M. Borodin wrote:
> On 20 Jun 2024, at 06:29, Noah Misch <noah@leadboat.com> wrote:
> > This resolves the last inplace update defect known to me.
> 
> That’s a huge amount of work, thank you!
> 
> Do I get it right, that inplace updates are catalog-specific and some other OOM corruptions [0] and Standby
corruptions[1] are not related to this fix. Bot cases we observed on regular tables.

In core code, inplace updates are specific to pg_class and pg_database.
Adding PGXN modules, only the citus extension uses them on some other table.
[0] definitely looks unrelated.

> Or that might be effect of vacuum deepening corruption after observing wrong datfrozenxid?

Wrong datfrozenxid can cause premature clog truncation, which can cause "could
not access status of transaction".  While $SUBJECT could cause that, I think
it would happen on both primary and standby.  [1] seems to be about a standby
lacking clog present on the primary, which is unrelated.

> [0]
https://www.postgresql.org/message-id/flat/67EADE8F-AEA6-4B73-8E38-A69E5D48BAFE%40yandex-team.ru#1266dd8b898ba02686c2911e0a50ab47
> [1]
https://www.postgresql.org/message-id/flat/CAFj8pRBEFMxxFSCVOSi-4n0jHzSaxh6Ze_cZid5eG%3Dtsnn49-A%40mail.gmail.com

Re: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From

Noah Misch

Date:

06 April, 21:00:54

On Wed, Jun 19, 2024 at 06:29:08PM -0700, Noah Misch wrote:
> https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> > Separable, nontrivial things not fixed in the attached patch stack:
> 
> > - Trouble is possible, I bet, if the system crashes between the inplace-update
> >   memcpy() and XLogInsert().  See the new XXX comment below the memcpy().
> 
> That comment:
> 
>     /*----------
>      * XXX A crash here can allow datfrozenxid() to get ahead of relfrozenxid:
>      *
>      * ["D" is a VACUUM (ONLY_DATABASE_STATS)]
>      * ["R" is a VACUUM tbl]
>      * D: vac_update_datfrozenid() -> systable_beginscan(pg_class)
>      * D: systable_getnext() returns pg_class tuple of tbl
>      * R: memcpy() into pg_class tuple of tbl
>      * D: raise pg_database.datfrozenxid, XLogInsert(), finish
>      * [crash]
>      * [recovery restores datfrozenxid w/o relfrozenxid]
>      */
> 
> >   Might solve this by inplace update setting DELAY_CHKPT, writing WAL, and
> >   finally issuing memcpy() into the buffer.
> 
> That fix worked.

I pushed that as commit 8e7e672 (2024-10-25).  I now think DELAY_CHKPT_START
is superfluous here, per this proc.h comment:

 * (In the
 * extremely common case where the data being modified is in shared buffers
 * and we acquire an exclusive content lock on the relevant buffers before
 * writing WAL, this mechanism is not needed, because phase 2 will block
 * until we release the content lock and then flush the modified data to
 * disk.)

heap_inplace_update_and_unlock() meets those conditions.  Its closest
precedent is XLogSaveBufferForHint(), which does need DELAY_CHKPT_START due to
having only BUFFER_LOCK_SHARED.  heap_inplace_update_and_unlock() has
BUFFER_LOCK_EXCLUSIVE, so it doesn't need DELAY_CHKPT_START.

I'm considering two options:

1. Stop using DELAY_CHKPT_START in heap_inplace_update_and_unlock().

2. Add a comment.  BUFFER_LOCK_EXCLUSIVE entitles
   heap_inplace_update_and_unlock() to omit DELAY_CHKPT_START.  Instead, the
   function elects to keep itself aligned with XLogSaveBufferForHint(), to
   make analysis simpler.

I'm leaning toward (2), since inplace update isn't frequent enough that it
needs to pursue small optimizations.  Would anyone strongly prefer (1)?

Re: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From

Noah Misch

Date:

13 April, 23:37:58

On Sun, Apr 06, 2025 at 11:00:54AM -0700, Noah Misch wrote:
> I pushed that as commit 8e7e672 (2024-10-25).  I now think DELAY_CHKPT_START
> is superfluous here, per this proc.h comment:
> 
>  * (In the
>  * extremely common case where the data being modified is in shared buffers
>  * and we acquire an exclusive content lock on the relevant buffers before
>  * writing WAL, this mechanism is not needed, because phase 2 will block
>  * until we release the content lock and then flush the modified data to
>  * disk.)
> 
> heap_inplace_update_and_unlock() meets those conditions.  Its closest
> precedent is XLogSaveBufferForHint(), which does need DELAY_CHKPT_START due to
> having only BUFFER_LOCK_SHARED.

True so far, but ...

> heap_inplace_update_and_unlock() has
> BUFFER_LOCK_EXCLUSIVE, so it doesn't need DELAY_CHKPT_START.

... not so, because we've not yet done MarkBufferDirty().  transam/README
cites SyncOneBuffer(), which says:

     * Check whether buffer needs writing.
     *
     * We can make this check without taking the buffer content lock so long
     * as we mark pages dirty in access methods *before* logging changes with
     * XLogInsert(): if someone marks the buffer dirty just after our check we
     * don't worry because our checkpoint.redo points before log record for
     * upcoming changes and so we are not required to write such dirty buffer.

The attached patch updates the aforementioned proc.h comment and the
heap_inplace_update_and_unlock() comment that my last message proposed.

Attachment

inplace280-comment-delay-v1.patch