Re: Return pg_control from pg_backup_stop(). - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Return pg_control from pg_backup_stop().
Date
Msg-id abov3-Kre8ntiL-S@paquier.xyz
Whole thread Raw
In response to Re: Return pg_control from pg_backup_stop().  (David Steele <david@pgbackrest.org>)
Responses Re: Return pg_control from pg_backup_stop().
List pgsql-hackers
On Wed, Mar 18, 2026 at 04:05:24AM +0000, David Steele wrote:
> On 3/18/26 08:43, Michael Paquier wrote:
>> I was wondering if we should have an assertion at least to cross-check
>> that the contents we store in shared memory never go out-of-sync with
>> the on-disk contents, in the shape of a USE_ASSERT_CHECKING block that
>> calls get_controlfile() and memcmp()'s the contents between shmem and
>> the on-disk file, while the LWLock is taken.  We ship the control file
>> last on purpose, one reason being backups taken from standbys, so that
>> may be sensible to do.
>
> As far as I can see this should always be true -- I audited all the
>
> LWLockAcquire(ControlFileLock, LW_EXCLUSIVE)
>
> sections and the file is always saved once if is updated. Let me see if I
> can add this check without too much pain, e.g. an additional parameter.

This matches with my reads of the code.  The attached check, that can
be applied on top of your patches, passes under check-world.

>> Another property of the new control file flag that is implied in the
>> implementation but not documented is that we should never check for
>> backupLabelRequired when a backup_label is gone.
>
> I'm not sure what you mean here? That's exactly when we do want to check as
> below:

Sorry for the confusion, I meant that "we should never check for
backupLabelRequired when we have a backup_label".

> Using the pg_control copy from pg_backup_stop() is entirely optional and
> nothing is broken if vendors decide not to use it. This can be demonstrated
> by applying the 01 patch without 02. In that case the tests in
> 042_low_level_backup continue to run. You can also apply 01 and 02 and
> revert the test changes in 042_low_level_backup and that works, too.

FWIW, after a second look I am actually wondering if 0002 is safe at
all.  The contents of the control file are fetched after we are done
with do_pg_backup_stop(), and there could be a bunch of activity that
happens between the end of do_pg_backup_stop() and the
backup_control_file() call, where the control file could be updated
more and interfere with the recovery startup for some of its fields?
GUC parameter updates that may touch the control file are one thing
popping into mind.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Sharma
Date:
Subject: Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Next
From: Michael Paquier
Date:
Subject: Re: DOCS: typo on CLUSTER page