Re: Possible corruption by CreateRestartPoint at promotion - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Possible corruption by CreateRestartPoint at promotion
Date
Msg-id 20220427032609.GA3220948@nathanxps13
Whole thread Raw
In response to Re: Possible corruption by CreateRestartPoint at promotion  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Possible corruption by CreateRestartPoint at promotion  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Wed, Apr 27, 2022 at 10:43:53AM +0900, Kyotaro Horiguchi wrote:
> At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in 
>> I suspect we'll start seeing this problem more often once end-of-recovery
>> checkpoints are removed [0].  Would you mind creating a commitfest entry
>> for this thread?  I didn't see one.
> 
> I'm not sure the patch makes any change here, because restart points
> don't run while crash recovery, since no checkpoint records seen
> during a crash recovery.  Anyway the patch doesn't apply anymore so
> rebased, but only the one for master for the lack of time for now.

Thanks for the new patch!  Yeah, it wouldn't affect crash recovery, but
IIUC Robert's patch also applies to archive recovery.

> +    /* Update pg_control */
> +    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
>      /*
>       * Remember the prior checkpoint's redo ptr for
>       * UpdateCheckPointDistanceEstimate()
>       */
>      PriorRedoPtr = ControlFile->checkPointCopy.redo;

nitpick: Why move the LWLockAcquire() all the way up here?

> +    Assert (PriorRedoPtr < RedoRecPtr);

I think this could use a short explanation.

> +     * Ensure minRecoveryPoint is past the checkpoint record while archive
> +     * recovery is still ongoing.  Normally, this will have happened already
> +     * while writing out dirty buffers, but not necessarily - e.g. because no
> +     * buffers were dirtied.  We do this because a non-exclusive base backup
> +     * uses minRecoveryPoint to determine which WAL files must be included in
> +     * the backup, and the file (or files) containing the checkpoint record
> +     * must be included, at a minimum. Note that for an ordinary restart of
> +     * recovery there's no value in having the minimum recovery point any
> +     * earlier than this anyway, because redo will begin just after the
> +     * checkpoint record.

nitpick: Since exclusive backup mode is now removed, we don't need to
specify that the base backup is non-exclusive.

> +        /*
> +         * Aarchive recovery has ended. Crash recovery ever after should
> +         * always recover to the end of WAL
> +         */
> +        ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
> +        ControlFile->minRecoveryPointTLI = 0;
> +
> +        /* also update local copy */
> +        LocalMinRecoveryPoint = InvalidXLogRecPtr;
> +        LocalMinRecoveryPointTLI = 0;

Should this be handled by the code that changes the control file state to
DB_IN_PRODUCTION instead?  It looks like this is ordinarily done in the
next checkpoint.  It's not clear to me why it is done this way.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pgsql: Add contrib/pg_walinspect.
Next
From: Bharath Rupireddy
Date:
Subject: Re: pgsql: Add contrib/pg_walinspect.