Re: Add checkpoint and redo LSN to LogCheckpointEnd log message - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Add checkpoint and redo LSN to LogCheckpointEnd log message |
Date | |
Msg-id | 20220207.120258.310426179780547983.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Add checkpoint and redo LSN to LogCheckpointEnd log message (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Add checkpoint and redo LSN to LogCheckpointEnd log message
|
List | pgsql-hackers |
At Mon, 07 Feb 2022 10:16:34 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 4 Feb 2022 10:59:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > On 2022/02/03 15:50, Kyotaro Horiguchi wrote: > > > By the way, restart point should start only while recoverying, and at > > > the timeof the start both checkpoint.redo and checkpoint LSN are > > > already past. We shouldn't update minRecovery point after promotion, > > > but is there any reason for not updating the checkPoint and > > > checkPointCopy? If we update them after promotion, the > > > which-LSN-to-show problem would be gone. > > > > I tried to find the reason by reading the past discussion, but have > > not found that yet. > > > > If we update checkpoint and REDO LSN at pg_control in that case, we > > also need to update min recovery point at pg_control? Otherwise the > > min recovery point at pg_control still indicates the old LSN that > > previous restart point set. > > I had an assuption that the reason I think it shouldn't update > minRecoveryPoint is that it has been or is going to be reset to > invalid LSN by promotion and the checkpoint should refrain from > touching it. Hmm.. It doesn't seem to be the case. If a server crashes just after promotion and before requesting post-promtion checkpoint, minRecoveryPoint stays at a valid LSN. (Promoted at 0/7000028) Database cluster state: in production Latest checkpoint location: 0/6000060 Latest checkpoint's REDO location: 0/6000028 Latest checkpoint's REDO WAL file: 000000010000000000000006 Minimum recovery ending location: 0/7000090 Min recovery ending loc's timeline: 2 minRecoveryPoint/TLI are ignored in any case where a server in in-production state is started. In other words, the values are useless. There's no clear or written reason for unrecording the last ongoing restartpoint after promotion. Before fast-promotion was introduced, we shouldn't get there after end-of-recovery checkpoint (but somehow reached sometimes?) but it is quite normal nowadays. Or to the contrary, we're expecting it to happen and it is regarded as a normal checkponit. So we should do there nowadays are as the follows. - If any later checkpoint/restartpoint has been established, just skip remaining task then return false. (!chkpt_was_latest) (I'm not sure this can happen, though.) - we update control file only when archive recovery is still ongoing. - Otherwise reset minRecoveryPoint then continue. Do you have any thoughts or opinions? regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 958220c495..ab8a4d9a1b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9658,6 +9658,7 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + bool chkpt_was_latest = false; /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); @@ -9752,44 +9753,69 @@ CreateRestartPoint(int flags) PriorRedoPtr = ControlFile->checkPointCopy.redo; /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Update pg_control, using current time if no later checkpoints have been + * performed. */ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->checkPointCopy.redo < lastCheckPoint.redo) { + chkpt_was_latest = true; ControlFile->checkPoint = lastCheckPointRecPtr; ControlFile->checkPointCopy = lastCheckPoint; /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened + * already while writing out dirty buffers, but not necessarily - + * e.g. because no buffers were dirtied. We do this because a + * non-exclusive base backup uses minRecoveryPoint to determine which + * WAL files must be included in the backup, and the file (or files) + * containing the checkpoint record must be included, at a + * minimum. Note that for an ordinary restart of recovery there's no + * value in having the minimum recovery point any earlier than this * anyway, because redo will begin just after the checkpoint record. */ - if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->minRecoveryPoint = lastCheckPointEndPtr; - ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; + if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + { + ControlFile->minRecoveryPoint = lastCheckPointEndPtr; + ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; - /* update local copy */ - minRecoveryPoint = ControlFile->minRecoveryPoint; - minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + /* update local copy */ + minRecoveryPoint = ControlFile->minRecoveryPoint; + minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + } + if (flags & CHECKPOINT_IS_SHUTDOWN) + ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; + } + else + { + /* + * We have exited from archive-recovery mode after starting. Crash + * recovery ever after should always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; } - if (flags & CHECKPOINT_IS_SHUTDOWN) - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; UpdateControlFile(); } LWLockRelease(ControlFileLock); + /* + * Skip all post-checkpoint work if others have done that with later + * checkpoints. + */ + if (!chkpt_was_latest) + { + ereport((log_checkpoints ? LOG : DEBUG2), + (errmsg("post-restartpoint cleanup is skpped at %X/%X, because later restartpoints have been already performed", + LSN_FORMAT_ARGS(lastCheckPoint.redo)))); + + /* this checkpoint has not bee established */ + return false; + } + /* * Update the average distance between checkpoints/restartpoints if the * prior checkpoint exists.
pgsql-hackers by date: