pgsql: Fix control file update done in restartpoints still running afte - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Fix control file update done in restartpoints still running afte
Date
Msg-id E1nqQTB-000v4B-Dm@gemulon.postgresql.org
List pgsql-committers
Fix control file update done in restartpoints still running after promotion

If a cluster is promoted (that is, the control file shows a state other
than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still
running, this function could skip the update of "checkPoint" and
"checkPointCopy" in the control file but still recycle and/or remove
past WAL segments, on the assumption that the to-be-updated LSN values
could be used as reference points for the cleanup.  A follow-up restart
attempting crash recovery would then fail with a PANIC on a missing
checkpoint record if the cluster abruptly stopped or crashed before the
end-of-recovery checkpoint triggered by the promotion had completed.
The PANIC is caused by the redo LSN referred to by the control file
being located in a segment that is already gone, recycled by the
previous restartpoint while "checkPoint" was out of sync in the control
file.

This commit fixes the update of the control file during restartpoints so
that "checkPoint" and "checkPointCopy" are updated even if the cluster
has been promoted while a restartpoint is running, keeping them
consistent with the set of WAL segments actually recycled at the end of
CreateRestartPoint().
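
The failure mode and the fix can be illustrated with a small toy model.
This is only a sketch and not the code in xlog.c: every type and
function name below (ToyControlFile, toy_restartpoint,
toy_survives_crash, and so on) is hypothetical, LSNs are plain integers,
and WAL removal is reduced to a single cutoff value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these names are illustrative, not PostgreSQL's definitions. */
typedef uint64_t ToyLsn;

typedef enum { TOY_IN_ARCHIVE_RECOVERY, TOY_IN_PRODUCTION } ToyState;

typedef struct {
    ToyState state;
    ToyLsn   checkpoint;       /* models "checkPoint" */
    ToyLsn   checkpoint_redo;  /* models the redo LSN kept in "checkPointCopy" */
} ToyControlFile;

/*
 * Model the end of a restartpoint.  Returns the LSN below which WAL is
 * treated as removed.  With update_after_promotion == false the control
 * file is left alone once the cluster is no longer in archive recovery,
 * which is the out-of-sync case described in the commit message.
 */
static ToyLsn
toy_restartpoint(ToyControlFile *ctl, ToyLsn new_checkpoint,
                 ToyLsn new_redo, bool update_after_promotion)
{
    assert(ctl != NULL);

    bool in_recovery = (ctl->state == TOY_IN_ARCHIVE_RECOVERY);

    if ((in_recovery || update_after_promotion) &&
        ctl->checkpoint_redo < new_redo)
    {
        ctl->checkpoint = new_checkpoint;
        ctl->checkpoint_redo = new_redo;
    }

    /* WAL cleanup uses the new redo LSN in both variants. */
    return new_redo;
}

/* Crash recovery can start only if the WAL at the redo LSN still exists. */
static bool
toy_can_start_recovery(const ToyControlFile *ctl, ToyLsn removed_before)
{
    return ctl->checkpoint_redo >= removed_before;
}

/*
 * Scenario: the cluster was promoted while a restartpoint was running,
 * then crashed before the end-of-recovery checkpoint completed.
 */
static bool
toy_survives_crash(bool update_after_promotion)
{
    ToyControlFile ctl = { TOY_IN_PRODUCTION, 100, 90 };
    ToyLsn removed_before =
        toy_restartpoint(&ctl, 500, 480, update_after_promotion);

    return toy_can_start_recovery(&ctl, removed_before);
}
```

In this model, toy_survives_crash(false) reports that recovery cannot
start because the recorded redo LSN points into removed WAL, while
toy_survives_crash(true) keeps the control file in step with the
cleanup.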

7863ee4 has already fixed this problem on master, but the release timing
of the latest point versions did not leave me enough time to study and
fix it on all the stable branches.

Reported-by: Fujii Masao, Rui Zhao
Author: Kyotaro Horiguchi
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com
Backpatch-through: 10

Branch
------
REL_12_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/7e59b1219109be1b42d1f3937d09637327a56f5b

Modified Files
--------------
src/backend/access/transam/xlog.c | 44 ++++++++++++++++++++++++++-------------
1 file changed, 29 insertions(+), 15 deletions(-)

