pgsql: Fix (re-)starting from a basebackup taken off a standby after a - Mailing list pgsql-committers

From: Andres Freund
Subject: pgsql: Fix (re-)starting from a basebackup taken off a standby after a
Msg-id: E1Y1WE4-0008Hu-BR@gemulon.postgresql.org
List: pgsql-committers

Fix (re-)starting from a basebackup taken off a standby after a failure.

When starting up from a basebackup taken off a standby, extra logic has
to be applied to compute the point where the data directory is
consistent. Normal base backups use a WAL record for that purpose, but
that isn't possible on a standby.

That logic had an error check ensuring that the cluster's control file
indicates being in recovery. Unfortunately that check was too strict,
disregarding the fact that the control file could also indicate that
the cluster was shut down while in recovery.

That's possible when a cluster starting from a basebackup is shut
down before the backup label has been removed. When everything goes
well that's a short window, but when either restore_command or
primary_conninfo isn't configured correctly the window can get much
wider. That's because, in between reading and unlinking the label, we
restore the last checkpoint from WAL, which can need additional WAL.

To fix, simply also allow starting when the control file indicates
"shutdown in recovery". There are nicer fixes imaginable, but they'd be
more invasive.
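
As a rough illustration of the relaxed check described above, here is a
minimal, self-contained sketch. It is not the code from xlog.c: the
enum and both function names are hypothetical stand-ins, only loosely
modeled on the cluster states kept in the control file, and the
"strict" variant is an assumption about what the old check amounted to.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the cluster state stored in the control
 * file. The names are illustrative and not taken from the commit.
 */
typedef enum SketchDBState
{
    SKETCH_DB_SHUTDOWNED,
    SKETCH_DB_SHUTDOWNED_IN_RECOVERY,
    SKETCH_DB_IN_CRASH_RECOVERY,
    SKETCH_DB_IN_ARCHIVE_RECOVERY,
    SKETCH_DB_IN_PRODUCTION
} SketchDBState;

/*
 * Assumed shape of the old, too-strict check: a backup taken off a
 * standby is only accepted if the control file says the cluster is
 * currently in recovery.
 */
bool
standby_backup_state_ok_strict(SketchDBState state)
{
    return state == SKETCH_DB_IN_ARCHIVE_RECOVERY;
}

/*
 * Relaxed check: additionally accept a cluster that was shut down
 * while in recovery, e.g. because it was stopped before the backup
 * label had been removed.
 */
bool
standby_backup_state_ok_relaxed(SketchDBState state)
{
    return state == SKETCH_DB_IN_ARCHIVE_RECOVERY ||
           state == SKETCH_DB_SHUTDOWNED_IN_RECOVERY;
}
```

With the strict variant, a restart after an interrupted first startup
is rejected; with the relaxed variant it is allowed, while states such
as "in production" are still refused.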

Backpatch to 9.2 where support for taking basebackups from standbys
was added.

Branch
------
REL9_4_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/ce083254919d08043fc5c8b060f322f6bc84d1c0

Modified Files
--------------
src/backend/access/transam/xlog.c |   17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)

