[COMMITTERS] pgsql: Prevent possibility of panics during shutdown checkpoint. - Mailing list pgsql-committers

From Andres Freund
Subject [COMMITTERS] pgsql: Prevent possibility of panics during shutdown checkpoint.
Date
Msg-id E1dI4BO-0004tT-BD@gemulon.postgresql.org
List pgsql-committers
Prevent possibility of panics during shutdown checkpoint.

When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and
throws a PANIC if so.  At that point, only walsenders are still
active, so one might think this could not happen, but walsenders can
also generate WAL, for instance in BASE_BACKUP and logical decoding
related commands (e.g. via hint bits).  So they can trigger this panic
if such a command is run while the shutdown checkpoint is being
written.
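
The race described above can be illustrated with a toy model.  This is
a minimal sketch in Python, not PostgreSQL code: ToyWal,
shutdown_checkpoint and ShutdownPanic are hypothetical names, and the
check is simplified to "did the WAL insert position move while the
checkpoint was being written".

```python
class ShutdownPanic(Exception):
    """Stands in for a PANIC in this toy model (hypothetical name)."""


class ToyWal:
    """A WAL reduced to a single, ever-growing insert position."""

    def __init__(self):
        self.insert_pos = 0

    def append(self, nbytes):
        self.insert_pos += nbytes
        return self.insert_pos


def shutdown_checkpoint(wal, concurrent_activity=None):
    """Write a pretend shutdown checkpoint and verify that no WAL was
    generated while it was in progress."""
    start_pos = wal.insert_pos           # position when the checkpoint starts
    if concurrent_activity is not None:
        concurrent_activity(wal)         # e.g. a walsender setting hint bits
    if wal.insert_pos != start_pos:      # somebody wrote WAL in the meantime
        raise ShutdownPanic("concurrent WAL activity during shutdown")
    return start_pos
```

With no concurrent activity the call simply returns; passing a callback
that appends to the WAL, as a walsender running BASE_BACKUP might,
raises ShutdownPanic.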

To fix this, divide the walsender shutdown into two phases.  First,
the checkpointer, itself triggered by the postmaster, sends a
PROCSIG_WALSND_INIT_STOPPING signal to all walsenders.  If the backend
is idle or running an SQL query, this causes the backend to shut down;
if logical replication is in progress, all existing WAL records are
processed, followed by a shutdown.  Otherwise this causes the walsender
to switch to the "stopping" state.  In this state, the walsender will
reject any further replication commands.  The checkpointer begins the
shutdown checkpoint once all walsenders are confirmed as
stopping.  When the shutdown checkpoint finishes, the postmaster sends
SIGUSR2 to the walsenders.  This instructs each walsender to send any
outstanding WAL, including the shutdown checkpoint record, wait for it
to be replicated to the standby, and then exit.
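
The two phases can be sketched as a small state machine.  This is an
illustrative Python model only, not the walsender.c implementation:
ToyWalSender, its "kind" labels and the shutdown() driver are all
hypothetical, and signals are modelled as plain method calls.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPING = "stopping"
    EXITED = "exited"


class ToyWalSender:
    def __init__(self, kind):
        # kind is a hypothetical label: "idle", "sql", "logical" or "physical"
        self.kind = kind
        self.state = State.RUNNING
        self.sent_upto = 0

    def init_stopping(self, wal_end):
        """Phase 1: reaction to PROCSIG_WALSND_INIT_STOPPING."""
        if self.kind in ("idle", "sql"):
            self.state = State.EXITED        # shut down right away
        elif self.kind == "logical":
            self.sent_upto = wal_end         # process all existing WAL first
            self.state = State.EXITED
        else:
            self.state = State.STOPPING      # keep running, refuse new commands

    def run_replication_command(self, cmd):
        if self.state is not State.RUNNING:
            raise RuntimeError("walsender is stopping; command rejected")
        return "ran " + cmd

    def final_flush(self, wal_end):
        """Phase 2: reaction to SIGUSR2 after the shutdown checkpoint."""
        if self.state is State.STOPPING:
            self.sent_upto = wal_end         # includes the checkpoint record
            self.state = State.EXITED


def shutdown(walsenders, wal_end):
    for ws in walsenders:
        ws.init_stopping(wal_end)
    # The checkpoint may only start once nobody can generate WAL any more.
    assert all(ws.state is not State.RUNNING for ws in walsenders)
    wal_end += 1                             # the shutdown checkpoint record
    for ws in walsenders:
        ws.final_flush(wal_end)
    return wal_end
```

In this model a "physical" walsender survives phase 1 in the stopping
state and therefore still ships the shutdown checkpoint record in
phase 2, while any replication command issued in between is rejected
instead of generating WAL.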

Author: Andres Freund, based on an earlier patch by Michael Paquier
Reported-By: Fujii Masao, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
Backpatch: 9.4, where logical decoding was introduced

Branch
------
REL9_5_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/50581f2e74fa1835e9981e064c8b15691c36cdd7

Modified Files
--------------
src/backend/access/transam/xlog.c           |  11 ++
src/backend/replication/walsender.c         | 190 ++++++++++++++++++++++++----
src/backend/storage/ipc/procsignal.c        |   4 +
src/include/replication/walsender.h         |   3 +
src/include/replication/walsender_private.h |   3 +-
src/include/storage/procsignal.h            |   1 +
6 files changed, 184 insertions(+), 28 deletions(-)

