pgsql: Fix postmaster's handling of a startup-process crash. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix postmaster's handling of a startup-process crash.
Msg-id E1ZDFX8-0007Pj-RW@gemulon.postgresql.org
List pgsql-committers
Fix postmaster's handling of a startup-process crash.

Ordinarily, a failure (unexpected exit status) of the startup subprocess
should be considered fatal, so the postmaster should just close up shop
and quit.  However, if we sent the startup process a SIGQUIT or SIGKILL
signal, the failure is hardly "unexpected", and we should attempt restart;
this is necessary for recovery from ordinary backend crashes in hot-standby
scenarios.  I attempted to implement the latter rule with a two-line patch
in commit 442231d7f71764b8c628044e7ce2225f9aa43b67, but it now emerges that
that patch was a few bricks shy of a load: it failed to distinguish the
case of a signaled startup process from the case where the new startup
process crashes before reaching database consistency.  That resulted in
infinitely respawning a new startup process only to have it crash again.

To handle this properly, we really must track whether we have sent the
*current* startup process a kill signal.  Rather than add yet another
ad-hoc boolean to the postmaster's state, I chose to unify this with the
existing RecoveryError flag into an enum tracking the startup process's
state.  That seems more consistent with the postmaster's general state
machine design.

Back-patch to 9.0, like the previous patch.

Branch
------
REL9_0_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/6718f07a0b31219487f7e8094d35959d328c9b56

Modified Files
--------------
src/backend/postmaster/postmaster.c |   44 ++++++++++++++++++++++++-----------
1 file changed, 31 insertions(+), 13 deletions(-)

