Thread: Unintended restart after recovery error

Unintended restart after recovery error

From: Antonin Houska
While looking at postmaster.c:reaper(), one problematic case occurred to me.


1. Startup process signals PMSIGNAL_RECOVERY_STARTED.

2. Checkpointer process is forked and immediately dies.

3. reaper() catches this failure, calls HandleChildCrash() and thus sets
FatalError to true.

4. Startup process exits with non-zero status code too - either due to SIGQUIT
received from HandleChildCrash or due to some other failure of the startup
process itself. However, FatalError is already set, because of the previous
crash of the checkpointer. Thus reaper() does not set RecoveryError.

5. As RecoveryError was not set to true, the postmaster will try to restart
the cluster, although it apparently should not (see the sketch below).
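
For reference, the relevant branch of reaper() currently looks roughly like
this (a paraphrased sketch based on postmaster.c, with surrounding code
omitted, not an exact quotation):

    /* reaper(): the dead child is the startup process */
    if (pid == StartupPID)
    {
        /* ... */
        if (!EXIT_STATUS_0(exitstatus))
        {
            /*
             * Step 4 above: FatalError is already true because of the
             * checkpointer's earlier crash, so RecoveryError stays false.
             */
            if (!FatalError)
                RecoveryError = true;

            /* Step 5: with RecoveryError unset, a restart follows later. */
            HandleChildCrash(pid, exitstatus, _("startup process"));
            continue;
        }
    }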


I could simulate the problem using the following changes (and by installing a
recovery.conf into the data directory):


diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 99f702c..0cbd1c1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6178,6 +6178,12 @@ StartupXLOG(void)
             SetForwardFsyncRequests();
             SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
             bgwriterLaunched = true;
+
+            /*
+             * Accidental delay, ensuring that checkpointer's crash is caught
+             * by PM first.
+             */
+            pg_usleep(5000000L);
         }
 
         /*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 6c814ba..4585119 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -194,6 +194,8 @@ CheckpointerMain(void)
     sigjmp_buf    local_sigjmp_buf;
     MemoryContext checkpointer_context;
 
+    ereport(FATAL, (errmsg("early failure")));
+
     CheckpointerShmem->checkpointer_pid = MyProcPid;
 
     /*


This works for me as a fix:

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6220a8e..0fb13bb 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2573,15 +2573,11 @@ reaper(SIGNAL_ARGS)
              * After PM_STARTUP, any unexpected exit (including FATAL exit) of
              * the startup process is catastrophic, so kill other children,
              * and set RecoveryError so we don't try to reinitialize after
-             * they're gone.  Exception: if FatalError is already set, that
-             * implies we previously sent the startup process a SIGQUIT, so
-             * that's probably the reason it died, and we do want to try to
-             * restart in that case.
+             * they're gone.
              */
             if (!EXIT_STATUS_0(exitstatus))
             {
-                if (!FatalError)
-                    RecoveryError = true;
+                RecoveryError = true;
                 HandleChildCrash(pid, exitstatus,
                                  _("startup process"));
                 continue;

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: Unintended restart after recovery error

From: Antonin Houska
Antonin Houska <ah@cybertec.at> wrote:

> While looking at postmaster.c:reaper(), one problematic case occurred to me.
>
>
> 1. Startup process signals PMSIGNAL_RECOVERY_STARTED.
>
> 2. Checkpointer process is forked and immediately dies.
>
> 3. reaper() catches this failure, calls HandleChildCrash() and thus sets
> FatalError to true.
>
> 4. Startup process exits with non-zero status code too - either due to SIGQUIT
> received from HandleChildCrash or due to some other failure of the startup
> process itself. However, FatalError is already set, because of the previous
> crash of the checkpointer. Thus reaper() does not set RecoveryError.
>
> 5. As RecoverError failed to be set to true, postmaster will try to restart
> the cluster, although it apparently should not.

A more common case occurred to me as soon as I had sent the previous mail: any
process of a standby cluster dies. The proposed fix would thus make the
restart_after_crash GUC completely ineffective for standbys. I'm not sure
that's desired. The question is whether RecoveryError should reflect problems
during any kind of recovery, or just during crash recovery.
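
To make the concern concrete, the decision is made by a check roughly like the
following (paraphrased from PostmasterStateMachine(); surrounding logic
omitted). Once RecoveryError is set, restart_after_crash is effectively never
consulted, so setting it on every startup-process failure would turn any child
crash on a standby into a postmaster shutdown:

    /* PostmasterStateMachine(): all children are gone after a crash */
    if (pmState == PM_NO_CHILDREN &&
        (RecoveryError || !restart_after_crash))
    {
        /* give up instead of reinitializing and restarting the children */
        ExitPostmaster(1);
    }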

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: Unintended restart after recovery error

From: Fujii Masao
On Wed, Nov 12, 2014 at 6:52 PM, Antonin Houska <ah@cybertec.at> wrote:
> While looking at postmaster.c:reaper(), one problematic case occurred to me.
>
>
> 1. Startup process signals PMSIGNAL_RECOVERY_STARTED.
>
> 2. Checkpointer process is forked and immediately dies.
>
> 3. reaper() catches this failure, calls HandleChildCrash() and thus sets
> FatalError to true.
>
> 4. Startup process exits with non-zero status code too - either due to SIGQUIT
> received from HandleChildCrash or due to some other failure of the startup
> process itself. However, FatalError is already set, because of the previous
> crash of the checkpointer. Thus reaper() does not set RecoveryError.
>
> 5. As RecoverError failed to be set to true, postmaster will try to restart
> the cluster, although it apparently should not.

Why shouldn't postmaster restart the cluster in that case?

Regards,

-- 
Fujii Masao



Re: Unintended restart after recovery error

From: Antonin Houska
Fujii Masao <masao.fujii@gmail.com> wrote:

> On Wed, Nov 12, 2014 at 6:52 PM, Antonin Houska <ah@cybertec.at> wrote:
> > While looking at postmaster.c:reaper(), one problematic case occurred to me.
> >
> >
> > 1. Startup process signals PMSIGNAL_RECOVERY_STARTED.
> >
> > 2. Checkpointer process is forked and immediately dies.
> >
> > 3. reaper() catches this failure, calls HandleChildCrash() and thus sets
> > FatalError to true.
> >
> > 4. Startup process exits with non-zero status code too - either due to SIGQUIT
> > received from HandleChildCrash or due to some other failure of the startup
> > process itself. However, FatalError is already set, because of the previous
> > crash of the checkpointer. Thus reaper() does not set RecoveryError.
> >
> > 5. As RecoverError failed to be set to true, postmaster will try to restart
> > the cluster, although it apparently should not.
>
> Why shouldn't postmaster restart the cluster in that case?
>

At least so that the behavior is consistent with the simpler cases of failed
recovery (e.g. any FATAL error in StartupXLOG), which end up not restarting
the cluster.
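
One such simpler case, roughly as it appears in reaper() (paraphrased; the
exact wording of the log messages may differ): if the startup process fails
while pmState is still PM_STARTUP, the postmaster gives up immediately rather
than attempting a restart:

    /* reaper(): the startup process died before recovery progressed far
     * enough to launch other children */
    if (pmState == PM_STARTUP && !EXIT_STATUS_0(exitstatus))
    {
        LogChildExit(LOG, _("startup process"), pid, exitstatus);
        ereport(LOG,
                (errmsg("aborting startup due to startup process failure")));
        ExitPostmaster(1);
    }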

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: Unintended restart after recovery error

From: Robert Haas
On Wed, Nov 12, 2014 at 4:52 PM, Antonin Houska <ah@cybertec.at> wrote:
> Fujii Masao <masao.fujii@gmail.com> wrote:
>
>> On Wed, Nov 12, 2014 at 6:52 PM, Antonin Houska <ah@cybertec.at> wrote:
>> > While looking at postmaster.c:reaper(), one problematic case occurred to me.
>> >
>> >
>> > 1. Startup process signals PMSIGNAL_RECOVERY_STARTED.
>> >
>> > 2. Checkpointer process is forked and immediately dies.
>> >
>> > 3. reaper() catches this failure, calls HandleChildCrash() and thus sets
>> > FatalError to true.
>> >
>> > 4. Startup process exits with non-zero status code too - either due to SIGQUIT
>> > received from HandleChildCrash or due to some other failure of the startup
>> > process itself. However, FatalError is already set, because of the previous
>> > crash of the checkpointer. Thus reaper() does not set RecoveryError.
>> >
>> > 5. As RecoverError failed to be set to true, postmaster will try to restart
>> > the cluster, although it apparently should not.
>>
>> Why shouldn't postmaster restart the cluster in that case?
>>
>
> At least for the behavior to be consistent with simpler cases of failed
> recovery (e.g. any FATAL error in StartupXLOG), which end up not restarting
> the cluster.

It's true that if the startup process dies we don't try to restart,
but it's also true that if the checkpointer dies we do try to restart.
I'm not sure why this specific situation should be an exception to
that general rule.
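
For reference, the checkpointer branch of reaper() looks roughly like this
(paraphrased): any unexpected exit goes through HandleChildCrash(), which sets
FatalError but not RecoveryError, so the normal crash-and-restart cycle
follows:

    /* reaper(): the dead child is the checkpointer */
    if (pid == CheckpointerPID)
    {
        CheckpointerPID = 0;
        if (EXIT_STATUS_0(exitstatus) && pmState == PM_SHUTDOWN)
        {
            /* normal exit during shutdown: carry on with the shutdown */
        }
        else
        {
            /* any unexpected exit is treated as a crash */
            HandleChildCrash(pid, exitstatus, _("checkpointer process"));
        }
        continue;
    }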

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Unintended restart after recovery error

From: Fujii Masao
On Thu, Nov 13, 2014 at 8:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Nov 12, 2014 at 4:52 PM, Antonin Houska <ah@cybertec.at> wrote:
>> Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>>> On Wed, Nov 12, 2014 at 6:52 PM, Antonin Houska <ah@cybertec.at> wrote:
>>> > While looking at postmaster.c:reaper(), one problematic case occurred to me.
>>> >
>>> >
>>> > 1. Startup process signals PMSIGNAL_RECOVERY_STARTED.
>>> >
>>> > 2. Checkpointer process is forked and immediately dies.
>>> >
>>> > 3. reaper() catches this failure, calls HandleChildCrash() and thus sets
>>> > FatalError to true.
>>> >
>>> > 4. Startup process exits with non-zero status code too - either due to SIGQUIT
>>> > received from HandleChildCrash or due to some other failure of the startup
>>> > process itself. However, FatalError is already set, because of the previous
>>> > crash of the checkpointer. Thus reaper() does not set RecoveryError.
>>> >
>>> > 5. As RecoverError failed to be set to true, postmaster will try to restart
>>> > the cluster, although it apparently should not.
>>>
>>> Why shouldn't postmaster restart the cluster in that case?
>>>
>>
>> At least for the behavior to be consistent with simpler cases of failed
>> recovery (e.g. any FATAL error in StartupXLOG), which end up not restarting
>> the cluster.
>
> It's true that if the startup process dies we don't try to restart,
> but it's also true that if the checkpointer dies we do try to restart.
> I'm not sure why this specific situation should be an exception to
> that general rule.

Commit 442231d7f71764b8c628044e7ce2225f9aa43b6 introduced the latter rule for
the hot-standby case. Maybe *during crash recovery* (i.e., when hot standby
should not be enabled) it's better to treat a crash of the startup process as
a catastrophic crash.

Regards,

-- 
Fujii Masao



Re: Unintended restart after recovery error

From: Antonin Houska
Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 8:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> It's true that if the startup process dies we don't try to restart,
>> but it's also true that if the checkpointer dies we do try to restart.
>> I'm not sure why this specific situation should be an exception to
>> that general rule.

My distinction was "during recovery" vs. "outside recovery", rather than
"startup process" vs. "checkpointer". But I'm not sure it's easy to recognize
when the checkpointer (and maybe also the bgwriter) no longer participates in
recovery.

> 442231d7f71764b8c628044e7ce2225f9aa43b6 introduced the latter rule
> for hot-standby case.

I didn't fully understand the purpose of this condition until I saw the commit
message. Thanks for pointing it out.

> Maybe *during crash recovery* (i.e., hot standby should not be enabled) it's
> better to treat the crash of startup process as a catastrophic crash.

Yes, that's what I thought too. But I think the current StartupXLOG() does not
always (need to) determine the exact boundary between crash recovery and
archive recovery. I'd need to think more about whether the end of crash
recovery can be safely identified during replay and signaled to the
postmaster.
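
Purely as a sketch of the idea, and assuming that boundary could be determined
reliably (neither the condition nor the signal value below exists today;
PMSIGNAL_CRASH_RECOVERY_DONE is a hypothetical name):

    /* hypothetical: in StartupXLOG(), at the point where the crash-recovery
     * portion of replay is known to be complete */
    if (crashRecoveryComplete)                  /* hypothetical condition */
        SendPostmasterSignal(PMSIGNAL_CRASH_RECOVERY_DONE);

The postmaster would then remember this in a flag that reaper() consults, so
that a startup-process failure is treated as catastrophic only while crash
recovery is still in progress.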

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: Unintended restart after recovery error

From: Robert Haas
On Thu, Nov 13, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> 442231d7f71764b8c628044e7ce2225f9aa43b6 introduced the latter rule
> for hot-standby case. Maybe *during crash recovery* (i.e., hot standby
> should not be enabled) it's better to treat the crash of startup process as
> a catastrophic crash.

Maybe, but why, specifically?  If the startup process failed
internally, it's probably because it hit an error during the replay of
some WAL record.  So if we restart it, it will back up to the previous
checkpoint or restartpoint, replay the same WAL records as before, and
die again in the same spot.  We don't want it to sit there and do that
forever in an infinite loop, so it makes sense to kill the whole
server.

But if the startup process was killed off because the checkpointer
croaked, that logic doesn't necessarily apply.  There's no reason to
assume that the replay of a particular WAL record was what killed the
checkpointer; in fact, it seems like the odds are against it.  So it
seems right to fall back to our general principle of restarting the
server and hoping that's enough to get things back on line.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company