Re: standalone backend PANICs during recovery - Mailing list pgsql-hackers

From Tom Lane
Subject Re: standalone backend PANICs during recovery
Date
Msg-id 2086.1471711308@sss.pgh.pa.us
In response to Re: standalone backend PANICs during recovery  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: standalone backend PANICs during recovery  (Michael Paquier <michael.paquier@gmail.com>)
Re: standalone backend PANICs during recovery  (Bernd Helmle <mailings@oopsware.de>)
List pgsql-hackers
I wrote:
> In short, I don't think control should have been here at all.  My proposal
> for a fix is to force EnableHotStandby to remain false in a standalone
> backend.

I tried to reproduce Bernd's problem by starting a standalone backend in
a data directory that was configured as a hot standby slave, and soon
found that there are much bigger issues than this.  The startup sequence
soon tries to wait for WAL to arrive, which in HEAD uses
    WaitLatch(&XLogCtl->recoveryWakeupLatch,
              WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
              5000L);

which immediately elog(FATAL)s because a standalone backend has no parent
postmaster and so postmaster_alive_fds[] isn't set.  But if it didn't do
that, it'd wait forever because of course there is no active WAL receiver
process that would ever provide more WAL.

The only way that you'd ever get to a command prompt is if somebody made a
promotion trigger file, which would cause the startup code to promote the
cluster into master status, which does not really seem like something that
would be a good idea in Bernd's proposed use case of "investigating a
problem".
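
To make the cases above concrete, here is a minimal sketch of the possible
outcomes for a standalone backend started in a standby data directory.  It
is only a model of the reasoning in this message, not code from xlog.c: the
enum, the function, and both parameters are hypothetical names, and the
order of the checks is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical labels for this sketch; none of these names exist in PostgreSQL. */
typedef enum
{
    OUTCOME_FATAL,          /* wait asks for WL_POSTMASTER_DEATH, but there is no postmaster */
    OUTCOME_PROMOTED,       /* trigger file seen: prompt reached, cluster is now a master */
    OUTCOME_WAITS_FOREVER   /* no WAL receiver exists to supply more WAL */
} StandaloneStandbyOutcome;

/*
 * Model of what a standalone backend does once recovery has to wait for WAL.
 *
 * wait_checks_postmaster_death: true models the WaitLatch() call quoted
 *     above; false models the "if it didn't do that" case.
 * trigger_file_present: somebody created a promotion trigger file.
 */
StandaloneStandbyOutcome
standalone_standby_outcome(bool wait_checks_postmaster_death,
                           bool trigger_file_present)
{
    /* Assumed to come first: the wait fails outright with no postmaster. */
    if (wait_checks_postmaster_death)
        return OUTCOME_FATAL;

    /* The only path to a command prompt, at the cost of promoting. */
    if (trigger_file_present)
        return OUTCOME_PROMOTED;

    /* Otherwise nothing will ever wake the startup code with new WAL. */
    return OUTCOME_WAITS_FOREVER;
}
```

None of the three outcomes gives a prompt on an unpromoted standby, which is
the point of the paragraphs above.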

Alternatively, if we were to force standby_mode off in a standalone
backend, it would come to the command prompt right away but again it would
have effectively promoted the cluster to master.  That is certainly not
something we should ever do automatically.

So at this point I'm pretty baffled as to what the actual use-case is
here.  I am tempted to say that a standalone backend should refuse to
start at all if a recovery.conf file is present.  If we do want to
allow the case, we need some careful thought about what it should do.
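
A rough sketch of the "refuse to start" idea, purely as an illustration: the
function names and parameters below are hypothetical, and where such a check
would actually live in the startup code is left open.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if the file at path can be opened for reading. */
bool
sketch_file_exists(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return false;
    fclose(fp);
    return true;
}

/*
 * Hypothetical policy check: a backend running under a postmaster is
 * unaffected; a standalone backend is refused whenever a recovery.conf
 * file is present in the data directory.
 */
bool
standalone_may_start(bool under_postmaster, bool recovery_conf_present)
{
    if (under_postmaster)
        return true;
    return !recovery_conf_present;
}
```

A caller would combine the two along the lines of
standalone_may_start(false, sketch_file_exists("recovery.conf")), assuming
the current directory is the data directory, and report an error instead of
starting recovery when the result is false.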
        regards, tom lane


