>>> Maybe what we should be looking for is "why doesn't the walreceiver
>>> shut down"? But the dragonet log you quote above shows the walreceiver
>>> exiting, or at least starting to exit. Tis a puzzlement.
huh ... take a look at this little stanza in PostmasterStateMachine:
if (pmState == PM_SHUTDOWN_2)
{
/*
* PM_SHUTDOWN_2 state ends when there's no other children than
* dead_end children left. There shouldn't be any regular backends
* left by now anyway; what we're really waiting for is walsenders and
* archiver.
*
* Walreceiver should normally be dead by now, but not when a fast
* shutdown is performed during recovery.
*/
if (PgArchPID == 0 && CountChildren(BACKEND_TYPE_ALL) == 0 &&
WalReceiverPID == 0)
{
pmState = PM_WAIT_DEAD_END;
}
}
I'm too tired to think through exactly what that last comment might be
suggesting, but it sure seems like it might be relevant to our problem.
If the walreceiver *isn't* dead yet, what's going to ensure that we
can move forward later?
regards, tom lane