
Re: Race conditions with checkpointer and shutdown - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Race conditions with checkpointer and shutdown
Date	April 19, 2019 04:02:48
Msg-id	7164.1555646568@sss.pgh.pa.us
In response to	Re: Race conditions with checkpointer and shutdown (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Race conditions with checkpointer and shutdown
List	pgsql-hackers


>>> Maybe what we should be looking for is "why doesn't the walreceiver
>>> shut down"?  But the dragonet log you quote above shows the walreceiver
>>> exiting, or at least starting to exit.  Tis a puzzlement.

Huh ... take a look at this little stanza in PostmasterStateMachine:

    if (pmState == PM_SHUTDOWN_2)
    {
        /*
         * PM_SHUTDOWN_2 state ends when there's no other children than
         * dead_end children left. There shouldn't be any regular backends
         * left by now anyway; what we're really waiting for is walsenders and
         * archiver.
         *
         * Walreceiver should normally be dead by now, but not when a fast
         * shutdown is performed during recovery.
         */
        if (PgArchPID == 0 && CountChildren(BACKEND_TYPE_ALL) == 0 &&
            WalReceiverPID == 0)
        {
            pmState = PM_WAIT_DEAD_END;
        }
    }

I'm too tired to think through exactly what that last comment might be
suggesting, but it sure seems like it might be relevant to our problem.
If the walreceiver *isn't* dead yet, what's going to ensure that we
can move forward later?
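
To make the worry concrete, here is a toy model of the check quoted above. It is only a sketch under stated assumptions: ToyPostmaster, toy_state_machine, toy_reap_walreceiver and the rerun flag are made-up names for illustration, not postmaster code, and it says nothing about what the real reaper does. It just shows that if the walreceiver's exit is recorded without the state machine being run again, the state stays at PM_SHUTDOWN_2.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for postmaster state; only the two states that
 * matter to the quoted stanza are modeled. */
typedef enum { PM_SHUTDOWN_2, PM_WAIT_DEAD_END } PMState;

typedef struct
{
    PMState pmState;
    int     PgArchPID;        /* 0 means "no archiver running" */
    int     WalReceiverPID;   /* 0 means "no walreceiver running" */
    int     liveChildren;     /* stands in for CountChildren(BACKEND_TYPE_ALL) */
} ToyPostmaster;

/* Mirrors the condition in the quoted PM_SHUTDOWN_2 stanza. */
static void
toy_state_machine(ToyPostmaster *pm)
{
    if (pm->pmState == PM_SHUTDOWN_2)
    {
        if (pm->PgArchPID == 0 && pm->liveChildren == 0 &&
            pm->WalReceiverPID == 0)
            pm->pmState = PM_WAIT_DEAD_END;
    }
}

/* Hypothetical reaper: records that the walreceiver has exited and,
 * only if asked to, re-evaluates the state machine afterwards. */
static void
toy_reap_walreceiver(ToyPostmaster *pm, bool rerun)
{
    pm->WalReceiverPID = 0;
    if (rerun)
        toy_state_machine(pm);
}
```

Starting in PM_SHUTDOWN_2 with a nonzero WalReceiverPID, calling toy_state_machine leaves the state alone. Reaping with rerun = false clears the PID but the state still does not advance; only a later state machine pass (rerun = true, or another call to toy_state_machine) reaches PM_WAIT_DEAD_END.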

            regards, tom lane
