Re: Reorder shutdown sequence, to flush pgstats later - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Reorder shutdown sequence, to flush pgstats later
Date
Msg-id d2cd8fd3-396a-4390-8f0b-74be65e72899@iki.fi
Whole thread Raw
Responses Re: Reorder shutdown sequence, to flush pgstats later
List pgsql-hackers
On 08/01/2025 04:10, Andres Freund wrote:
> I instead opted to somewhat rearrange the shutdown sequence:
> 
>      This commit changes the shutdown sequence so checkpointer is signalled to
>      trigger writing the shutdown checkpoint without terminating it. Once
>      checkpointer wrote the checkpoint it will wait for a termination
>      signal.
> 
>      Postmaster now triggers the shutdown checkpoint where we previously did so by
>      terminating checkpointer. Checkpointer is terminated after all other children
>      have been terminated, tracked using the new PM_SHUTDOWN_CHECKPOINTER PMState.
> 
>      In addition, the existing PM_SHUTDOWN_2 state is renamed to
>      PM_XLOG_IS_SHUTDOWN, that seems a lot more descriptive.

Sounds good.

> This patch is not fully polished, there are a few details I'm not sure about:
> 
> - Previously the shutdown checkpoint and process exit were triggered from
>    HandleCheckpointerInterrupts().  I could have continued doing so in the
>    patch, but it seemed quite weird to have a wait loop in a function called
>    HandleCheckpointerInterrupts().
> 
>    Instead I made the main loop in CheckpointerMain() terminate if checkpointer
>    was asked to write the shutdown checkpoint or exit
> 
> - In process_pm_pmsignal() the code now triggers a PostmasterStateMachine() if
>    checkpointer signalled ShutdownXLOG having completed. We didn't really have
>    a need for that before. process_pm_child_exit() does call
>    PostmasterStateMachine(), but somehow calling it in process_pm_pmsignal()
>    feels slightly off

Looks good to me.

> - I don't love the naming of the various PMState values (existing and new),
>    but a larger renaming should probably be done separately?

We currently have PM_WAIT_BACKENDS and PM_WAIT_DEAD_END. Maybe follow 
that example and name all the shutdown states after what you're waiting for:

     PM_WAIT_BACKENDS, /* waiting for live backends to exit */
     PM_WAIT_XLOG_SHUTDOWN, /* waiting for checkpointer to do shutdown 
ckpt */
     PM_WAIT_XLOG_ARCHIVAL, /* waiting for archiver and walsenders to
                               finish */
     PM_WAIT_DEAD_END,      /* waiting for dead-end children to exit */
     PM_WAIT_CHECKPOINTER,  /* waiting for checkpointer to shut down */
     PM_NO_CHILDREN,        /* all important children have exited */

> - There's logging in ShutdownXLOG() that's only triggered if called from
>    checkpointer. Seems kinda odd - shouldn't we just move that to
>    checkpointer.c?

What logging do you mean? The "LOG: shutting down" message? No 
objections to moving it, although it doesn't bother me either way.

> The first two patches are just logging improvements that I found helpful to
> write this patch:
> 
> 0001: Instead of modifying pmState directly, do so via a new function
>        UpdatePMState(), which logs the state transition at DEBUG2.
> 
>        Requires a helper to stringify PMState.
> 
>        I've written one-off versions of this patch quite a few times. Without
>        knowing in what state postmaster is in, it's quite hard to debug the
>        state machine.

+1

>     elog(DEBUG1, "updating PMState from %d/%s to %d/%s",
>          pmState, pmstate_name(pmState), newState, pmstate_name(newState));

I think the state names are enough, no need to print the numerical values.

> 0002: Make logging of postmaster signalling child processes more consistent
> 
>        I found it somewhat hard to understand what's happening during state
>        changes without being able to see the signals being sent. While we did
>        have logging in SignalChildren(), we didn't in signal_child(), and most
>        signals that are important for the shutdown sequence are sent via
>        signal_child().

+1

I don't think the cases where DEBUG2 or DEBUG4 are used currently are 
very well thought out. To save a few lines of code, I'd be happy with 
merging signal_child_ext() and signal_child() and just always use DEBUG2 
or DEBUG4. Or DEBUG3 as a compromise :-).

> The next is a small prerequisite:
> 
> 0003: postmaster: Introduce variadic btmask_all_except()
> 
>        I proposed this in another thread, where I also needed
>        btmask_all_except3() but then removed the only call to
>        btmask_all_except2(). Instead of adding/removing code, it seems better
>        to just use a variadic function.

Nice. A variadic btmask_add() might be handy too.

> And then 0004, the reason for this thread.

Overall, looks good to me.

-- 
Heikki Linnakangas
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Jakub Wartak
Date:
Subject: Re: doc: Mention clock synchronization recommendation for hot_standby_feedback
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: ecpg command does not warn COPY ... FROM STDIN;