On Thu, Sep 17, 2020 at 10:19 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 17/09/2020 12:48, Thomas Munro wrote:
> > So I think we should do
> > something like what Heikki originally proposed to lower the frequency
> > of checks, on systems where we don't have the ability to skip the
> > check completely. Please see attached.
>
> If you put the counter in HandleStartupProcInterrupts(), it could be a
> long wait if the startup process is e.g. waiting for WAL to arrive in
> the loop in WaitForWALToBecomeAvailable(), or in recoveryPausesHere().
> My original patch only reduced the frequency in the WAL redo loop, when
> you're actively replaying records.

Oh, I checked that WaitForWALToBecomeAvailable() already handled
postmaster death via events rather than polling, with
WL_EXIT_ON_PM_DEATH, but I hadn't clocked that recoveryPausesHere()
uses pg_usleep() and polling. Hmm. Perhaps we should change that
instead? The reason I put the counter inside
HandleStartupProcInterrupts() is that I didn't want to make the new
ProcSignalBarrierPending handler less reactive.
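
To illustrate the "events rather than polling" idea for a pause loop, here
is a minimal standalone sketch. It is not PostgreSQL code and not the
attached patch: the helper name sleep_unless_parent_died() and the plain
pipe used as a death signal are assumptions made for illustration. It uses
POSIX poll() so that the sleep ends early when the other end of the pipe
goes away, instead of sleeping for a fixed interval and checking afterwards.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): sleep for up to timeout_ms,
 * but wake up early and return true if the write end of the given pipe
 * has been closed, which is how a child can notice that its parent died.
 * Returns false on timeout, or if poll() was interrupted or failed.
 */
static bool
sleep_unless_parent_died(int death_pipe_read_fd, int timeout_ms)
{
    struct pollfd pfd;
    int         rc;

    assert(death_pipe_read_fd >= 0);

    pfd.fd = death_pipe_read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return false;           /* timed out, interrupted, or error */

    /* EOF shows up as POLLHUP or as readable-with-zero-bytes (POLLIN). */
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) != 0;
}
```

A pause loop built on something like this would call the helper where it
currently sleeps, and exit when it returns true. Any interrupt handling
would still run on every iteration.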

> We could probably do better on Windows. Maybe the signal handler thread
> could wait on the PostmasterHandle at the same time that it waits on the
> signal pipe, and set postmaster_possibly_dead. But I'm not going to work
> on that, and it would only help on Windows, so I'm OK with just adding
> the counter.

Yeah, I had the same thought.
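
For anyone following along, the counter idea can be sketched roughly as
below. This is a standalone illustration under stated assumptions, not the
attached patch: POLL_INTERVAL, probe_postmaster_alive() and the other names
are hypothetical, and the "postmaster" is a plain flag. The point it shows
is the trade-off discussed above: the expensive check runs only on every
Nth call, so a death can go unnoticed for up to N - 1 further records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rate limit: do the expensive check on every 1024th call. */
#define POLL_INTERVAL 1024

static uint32_t expensive_checks = 0;       /* how often we really probed */
static bool fake_postmaster_alive = true;   /* stand-in for the real parent */

/* Stand-in for a costly liveness probe (e.g. a system call). */
static bool
probe_postmaster_alive(void)
{
    expensive_checks++;
    return fake_postmaster_alive;
}

/*
 * Called once per replayed record.  Returns false only when a probe was
 * actually made and found the parent dead; otherwise assumes it is alive.
 */
static bool
redo_loop_interrupt_check(void)
{
    static uint32_t calls = 0;

    assert(POLL_INTERVAL > 0);

    if (calls++ % POLL_INTERVAL != 0)
        return true;            /* skip the expensive probe this time */
    return probe_postmaster_alive();
}

/* Replays up to nrecords records; returns how many were replayed. */
static uint32_t
replay_records(uint32_t nrecords)
{
    uint32_t    replayed = 0;

    for (uint32_t i = 0; i < nrecords; i++)
    {
        if (!redo_loop_interrupt_check())
            break;
        replayed++;
    }
    return replayed;
}
```

The counter only helps while the loop is spinning quickly. If the caller
blocks or sleeps between calls, the same counter multiplies that wait,
which is the concern raised above for recoveryPausesHere().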