Re: Interrupts vs signals - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Interrupts vs signals
Date
Msg-id CA+hUKGJvt8wg_KfjFLvr27LT6sYK_szngkchig6Dg_CnNGPskw@mail.gmail.com
Whole thread Raw
In response to Re: Interrupts vs signals  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Interrupts vs signals
Re: Interrupts vs signals
List pgsql-hackers
On Mon, Jul 8, 2024 at 9:38 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The patch actually does both: it still does kill(SIGUSR1) and also sets
> the latch.

Yeah, I had some ideas about supporting old extension code that really
wanted a SIGUSR1, but on reflection, the only reason anyone ever wants
that is so that sigusr1_handler can SetLatch(), which pairs with
WaitLatch() in WaitForBackgroundWorker*().  Let's go all the way and
assume that.

> I think it would be nice if RegisterDynamicBackgroundWorker() had a
> "bool notify_me" argument, instead of requiring the caller to set
> "bgw_notify_pid = MyProcPid" before the call. That's a
> backwards-compatibility break, but maybe we should bite the bullet and
> do it. Or we could do this in RegisterDynamicBackgroundWorker():
>
> if (worker->bgw_notify_pid == MyProcPid)
>      worker->bgw_notify_pgprocno = MyProcNumber;
>
> I think we can forbid setting pgw_notify_pid to anything else than 0 or
> MyProcPid.

Another idea: we could keep the bgw_notify_pid field around for a
while, documented as unused and due to be removed in future.  We could
automatically capture the caller's proc number.  So you get latch
wakeups by default, which I expect many people want, and most people
could cope with even if they don't want them.  If you really really
don't want them, you could set a new flag BGW_NO_NOTIFY.

I have now done this part of the change in a new first patch.  This
particular use of kill(SIGUSR1) is separate from the ProcSignal
removal, it's just that it relies on ProcSignal's handler's default
action of calling SetLatch().  It's needed so the ProcSignal-ectomy
can fully delete sigusr1_handler(), but it's not directly the same
thing, so it seemed good to split the patch.

> A SetLatchRobust would be nice. Looking at SetLatch(), I don't think it
> can do any damage if you called it on a pointer to garbage, except if
> the pointer itself is bogus, then just dereferencing it an cause a
> segfault. So it would be nice to have a version specifically designed
> with that in mind. For example, it could assume that the Latch's pid is
> never legally equal to MyProcPid, because postmaster cannot own any latches.

Yeah I'm starting to think that all we need to do here is range-check
the proc number.  Here's a version that adds:
ProcSetLatch(proc_number).  Another idea would be for SetLatch(latch)
to sanitise the address of a latch, ie its offset and range.

What the user really wants to be able to do with this bgworker API, I
think, is wait for a given a handle, which could find a condition
variable + generation in the slot, so that we don't have to register
any proc numbers anywhere until we're actually waiting.  But *clearly*
the postmaster can't use the condition variable API without risking
following corrupted pointers in shared memory.  Whereas AFAICT
ProcSetLatch() from the patched postmaster can't really be corrupted
in any new way that it couldn't already be corrupted in master (where
it runs in the target process), if we're just a bit paranoid about how
we find our way to the latch.

Receiving latch wakeups in the postmaster might be another question,
but I don't think we need to confront that question just yet.

> > -volatile uint32 InterruptHoldoffCount = 0;
> > -volatile uint32 QueryCancelHoldoffCount = 0;
> > -volatile uint32 CritSectionCount = 0;
> > +uint32 InterruptHoldoffCount = 0;
> > +uint32 QueryCancelHoldoffCount = 0;
> > +uint32 CritSectionCount = 0;
>
> I wondered if these are used in PG_TRY-CATCH blocks in a way that would
> still require volatile. I couldn't find any such usage by some quick
> grepping, so I think we're good, but I thought I'd mention it.

Hmm.  Still thinking about this.

> > +/*
> > + * The set of "standard" interrupts that CHECK_FOR_INTERRUPTS() and
> > + * ProcessInterrupts() handle.  These perform work that is safe to run whenever
> > + * interrupts are not "held".  Other kinds of interrupts are only handled at
> > + * more restricted times.
> > + */
> > +#define INTERRUPT_STANDARD_MASK                                                         \
>
> Some interrupts are missing from this mask:
>
> - INTERRUPT_PARALLEL_APPLY_MESSAGE

Oops, that one ^ is a rebasing mistake.  I wrote the ancestor of this
patch in 2021, and that new procsignal arrived in 2023, and I put the
code in to handle it, but I forgot to add it to the mask, which gives
me an idea (see below)...

> - INTERRUPT_IDLE_STATS_UPDATE_TIMEOUT
> - INTERRUPT_SINVAL_CATCHUP
> - INTERRUPT_NOTIFY
>
> Is that on purpose?

INTERRUPT_SINVAL_CATCHUP and INTERRUPT_NOTIFY are indeed handled
differently on purpose.  In master, they don't set InterruptPending,
and they are not handled by regular CHECK_FOR_INTERRUPTS() sites, but
in the patch they still need a bit in pending_interrupts, and that is
what that mask hides from CHECK_FOR_INTERRUPTS().  They are checked
explicitly in ProcessClientReadInterrupt().  I think the idea is that
we can't handle sinval at random places because that might create
dangling pointers to cached objects where we don't expect them, and we
can't emit NOTIFY-related protocol messages at random times either.

There is something a little funky about _IDLE_STATS_UPDATE_TIMEOUT,
though.  It has a different scheme for running only when idle, where
if it opts not to do anything, it doesn't consume the interrupt (a
later CFI() will, but the latch is not set so we might sleep).  I was
confused by that.  I think I have changed to be more faithful to
master's behaviour now.

Hmm, a better terminology for the interupts that CFI handles would be
s/standard/regular/, so I have changed that.

New idea: it would be less error-prone if we instead had a mask of
these special cases, of which there are now only two.  Tried that way!

> > -/*
> > - * Because backends sitting idle will not be reading sinval events, we
> > - * need a way to give an idle backend a swift kick in the rear and make
> > - * it catch up before the sinval queue overflows and forces it to go
> > - * through a cache reset exercise.  This is done by sending
> > - * PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far behind.
> > - *
> > - * The signal handler will set an interrupt pending flag and will set the
> > - * processes latch. Whenever starting to read from the client, or when
> > - * interrupted while doing so, ProcessClientReadInterrupt() will call
> > - * ProcessCatchupEvent().
> > - */
> > -volatile sig_atomic_t catchupInterruptPending = false;
>
> Would be nice to move that comment somewhere else rather than remove it
> completely.

OK, I moved it to the top of ProcessCatchupInterrupt().

> > --- a/src/backend/storage/lmgr/proc.c
> > +++ b/src/backend/storage/lmgr/proc.c
> > @@ -444,6 +444,10 @@ InitProcess(void)
> >       OwnLatch(&MyProc->procLatch);
> >       SwitchToSharedLatch();
> >
> > +     /*We're now ready to accept interrupts from other processes. */
> > +     pg_atomic_init_u32(&MyProc->pending_interrupts, 0);
> > +     SwitchToSharedInterrupts();
> > +
> >       /* now that we have a proc, report wait events to shared memory */
> >       pgstat_set_wait_event_storage(&MyProc->wait_event_info);
> >
> > @@ -611,6 +615,9 @@ InitAuxiliaryProcess(void)
> >       OwnLatch(&MyProc->procLatch);
> >       SwitchToSharedLatch();
> >
> > +     /* We're now ready to accept interrupts from other processes. */
> > +     SwitchToSharedInterrupts();
> > +
> >       /* now that we have a proc, report wait events to shared memory */
> >       pgstat_set_wait_event_storage(&MyProc->wait_event_info);
> >
>
> Is there a reason for the different initialization between a regular
> backend and aux process?

No.  But I thought about something else to fix here.  Really we don't
want to switch to shared interrupts until we are ready for CFI() to do
stuff.  I think that should probably be at the places where master
unblocks signals.  Otherwise, for example, if someone sends you an
interrupt while you're starting up, something as innocent as
elog(DEBUG, ...), which reaches CFI(), might try to do things for
which the infrastructure is not yet fully set up, for example
INTERRUPT_BARRIER.

Not done yet, but wanted to share this new version.

> > +/*
> > + * Switch to shared memory interrupts.  Other backends can send interrupts
> > + * to this one if they know its ProcNumber.
> > + */
> > +void
> > +SwitchToSharedInterrupts(void)
> > +{
> > +     pg_atomic_fetch_or_u32(&MyProc->pending_interrupts, pg_atomic_read_u32(MyPendingInterrupts));
> > +     MyPendingInterrupts = &MyProc->pending_interrupts;
> > +}
>
> Hmm, I think there's a race condition here (and similarly in
> SwitchToLocalInterrupts), if the signal handler runs between the
> pg_atomic_fetch_or_u32, and changing MyPendingInterrupts. Maybe
> something like this instead:
>
> MyPendingInterrupts = &MyProc->pending_interrupts;
> pg_memory_barrier();
> pg_atomic_fetch_or_u32(&MyProc->pending_interrupts,
> pg_atomic_read_u32(LocalPendingInterrupts));

Yeah, right, done.

> And perhaps this should also clear LocalPendingInterrupts, just to be tidy.

I used atomic_exchange() to read and clear the bits in one go.

> > @@ -138,7 +139,8 @@
> >  typedef struct ProcState
> >  {
> >       /* procPid is zero in an inactive ProcState array entry. */
> > -     pid_t           procPid;                /* PID of backend, for signaling */
> > +     pid_t           procPid;                /* pid of backend */
> > +     ProcNumber      pgprocno;               /* for sending interrupts */
> >       /* nextMsgNum is meaningless if procPid == 0 or resetState is true. */
> >       int                     nextMsgNum;             /* next message number to read */
> >       bool            resetState;             /* backend needs to reset its state */
>
> We can easily remove procPid altogether now that we have pgprocno here.

Since other things access those values, I propose to remove them in
separate patches.

> Similarly with the pid/pgprocno in ReplicationSlot and WalSndState.

Same.  Those pids show up in user interfaces, so I think they should
be handled in separate patches.

Note to self: I need to change some pgprocno to proc_number...

The next problems to remove are, I think, the various SIGUSR2, SIGINT,
SIGTERM signals sent by the postmaster.  These should clearly become
SendInterrupt() or ProcSetLatch().  The problem here is that the
postmaster doesn't have the proc numbers yet.  One idea is to teach
the postmaster to assign them!  Not explored yet.

This version is passing on Windows.  I'll create a CF entry.  Still
work in progress!

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Is creating logical replication slots in template databases useful at all?
Next
From: Antonin Houska
Date:
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?