Re: Interrupts vs signals - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Interrupts vs signals
Date
Msg-id b88a764a-b2e4-4dac-9d25-7249215db2b7@iki.fi
Whole thread Raw
In response to Re: Interrupts vs signals  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Here's a new rebased and massively re-worked patch.

The patches are split differently than in the previous version:
- Patches 0001-0005 are just refactoring around recovery conflicts which 
I posted on a separate thread [1]. Please review and comment there.
- Patches 0006 and 0007 are small cleanups that could be applied early 
(after updating the docs, as noted in TODO comment there).
- All the interesting bits for this thread are in the last, massive patch.

To review this, I suggest starting from the new 
src/backend/ipc/README.md file. It gives a good overview of the 
mechanism (or if it doesn't, that's valuable feedback :-) ). It also 
contains a bunch of Open Questions at the bottom; I'd love to hear 
opinions and ideas on those.

Wrt. the issue of running out of bits: My current plan is to just switch 
to 64-bit interrupt masks. That gives a fair amount of headroom. Would 
be nice to have even more headroom, but I think 64 bits will be enough 
for a long time.

The big high-level design change in this version is to address these issues:

On 15/07/2025 19:50, Andres Freund wrote:
> I'm not really in love how this looks like yet, tbh. Aspects of it seem to buy
> further into some of our already existing design mistakes. I have two main
> concerns:
> 
> 1) It seems rather bonkers that we have to go to every process type and add
> code for INTERRUPT_LOG_MEMORY_CONTEXT or INTERRUPT_CONFIG_RELOAD.
> 
> 2) The fact that a lot of code specifies explicit interrupt masks makes that
> code harder to compose, because whether we can process some interrupt or not
> depends on higher level state.  That's e.g. why INTERRUPT_CFI_MASK() has to be
> a bit magic and decid ewhat interrupt mask to use and why we need a separate
> interrupt handling functions for interrupts happening while waiting for
> network IO than normally.
> 
> 
> For 1), I suspect that we should have a central mapping array between
> InterruptTypes and the code to handle each of the interrupt types.  That
> mapping table also could serve as the registry for extensions to register
> their own interrupt ids.

That's a great idea, and that's what this patch now does. You can now 
register handler functions for interrupts which will then be called by 
CHECK_FOR_INTERRUPTS() whenever the corresponding interrupt is pending. 
ProcessInterrupts has no knowledge about what each interrupt means, it 
just calls the handler functions in an array.

At backend startup, a "default" set of interrupt handlers are installed, 
like this: (a shortened excerpt from SetStandardInterruptHandlers())

    /* All processes should react to barriers and memory context debugging */
    SetInterruptHandler(INTERRUPT_BARRIER, ProcessProcSignalBarrier);
    EnableInterrupt(INTERRUPT_BARRIER);
    SetInterruptHandler(INTERRUPT_LOG_MEMORY_CONTEXT, 
ProcessLogMemoryContextInterrupt);
    EnableInterrupt(INTERRUPT_LOG_MEMORY_CONTEXT);

    /*
     * Every process should react to INTERRUPT_TERMINATE. But many processes
     * disable this and do their own checks at appropriate times.
     */
    SetInterruptHandler(INTERRUPT_TERMINATE, ProcessTerminateInterrupt);
    EnableInterrupt(INTERRUPT_TERMINATE);

    ...

Many processes have additional interrupts, or want different handlers 
for the standard interrupts. They make additional SetInterruptHandler() 
calls to set up their process-specific handlers.

This largely solves problems 1) and 2). I replaced the 
INTERRUPT_CFI_MASK() macro with a global variable 
CheckForInterruptsMask. At all times, it is a bitmask of all the 
interrupts that CHECK_FOR_INTERRUPTS() would process and clear. Even 
HOLD/RESUME_INTERRUPTS() now updates CheckForInterruptsMask (setting it 
to zero in HOLD_INTERRUPTS(), and restoring it on RESUME_INTERRUPTS())

With that facility, a lot of things got nicer. For example, I got rid of 
ProcessClientReadInterrupt() altogether. secure_read() now calls plain 
old CHECK_FOR_INTERRUPTS(), and it is the caller's responsibility to 
enable/disable the interrupt handlers according to what should or should 
not be processed. (ProcessClientWriteInterrupt() is gone too, although 
now that I look at it, I wonder if that went too far and we should still 
refrain from processing interrupts other than INTERRUPT_TERMINATE while 
we're sending to the client...)

I'd love to get feedback on this mechanism. And on the patch in general, 
of course, but this is the big new part.

> For 2), I wonder if we ought to have a global mask of interrupt kinds that can
> be processed in some context. Instead of having INTERRUPT_CFI_MASK() compute
> what mask to use, we could have things like HOLD_CANCEL_INTERRUPTS be defined
> as something like
> 
> if (InterruptHoldCount[CANCEL]++ == 0)
>     InterruptMask &= ~CANCEL;
> 
> which would allow CHECK_FOR_INTERRUPTS to just use InterruptMask to check for
> to-be-processed interrupts.

I played with that, and it's quite sensible, but then I realized that we 
only use HOLD_CANCEL_INTERRUPTS() in a few places, namely when we are 
reading a message from the client. We don't really need the nesting 
capability, i.e. we don't need a counter, a boolean is enough.


On 15/07/2025 18:50, Andres Freund wrote:
>> @@ -205,6 +206,8 @@ StartupProcExit(int code, Datum arg)
>>      /* Shutdown the recovery environment */
>>      if (standbyState != STANDBY_DISABLED)
>>          ShutdownRecoveryTransactionEnvironment();
>> +
>> +    ProcGlobal->startupProc = INVALID_PROC_NUMBER;
>>  }
> 
> 
> What if we instead had a ProcGlobal->auxProc[auxproxtype]? We have different
> versions of this for different types auf aux processes, which doesn't really
> make sense.

I like that idea, but didn't try it yet.

>> +         * Perform a plain atomic read first as a fast path for the case that
>> +         * an interrupt is already pending.
>>           */
>> -        if (set->latch && !set->latch->is_set)
>> +        old_mask = pg_atomic_read_u32(MyPendingInterrupts);
>> +        already_pending = ((old_mask & set->interrupt_mask) != 0);
>> +
>> +        if (!already_pending)
>>          {
> 
> 
> Hm, I think this may be incorrect - what if there is *some* overlap between
> MyPendingInterrupts and set->interrupt_mask, but they're not equal?

I don't see the problem. In that case, already_pending will be set to 
'true', as it should.

>> +/*
>> + * Test an interrupt flag (of flags).
>> + */
>> +static inline bool
>> +IsInterruptPending(uint32 interruptMask)
>> +{
>> +    return (pg_atomic_read_u32(MyPendingInterrupts) & interruptMask) != 0;
>> +}
> 
> 
> Do we need to care about memory ordering?

Hmm, I think not. If you imagine using this with WaitInterrupt(), the 
WaitInterrupt works as a synchronization point. If the interrupt was 
just set by another procss, but this function returns a stale "false", 
the next WaitInterrupt() will return without sleeping. No other backend 
should be clearing our pending interrupt bits, so a stale "true" is not 
possible. I'll add a comment.

>> There's still a general-purpose INTERRUPT_GENERAL interrupt that is
>> multiplexed for many different purposes in different processes, for
>> example to wake up the walwriter when it has work to do, and to wake
>> up processes waiting on a condition variable. The common property of
>> those uses is that there's some other flag or condition that is
>> checked on each wakeup, the wakeup interrupt itself means merely that
>> something interesting might've happened.
> 
> I guess extensions that just need to use INTERRUPT_GENERAL, given that new
> interrupt reasons can't be dynamically registered?

Correct. I plan to implement a dynamic interrupt allocation mechanism 
for extensions, but it's not implemented yet.

(btw I renamed INTERRUPT_GENERAL to INTERRUPT_WAIT_WAKEUP)


>> +    if (ConsumeInterrupt(INTERRUPT_CONFIG_RELOAD))
>>      {
>>          int            autovacuum_max_workers_prev = autovacuum_max_workers;
>>  
>> -        ConfigReloadPending = false;
>>          ProcessConfigFile(PGC_SIGHUP);
>>  
>>          /* shutdown requested in config file? */
>> @@ -771,11 +774,11 @@ ProcessAutoVacLauncherInterrupts(void)
>>      }
>>  
>>      /* Process barrier events */
>> -    if (ProcSignalBarrierPending)
>> +    if (IsInterruptPending(INTERRUPT_BARRIER))
>>          ProcessProcSignalBarrier();
> 
> 
> Why do we have some code immediately consuming and others just checking if an
> interrupt is pending?

I was just sloppy. Now it's consistent: ProcessInterrupts() now always 
clears the interrupt before calling the handler function, and the 
handler function does not check or clear the interrupt.

>> @@ -572,10 +574,18 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
>>              cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
>>          }
>>  
>> -        (void) WaitInterrupt(INTERRUPT_GENERAL,
>> -                             WL_INTERRUPT | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
>> -                             cur_timeout * 1000L /* convert to ms */ ,
>> -                             WAIT_EVENT_CHECKPOINTER_MAIN);
>> +        (void) WaitInterrupt(
>> +            /* these are handled in the main loop */
>> +            INTERRUPT_GENERAL |     /* checkpoint requested */
> 
> 
> Why is a checkpoint request done via GENERAL? Seems that's specifically one
> that we do *not* want to absorb via CFI?

It works fine because there's a separate shared memory flag that's set 
when checkpoint is requested. We check that flag in the main loop, even 
if the interrupt is absorbed.

> 
>> @@ -947,6 +953,7 @@ void
>>  StandbyDeadLockHandler(void)
>>  {
>>      got_standby_deadlock_timeout = true;
>> +    RaiseInterrupt(INTERRUPT_GENERAL);
>>  }
>>  
>>  /*
>> @@ -956,6 +963,7 @@ void
>>  StandbyTimeoutHandler(void)
>>  {
>>      got_standby_delay_timeout = true;
>> +    RaiseInterrupt(INTERRUPT_GENERAL);
>>  }
>>  
>>  /*
>> @@ -965,6 +973,7 @@ void
>>  StandbyLockTimeoutHandler(void)
>>  {
>>      got_standby_lock_timeout = true;
>> +    RaiseInterrupt(INTERRUPT_GENERAL);
>>  }
> 
> 
> Why are these using INTERRUPT_GENERAL?

Could be changed for sure, but this works too. These are quite localized.

>> +
>> +    if (ConsumeInterrupt(INTERRUPT_WALSND_INIT_STOPPING))
>> +        ProcessWalSndInitStopping();
> 
> 
> Huh. So ProcessWalSndInitStopping() raises different inerrupts to actually
> exit and thus relies on being called first? That seems rather clunky.

I don't know about the "being called first" part, but yeah this is 
clunky. I had hard time understanding the logic and I'm still not 100% I 
got it right to be honest. Some further cleanup for readability would be 
nice..

[1] 
https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi

- Heikki
Attachment

pgsql-hackers by date:

Previous
From: surya poondla
Date:
Subject: Re: Add comments about fire_triggers argument in ri_triggers.c
Next
From: Peter Smith
Date:
Subject: Re: [WIP]Vertical Clustered Index (columnar store extension) - take2