Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Date
Msg-id 20220411225049.6vi3n4rgmegkwt7v@alap3.anarazel.de
Whole thread Raw
In response to Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
List pgsql-hackers
Hi,

On 2022-04-12 10:33:28 +1200, Thomas Munro wrote:
> Huh.  I wouldn't have started a separate thread for this if I'd
> realised I was getting close to the cause of the CI failure...

:)


> Instead of bothering to create N different XXXPending variables for
> the different conflict "reasons", I used an array.  Other than that,
> it's much like existing examples.

It kind of bothers me that each pending conflict has its own external function
call. It doesn't actually cost anything, because it's quite unlikely that
there's more than one pending conflict.  Besides aesthetically displeasing,
it also leads to an unnecessarily large amount of code needed, because the
calls to RecoveryConflictInterrupt() can't be merged...

But that's perhaps best fixed separately.


What might actually make more sense is to just have a bitmask or something?


> The existing use of the global variable RecoveryConflictReason seems a
> little woolly.  Doesn't it get clobbered every time a signal arrives,
> even if we determine that there is no conflict?  Not sure why that's
> OK, but anyway, this patch always sets it together with
> RecoveryConflictPending = true.

Yea. It's probably ok, kind of, because there shouldn't be multiple
outstanding conflicts with very few exceptions (deadlock and buffer pin). And
it doesn't matter that much which of those gets handled. And we'll retry
again. But brrr.


> +/*
> + * Check one recovery conflict reason.  This is called when the corresponding
> + * RecoveryConflictInterruptPending flags is set.  If we decide that a conflict
> + * exists, then RecoveryConflictReason and RecoveryConflictPending will be set,
> + * to be handled later in the same invocation of ProcessInterrupts().
> + */
> +static void
> +ProcessRecoveryConflictInterrupt(ProcSignalReason reason)
> +{
>      /*
>       * Don't joggle the elbow of proc_exit
>       */
>      if (!proc_exit_inprogress)
>      {
> -        RecoveryConflictReason = reason;
>          switch (reason)
>          {
>              case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
> @@ -3084,9 +3094,9 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
>                      if (IsAbortedTransactionBlockState())
>                          return;
>  
> +                    RecoveryConflictReason = reason;
>                      RecoveryConflictPending = true;
>                      QueryCancelPending = true;
> -                    InterruptPending = true;
>                      break;
>                  }
>  
> @@ -3094,9 +3104,9 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
>                  /* FALLTHROUGH */
>  
>              case PROCSIG_RECOVERY_CONFLICT_DATABASE:
> +                RecoveryConflictReason = reason;
>                  RecoveryConflictPending = true;
>                  ProcDiePending = true;
> -                InterruptPending = true;
>                  break;
>  
>              default:
> @@ -3115,15 +3125,6 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
>          if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
>              RecoveryConflictRetryable = false;
>      }

It's pretty weird that we have all this stuff that we then just check a short
while later in ProcessInterrupts() whether they've been set.

Seems like it'd make more sense to throw the error in
ProcessRecoveryConflictInterrupt(), now that it's not in a a signal handler
anymore?


>  /*
> @@ -3147,6 +3148,22 @@ ProcessInterrupts(void)
>          return;
>      InterruptPending = false;
>  
> +    /*
> +     * Have we been asked to check for a recovery conflict?  Processing these
> +     * interrupts may result in RecoveryConflictPending and related variables
> +     * being set, to be handled further down.
> +     */
> +    for (int i = PROCSIG_RECOVERY_CONFLICT_FIRST;
> +         i <= PROCSIG_RECOVERY_CONFLICT_LAST;
> +         ++i)
> +    {
> +        if (RecoveryConflictInterruptPending[i])
> +        {
> +            RecoveryConflictInterruptPending[i] = false;
> +            ProcessRecoveryConflictInterrupt(i);
> +        }
> +    }

Hm. This seems like it shouldn't be in ProcessInterrupts(). How about checking
calling a wrapper doing all this if RecoveryConflictPending?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h}
Next
From: Michael Paquier
Date:
Subject: Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory