Re: Network failure may prevent promotion - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Network failure may prevent promotion
Date
Msg-id ea7b8012-b739-436e-afe4-be0b2f69b304@iki.fi
In response to Re: Network failure may prevent promotion  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On 23/01/2024 10:24, Kyotaro Horiguchi wrote:
> Thank you for looking this!
> 
> At Tue, 23 Jan 2024 15:07:10 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in
>> Regarding the patch, here are the review comments.
>>
>> +/*
>> + * Is current process a wal receiver?
>> + */
>> +bool
>> +IsWalReceiver(void)
>> +{
>> + return WalRcv != NULL;
>> +}
>>
>> This looks wrong because WalRcv can be non-NULL in processes other
>> than walreceiver.
> 
> Mmm. Sorry for the silly mistake. We can use B_WAL_RECEIVER
> instead. I'm not sure if the new function IsWalReceiver() is
> required. The expression "MyBackendType == B_WAL_RECEIVER" is quite
> descriptive. However, the function does make ProcessInterrupts() more
> aligned regarding process types.

There's an existing AmWalReceiverProcess() macro too. Let's use that.
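
To illustrate the review comment quoted above, here is a minimal
standalone sketch (not PostgreSQL source). The types and variables are
simplified stand-ins, and the macro body is an assumption for
illustration: the real AmWalReceiverProcess() is defined in miscadmin.h
and its definition may differ between versions. The point is only that
a pointer into shared memory is set in every attached process, while
the process type is per-process state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the real enum; only three members shown. */
typedef enum BackendType
{
    B_BACKEND,
    B_STARTUP,
    B_WAL_RECEIVER
} BackendType;

/* Stand-in for the walreceiver's shared-memory control struct. */
typedef struct WalRcvData
{
    int         dummy;
} WalRcvData;

/* Pretend shared memory: every attached process sees a valid pointer. */
static WalRcvData shared_walrcv;
static WalRcvData *WalRcv = &shared_walrcv;

/* Per-process state: what kind of process this one is. */
static BackendType MyBackendType = B_BACKEND;

/* Assumed macro body, for illustration only. */
#define AmWalReceiverProcess() (MyBackendType == B_WAL_RECEIVER)

/* The check from the reviewed patch: true in any attached process. */
static bool
IsWalReceiverByPointer(void)
{
    return WalRcv != NULL;
}
```

In this sketch a process with MyBackendType set to B_STARTUP still
passes the pointer-based check, whereas the type-based macro is true
only for the walreceiver.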

(See also 
https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0%40iki.fi 
for refactoring in this area)

Here's a patch set summarizing the changes so far. They should be 
squashed, but I kept them separate for now to help with review:

1. revert the revert of 728f86fec6.
2. your walrcv_shutdown_deblocking_v2-2.patch
3. Also replace libpqrcv_PQexec() and libpqrcv_PQgetResult() with the 
wrappers from libpq-be-fe-helpers.h
4. Replace IsWalReceiver() with AmWalReceiverProcess()

>> - pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request shutdown */
>> + pqsignal(SIGTERM, WalRcvShutdownSignalHandler); /* request shutdown */
>>
>> Can't we just use die(), instead?
> 
> There was a comment explaining the problems associated with exiting
> within a signal handler;
> 
> - * Currently, only SIGTERM is of interest.  We can't just exit(1) within the
> - * SIGTERM signal handler, because the signal might arrive in the middle of
> - * some critical operation, like while we're holding a spinlock.  Instead, the
> 
> And I think we should keep the considerations it suggests. The patch
> removes the comment itself, but it does so because it implements our
> standard process exit procedure, which incorporates points suggested
> by the now-removed comment.

die() doesn't call exit(1) unless DoingCommandRead is set, and it never 
is in the walreceiver. So in the walreceiver, die() behaves just like 
the new WalRcvShutdownSignalHandler() function. Am I missing something?

Hmm, but doesn't bgworker_die() have that problem with exit(1)ing in the 
signal handler?

I also wonder if we should replace SignalHandlerForShutdownRequest() 
completely with die(), in all processes? The difference is that 
SignalHandlerForShutdownRequest() uses ShutdownRequestPending, while 
die() uses ProcDiePending && InterruptPending to indicate that the 
signal was received. Or do some of the processes want to check for 
ShutdownRequestPending only at specific places, and don't want to get 
terminated at any random CHECK_FOR_INTERRUPTS()?
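
The difference can be sketched in a standalone toy program (not
PostgreSQL source). The flag names come from the discussion above;
everything else is a hypothetical stand-in: the lowercase handler and
helper names, the exits_requested counter that stands in for process
exit, and the much-simplified CHECK_FOR_INTERRUPTS(). The real handlers
do more than this, for example waking the process so that it notices
the flag.

```c
#include <signal.h>
#include <stdbool.h>

static volatile sig_atomic_t ShutdownRequestPending = false;
static volatile sig_atomic_t InterruptPending = false;
static volatile sig_atomic_t ProcDiePending = false;

/* Hypothetical: counts how often the process would have exited. */
static int  exits_requested = 0;

/* Style 1: record the request; only the main loop looks at the flag. */
static void
shutdown_request_handler(int signo)
{
    (void) signo;
    ShutdownRequestPending = true;
}

/* Style 2: die()-like; acted on at the next interrupt check anywhere. */
static void
die_like_handler(int signo)
{
    (void) signo;
    InterruptPending = true;
    ProcDiePending = true;
}

/* Much-simplified stand-in for interrupt processing. */
static void
process_interrupts(void)
{
    InterruptPending = false;
    if (ProcDiePending)
    {
        ProcDiePending = false;
        exits_requested++;      /* real code would exit the process here */
    }
}

#define CHECK_FOR_INTERRUPTS() \
    do { if (InterruptPending) process_interrupts(); } while (0)

/* Hypothetical helper somewhere deep in the call stack. */
static void
some_deep_helper(void)
{
    CHECK_FOR_INTERRUPTS();
}

/* The one place where a style-1 process decides to stop. */
static bool
main_loop_should_stop(void)
{
    return ShutdownRequestPending;
}
```

With the first handler, some_deep_helper() returns normally and only
the main loop's own test sees the request. With the second, the same
helper triggers termination, which is the behavior a process might not
want at an arbitrary point.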

-- 
Heikki Linnakangas
Neon (https://neon.tech)
