Re: Use proc_exit() in WalRcvWaitForStartPosition - Mailing list pgsql-hackers

From Chao Li
Subject Re: Use proc_exit() in WalRcvWaitForStartPosition
Msg-id 0D17D4E1-919F-4412-8EFE-BEB80211321D@gmail.com
In response to Re: Use proc_exit() in WalRcvWaitForStartPosition  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Use proc_exit() in WalRcvWaitForStartPosition
List pgsql-hackers

> On Apr 10, 2026, at 14:16, Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Thu, Apr 9, 2026 at 10:09 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>>
>> On Thu, Apr 9, 2026 at 5:00 AM Andreas Karlsson <andreas@proxel.se> wrote:
>>>
>>> On 4/8/26 11:08 AM, Chao Li wrote:
>>>> While working on another patch, I happened to notice that
>>>> WalRcvWaitForStartPosition() calls raw exit(1). I think this should use
>>>> proc_exit(1) instead, so that the normal cleanup machinery is not bypassed.
>>>>
>>>> This tiny patch just replaces exit(1) with proc_exit(1) in WalRcvWaitForStartPosition().
>>>
>>> This looks likely to be correct since when we exit in WalReceiverMain()
>>> (on WALRCV_STOPPING and WALRCV_STOPPED) we call proc_exit(1). I feel we
>>> should exit the same way in WalRcvWaitForStartPosition() as we do in
>>> WalReceiverMain() and if not I would like a comment explaining why those
>>> two cases are different.
>>
>> +1
>
> +1
>
>
>> WalRcvWaitForStartPosition() already uses proc_exit(0) when it sees
>> WALRCV_STOPPING before entering the wait loop, so this path should
>> probably use proc_exit(0) as well (not proc_exit(1)), since the stop was
>> a requested shutdown, not an error. Using exit code 1 for a clean
>> stop-on-request seems inconsistent.
>
> The requested shutdown is handled in ShutdownWalRcv(), which sets the state to
> WALRCV_STOPPING and sends SIGTERM to the walreceiver.
>
> Although this might be considered a normal shutdown (suggesting exit code 0),
> when the walreceiver receives SIGTERM it exits via ereport(FATAL), resulting
> in exit code 1. In contrast, if it exits early in WalRcvWaitForStartPosition()
> due to the WALRCV_STOPPING state, it uses exit code 0, as you noted. So
> there seems to be some inconsistency in exit codes.
>
> That said, the exit code (0 vs 1) does not affect behavior, since
> the postmaster treats both as non-crash exits.
>
> For consistency, I would prefer using exit code 1 in proc_exit() in
> WalRcvWaitForStartPosition(), to match the ereport(FATAL) path. But I'm fine
> with other approaches as well.
>
> Also, the comment at the top of walreceiver.c may need updating:
>
> * Normal termination is by SIGTERM, which instructs the walreceiver to
> * exit(0). Emergency termination is by SIGQUIT; like any postmaster child
> * process, the walreceiver will simply abort and exit on SIGQUIT. A close
> * of the connection and a FATAL error are treated not as a crash but as
> * normal operation.
>
> Regards,
>
> --
> Fujii Masao

PFA v2, which updates the header comment of walreceiver.c. I tried to avoid
mentioning the exact exit value in the comment, so I just changed “exit(0)” to
“terminate”.
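
To make the exit-code point from upthread concrete, here is a minimal
standalone sketch. It is not PostgreSQL code: the helper names
run_child_with_exit_code() and looks_like_crash() are made up for
illustration, and the rule that only exit codes 0 and 1 count as non-crash
exits is an assumption based on the remark above that the postmaster treats
both as non-crash exits.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits immediately with "code",
 * then return the raw wait status reported to the parent.
 * Returns -1 if fork() or waitpid() fails.
 */
int
run_child_with_exit_code(int code)
{
	int			status = 0;
	pid_t		pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);			/* child: terminate with the requested code */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return status;
}

/*
 * Toy classification (an assumption, not the postmaster's actual logic):
 * a child that exited on its own with code 0 or 1 is a non-crash exit;
 * any other exit code, or death by signal, is treated as a crash.
 */
bool
looks_like_crash(int status)
{
	if (WIFEXITED(status))
	{
		int			code = WEXITSTATUS(status);

		return code != 0 && code != 1;
	}
	return true;
}
```

Under that assumption, looks_like_crash(run_child_with_exit_code(0)) and
looks_like_crash(run_child_with_exit_code(1)) are both false, which is why
choosing between 0 and 1 here is a consistency question rather than a
behavioral one.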

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
