Re: Add errdetail() with PID and UID about source of termination signal - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Add errdetail() with PID and UID about source of termination signal
Date
Msg-id 1ba07c75-cc4e-410c-9af1-076feeeb9594@dunslane.net
Whole thread Raw
In response to Re: Add errdetail() with PID and UID about source of termination signal  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Responses Re: Add errdetail() with PID and UID about source of termination signal
Re: Add errdetail() with PID and UID about source of termination signal
Re: Add errdetail() with PID and UID about source of termination signal
List pgsql-hackers
On 2026-02-26 Th 4:25 AM, Jakub Wartak wrote:
> On Thu, Feb 26, 2026 at 4:09 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi Chao,
>
>> I just reviewed v4 again and got a few more comments:
>>
>> 1. This patch only set the global proc_die_sender_pid/uid to 0 at startup, then assign values to them upon receiving
SIGTERM,and never reset them, which assumes a process must die upon SIGTERM. Is the assumption true? I guess not. If a
processreceives SIGTERM and not die immediately, then die for other reason, then it may report a misleading PID and
UID.
> Hmm, I'm not sure I follow. If we receive SIGTERM and not die immediately
> (for whatever reason), then two scenarios can happen as far as I'm concerned:
> * another SIGTERM comes in from the same or different uid/pid and it wll be
> reported properly
> * different SIGKILL, but in this case we won't report UID/PID at all
>
> am I missing something or do You have any particular scenario in mind?
> The flow will be wrapper_handler()->die()->SetLatch()->..->directyl to
> err reporting facilities.
>
>> 2.
>> syncrpe.c uses errhint to print PID and UID, and postgres.c uses errdetail. We should keep consistency, maybe all
useerrhint.
 
> Right, let's make it that way.
>
>> 3.
>> ```
>> @@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
>>                          QueryCancelPending = false;
>>                          ereport(WARNING,
>>                                          (errmsg("canceling wait for synchronous replication due to user request"),
>> -                                        errdetail("The transaction has already committed locally, but might not
havebeen replicated to the standby.")));
 
>> +                                        errdetail("The transaction has already committed locally, but might not
havebeen replicated to the standby."),
 
>> +                                               proc_die_sender_pid == 0 ? 0 :
>> +                                                       errhint("Signal sent by PID %d, UID %d.",
>> +                                                               proc_die_sender_pid, proc_die_sender_uid)
>> +                                               ));
>>                          SyncRepCancelWait();
>>                          break;
>>                  }
>> ```
>>
>> I don’t think the query cancel case relates to SIGTERM, so we don’t need to log PID and UID here.
> Right, it was superfluous.
>
> v5 attached.
>


I'd kinda like to sneak this in for pg19, because I think it's useful. 
Here's a v6 that changes one or two things:


- changes the globals to sig_atomic_t

- in ProcessInterrupts, copies to local sender_pid/sender_uid, then 
zeros the globals before any ereport

- uses errdetail() for all the messages


Plus a few more cosmetic changes like consistent casing.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: EXPLAIN: showing ReadStream / prefetch stats
Next
From: Alexander Lakhin
Date:
Subject: Re: Changing the state of data checksums in a running cluster