On Thu, Feb 26, 2026 at 4:09 AM Chao Li <li.evan.chao@gmail.com> wrote:
Hi Chao,
> I just reviewed v4 again and got a few more comments:
>
> 1. This patch only set the global proc_die_sender_pid/uid to 0 at startup, then assign values to them upon receiving
SIGTERM,and never reset them, which assumes a process must die upon SIGTERM. Is the assumption true? I guess not. If a
processreceives SIGTERM and not die immediately, then die for other reason, then it may report a misleading PID and
UID.
Hmm, I'm not sure I follow. If we receive SIGTERM and not die immediately
(for whatever reason), then two scenarios can happen as far as I'm concerned:
* another SIGTERM comes in from the same or different uid/pid and it wll be
reported properly
* different SIGKILL, but in this case we won't report UID/PID at all
am I missing something or do You have any particular scenario in mind?
The flow will be wrapper_handler()->die()->SetLatch()->..->directyl to
err reporting facilities.
> 2.
> syncrpe.c uses errhint to print PID and UID, and postgres.c uses errdetail. We should keep consistency, maybe all use
errhint.
Right, let's make it that way.
> 3.
> ```
> @@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
> QueryCancelPending = false;
> ereport(WARNING,
> (errmsg("canceling wait for synchronous replication due to user request"),
> - errdetail("The transaction has already committed locally, but might not have
beenreplicated to the standby.")));
> + errdetail("The transaction has already committed locally, but might not have
beenreplicated to the standby."),
> + proc_die_sender_pid == 0 ? 0 :
> + errhint("Signal sent by PID %d, UID %d.",
> + proc_die_sender_pid, proc_die_sender_uid)
> + ));
> SyncRepCancelWait();
> break;
> }
> ```
>
> I don’t think the query cancel case relates to SIGTERM, so we don’t need to log PID and UID here.
Right, it was superfluous.
v5 attached.
-J.