Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit() - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()
Date
Msg-id 20211011.161102.873686464224461228.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
At Mon, 11 Oct 2021 15:24:46 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/10/09 22:22, Bharath Rupireddy wrote:
> > Hi,
> > It looks like auxiliary processes will not have a valid MyBackendId as
> > they don't call InitPostgres() and SharedInvalBackendInit() unlike
> > backends. But the startup process (which is an auxiliary process) in
> > hot standby mode seems to be different in the way that it does have a
> > valid MyBackendId as SharedInvalBackendInit() gets called from
> > InitRecoveryTransactionEnvironment(). The SharedInvalBackendInit()
> > usually stores the MyBackendId in the caller's PGPROC structure i.e.
> > MyProc->backendId. The auxiliary processes (including the startup
> > process) usually register themselves in procsignal array with
> > ProcSignalInit(MaxBackends + MyAuxProcType + 1) unlike the backends
> > with ProcSignalInit(MyBackendId) after SharedInvalBackendInit() in
> > InitPostgres().
> > The problem comes when a postgres process wants to send a multiplexed
> > SIGUSR1 signal (probably using SendProcSignal()) to the startup
> > process after receiving its ProcSignal->psh_slot[] with its backendId
> > from the PGPROC (the postgres process can get the startup process
> > PGPROC structure from AuxiliaryPidGetProc()). Remember the startup
> > process has registered in the procsignal array with
> > ProcSignalInit(MaxBackends + MyAuxProcType + 1), not with the
> > ProcSignalInit(MyBackendId) like the backends did. So, the postgres
> > process, wanting to send SIGUSR1 to the startup process, refers to the
> > wrong ProcSignal->psh_slot[] and may not send the signal.
> > Is this inconsistency of MyBackendId for a startup process a problem
> > at all? Thoughts?
> > These are the following ways I think we can fix it, if at all some
> > other hacker agrees that it is actually an issue:
> > 1) Fix the startup process code, probably by unregistering the
> > procsignal array entry that was made with ProcSignalInit(MaxBackends +
> > MyAuxProcType + 1) in AuxiliaryProcessMain() and register with
> > ProcSignalInit(MyBackendId) immediately after SharedInvalBackendInit()
> > calculates the MyBackendId in with SharedInvalBackendInit() in
> > InitRecoveryTransactionEnvironment(). This seems risky to me as
> > unregistering and registering ProcSignal array involves some barriers
> > and during the registering and unregistering window, the startup
> > process may miss the SIGUSR1.
> > 2) Ensure that the process, that wants to send the startup process
> > SIGUSR1 signal, doesn't use the backendId from the startup process
> > PGPROC, in which case it has to loop over all the entries of
> > ProcSignal->psh_slot[] array to find the entry with the startup
> > process PID. It seems easier and less riskier but only caveat is that
> > the sending process shouldn't look at the backendId from auxiliary
> > process PGPROC, instead it should just traverse the entire proc signal
> > array to find the right slot.
> > 3) Add a comment around AuxiliaryPidGetProc() that says "auxiliary
> > processes don't have valid backend ids, so don't use the backendId
> > from the returned PGPROC".
> > (2) and (3) seem reasonable to me. Thoughts?

(I'm not sure how the trouble happens.)
2 and 3 looks like fixing inconsistency by another inconsistency.
I'm not sure 1 is acceptable.

> How about modifying SharedInvalBackendInit() so that it accepts
> BackendId as an argument and allocates the ProcState entry of
> the specified BackendId? That is, the startup process determines
> that its BackendId is "MaxBackends + MyAuxProcType (=StartupProcess) +
> 1"
> in AuxiliaryProcessMain(), and then it passes that BackendId to
> SharedInvalBackendInit() in InitRecoveryTransactionEnvironment().
> 
> Maybe you need to enlarge ProcState array so that it also handles
> auxiliary processes if it does not for now.

It seems to me that the backendId on startup process is used only for
vxid generation. Actually "make check-world" doesn't fail by skipping
SharedInvalBackendinit() (and disabling an assertion).

I thought that we could decouple vxid from backend ID (or sinval code)
by using pgprocno for vxid generation instead of backend ID. "make
check-world" doesn't fail with that change, too. (I don't think
"doesn't fail" ncecessarily mean that that change is correct, though),
but vxid gets somewhat odd after the change..

=# select distinct virtualxid from pg_locks;
 virtualxid 
------------
 
 116/1   # startup
 99/48   # backend 1
...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Added schema level support for publication.
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Added schema level support for publication.