Re: Weird failure with latches in curculio on v15 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Weird failure with latches in curculio on v15
Msg-id 1645348.1675409043@sss.pgh.pa.us
In response to Re: Weird failure with latches in curculio on v15  (Andres Freund <andres@anarazel.de>)
Responses Re: Weird failure with latches in curculio on v15
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Ugh, I think I might understand what's happening:

> The signal arrives just after the fork() (within system()). Because we
> have all our processes configure themselves as process group leaders,
> and we signal the entire process group (cf. signal_child()), both the
> child process and the parent will process the signal. So we'll end up
> doing a proc_exit() in both. As both are trying to remove themselves
> from the same PGPROC etc entry, that doesn't end well.

Ugh ...

> I don't see how we can solve that properly as long as we use system().

... but I don't see how that's system()'s fault?  Doing the fork()
ourselves wouldn't change anything about that.

> A workaround for the back branches could be to have a test in
> StartupProcShutdownHandler() that tests if MyProcPid == getpid(), and
> not do the proc_exit() if they don't match. We probably should just do
> an _exit() in that case.

Might work.

> OTOH, the current approach only works on systems with setsid(2) support,
> so we probably shouldn't rely so hard on it anyway.

setsid(2) is required since SUSv2, so I'm not sure which systems
are of concern here ... other than Redmond's of course.

            regards, tom lane


