Re: Using WaitEventSet in the postmaster - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Using WaitEventSet in the postmaster
Date
Msg-id 20221202014022.j4hwmjnysasdg5yn@awork3.anarazel.de
In response to Using WaitEventSet in the postmaster  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Using WaitEventSet in the postmaster  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2022-12-02 10:12:25 +1300, Thomas Munro wrote:
> Here's a work-in-progress patch that uses WaitEventSet for the main
> event loop in the postmaster

Wee!


> with a latch as the wakeup mechanism for "PM signals" (requests from
> backends to do things like start a background worker, etc).

Hm - is that directly related? ISTM that using a WES in the main loop, and
changing pmsignal.c to a latch are somewhat separate things?

Using a latch for pmsignal.c seems like a larger lift, because it means that
all of latch.c needs to be robust against a corrupted struct Latch.


> In order to avoid adding a new dependency on the contents of shared
> memory, I introduced SetLatchRobustly() that will always use the slow
> path kernel wakeup primitive, even in cases where SetLatch() would
> not.  The idea here is that if one backend trashes shared memory,
> other backends can still wake the postmaster even though it may
> appear that the postmaster isn't waiting or the latch is already set.

Why is that a concern that needs to be addressed?


ISTM that the important thing is that either a) the postmaster's latch can't
be corrupted, because it's not shared with backends or b) struct Latch can be
overwritten with random contents without causing additional problems in
postmaster.

I don't think b) is the case as the patch stands. Imagine some process
overwriting pm_latch->owner_pid. That'd then break the SetLatch() in
postmaster's signal handler, because it wouldn't realize that it itself needs
to be woken up, and we'd just signal some random process.


It seems non-trivial (though not impossible) to make SetLatch() robust against
arbitrary corruption. So it seems easier to me to just put the latch in
process-local memory, and do a SetLatch() in postmaster's SIGUSR1 handler.

Greetings,

Andres Freund


