On Tue, Jan 17, 2023 at 11:24 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Another idea would be to teach the latch infrastructure itself to
> magically swap latch events to position 0. Latches are usually
> prioritised; it's only in this rare race case that they are not.
I liked that idea for a while, but I suspect it can't solve the
problem completely, because it won't work on Windows (see below), and
the race I described earlier is probably not the only one. I think it
must also be possible for poll() to ignore a signal that becomes
pending just as the system call begins and instead return a socket fd
that has also just become ready, without waiting (and thus without
failing with EINTR). Then the handler would run after we return to
userspace, we'd see only the socket event, and a later call would see
the latch event.
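
To make that interleaving concrete, here's a stripped-down self-pipe
sketch (hypothetical names, not the actual latch.c code); the comment
marks the window:

#include <poll.h>
#include <signal.h>
#include <unistd.h>

static int selfpipe[2];         /* created with pipe() at startup */

static void
latch_handler(int signo)
{
    char        c = 0;

    (void) write(selfpipe[1], &c, 1);   /* "set the latch" */
}

static void
wait_once(int sock)
{
    struct pollfd fds[2];

    fds[0].fd = selfpipe[0];    /* latch first, normally prioritised */
    fds[0].events = POLLIN;
    fds[1].fd = sock;
    fds[1].events = POLLIN;

    /*
     * If the signal becomes pending right here, and sock is already
     * readable, poll() can return only the socket event without
     * failing with EINTR; latch_handler() then runs after poll()
     * returns, so selfpipe[0] polls readable only on the next call.
     */
    (void) poll(fds, 2, -1);
}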
So I think we probably need something like the attached, which I was
originally trying to avoid.
Looking into all that made me notice a related problem on Windows.
There's an interesting difference between the implementation of
select() in src/backend/port/win32/socket.c and the Windows
implementation of WaitEventSetWaitBlock() in latch.c. The latch.c
code reports only one event at a time, in event array order, because
WaitForMultipleObjects() reports only the lowest-indexed signaled
handle and we expose that contract fairly directly. The older
socket.c code uses the wait only as a wakeup, and then polls *all*
sockets so that it can report more than one at a time.
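
To spell out the contract in question (a sketch, not the latch.c
code): with bWaitAll = FALSE, WaitForMultipleObjects() reports only
the lowest-indexed signaled handle, so everything else that's ready
stays invisible until the next call:

#include <windows.h>

/* Return the index of one signaled handle, or -1 on failure. */
static int
wait_first_signaled(const HANDLE *handles, DWORD n)
{
    DWORD       rc = WaitForMultipleObjects(n, handles, FALSE, INFINITE);

    /*
     * If several handles are signaled at once, the smallest array
     * index wins and the others go unreported; exposing this directly
     * is what limits latch.c to one event per wakeup, in event array
     * order.
     */
    if (rc < WAIT_OBJECT_0 + n)
        return (int) (rc - WAIT_OBJECT_0);
    return -1;
}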
I was careful to use a large array of output events to preserve the
existing round-robin servicing of multiple server sockets, but I see
now that this only works on Unix. On Windows, because only the
lowest-indexed signaled event is ever reported, one socket receiving
a fast enough stream of new connections could prevent a second socket
at a higher index from ever being serviced. I think we might want to
do something about that.
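
For what it's worth, one possible shape for a fix (a sketch with a
made-up helper, not something from the attached patch) would be to do
roughly what socket.c's select() emulation does: after any wakeup,
enumerate network events on every socket, so that a busy socket at a
low index can't shadow the rest:

#include <winsock2.h>

/*
 * Hypothetical helper: collect the indexes of all sockets with a
 * pending FD_ACCEPT, not just the first one the wait reported.
 */
static int
collect_all_ready(const SOCKET *socks, const HANDLE *events, int n,
                  int *ready)
{
    int         nready = 0;

    for (int i = 0; i < n; i++)
    {
        WSANETWORKEVENTS ev;

        /* Passing the event handle here also resets it. */
        if (WSAEnumNetworkEvents(socks[i], events[i], &ev) == 0 &&
            (ev.lNetworkEvents & FD_ACCEPT) != 0)
            ready[nready++] = i;
    }
    return nready;
}

That costs one extra call per socket per wakeup, but for a handful of
server sockets that seems cheap compared to never servicing one of
them.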