Re: WL_SOCKET_ACCEPT fairness on Windows - Mailing list pgsql-hackers

From: Thomas Munro
Subject: Re: WL_SOCKET_ACCEPT fairness on Windows
Date:
Msg-id: CA+hUKGKEOhqQKHzRLux0dfHs4eNcq0rr_1q=U4GB44+UytYeuQ@mail.gmail.com
In response to: Re: WL_SOCKET_ACCEPT fairness on Windows  (Andres Freund <andres@anarazel.de>)
Responses: Re: WL_SOCKET_ACCEPT fairness on Windows  ("Jonathan S. Katz" <jkatz@postgresql.org>)
           RE: WL_SOCKET_ACCEPT fairness on Windows  ("Wei Wang (Fujitsu)" <wangw.fnst@fujitsu.com>)
List: pgsql-hackers
On Sat, Apr 1, 2023 at 2:35 PM Andres Freund <andres@anarazel.de> wrote:
> I wonder if we ought to bite the bullet and replace the use of
> WaitForMultipleObjects() with RegisterWaitForSingleObject() and then use
> GetQueuedCompletionStatus() to wait. The fairness issue here is a motivation,
> but the bigger one is that that'd get us out from under the
> MAXIMUM_WAIT_OBJECTS (IIRC 64) limit. Afaict that'd also allow us to read
> multiple notifications at once, using GetQueuedCompletionStatusEx().

Interesting.  So a callback would run in a system-managed thread, and
that would post a custom message in an IOCP for us to read, kinda like
the fake waitpid() thing?  Seems a bit gross and context-switchy but I
agree that the 64 event limit is also terrible.
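
To make sure I'm imagining the same scheme, it'd be something like this,
right?  (Untested sketch only; pgwin32_iocp, wait_event_fired and the
surrounding variables are invented here, not proposed names.)

    /* A completion port created once at startup. */
    static HANDLE pgwin32_iocp;    /* CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0) */

    /*
     * Runs in a system thread-pool thread when the registered handle is
     * signaled; all it does is forward a message to our completion port.
     */
    static VOID CALLBACK
    wait_event_fired(PVOID context, BOOLEAN timed_out)
    {
        PostQueuedCompletionStatus(pgwin32_iocp, 0, (ULONG_PTR) context, NULL);
    }

    /* Per-handle registration, instead of a slot in a wait array. */
    HANDLE wait_handle;
    RegisterWaitForSingleObject(&wait_handle, event_handle,
                                wait_event_fired, (PVOID) (ULONG_PTR) event_pos,
                                INFINITE, WT_EXECUTEONLYONCE);

    /* ... and then waiting just means draining the completion port. */
    DWORD nbytes;
    ULONG_PTR key;
    OVERLAPPED *ov;
    if (GetQueuedCompletionStatus(pgwin32_iocp, &nbytes, &key, &ov, cur_timeout))
    {
        /* key tells us which registered event fired */
    }

GetQueuedCompletionStatusEx() could then pull a whole batch of keys out in
one call, which would presumably also deal with the fairness problem as a
side effect.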

> Medium term that'd also be a small step towards using readiness based APIs in
> a few places...

Yeah, that would be cool.

> > I think we could get the same effect as pgwin32_select() more cheaply
> > by doing the initial WaitForMultipleObjects() call with the caller's
> > timeout exactly as we do today, and then, while we have space,
> > repeatedly calling
> > WaitForMultipleObjects(handles=&events[last_signaled_event_index + 1],
> > timeout=0) until it reports timeout.
>
> That would make sense, and should indeed be reasonable cost-wise.

Cool.
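
For the archive, that shape is roughly the following (sketch only, not the
patch; the helper name and signature are invented):

    /* Assumes <windows.h>. */

    /*
     * Wait for the first ready handle with the caller's timeout, then drain
     * any further ready handles with zero-timeout waits over the tail of
     * the array, so handles early in the array can't starve later ones.
     * Returns the number of indexes stored in ready[].
     */
    static int
    wait_drain_ready(HANDLE *handles, int nhandles, DWORD timeout,
                     int *ready, int max_ready)
    {
        int         nready = 0;
        int         next;
        DWORD       rc;

        rc = WaitForMultipleObjects(nhandles, handles, FALSE, timeout);
        if (rc == WAIT_TIMEOUT || rc == WAIT_FAILED)
            return 0;
        ready[nready++] = rc - WAIT_OBJECT_0;

        next = (rc - WAIT_OBJECT_0) + 1;
        while (nready < max_ready && next < nhandles)
        {
            rc = WaitForMultipleObjects(nhandles - next, &handles[next],
                                        FALSE, 0);
            if (rc == WAIT_TIMEOUT || rc == WAIT_FAILED)
                break;
            ready[nready++] = next + (rc - WAIT_OBJECT_0);
            next += (rc - WAIT_OBJECT_0) + 1;
        }
        return nready;
    }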

> > I mention this now because I'm not sure whether to consider this an
> > 'open item' for 16, or merely an enhancement for 17.  I guess the
> > former, because someone might call that a new denial of service
> > vector.  On the other hand, if you fill up the listen queue for socket
> > 1 with enough vigour, you're also denying service to socket 1, so I
> > don't know if it's worth worrying about.  Opinions on that?
>
> I'm not sure either. It doesn't strike me as a particularly relevant
> bottleneck. And the old approach of doing more work for every single
> connection also made many connections worse, I think?

Alright, let's see if anyone else thinks this is worth fixing for 16.

> > diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
> > index f4123e7de7..cc7b572008 100644
> > --- a/src/backend/storage/ipc/latch.c
> > +++ b/src/backend/storage/ipc/latch.c
> > @@ -2025,6 +2025,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
> >        */
> >       cur_event = (WaitEvent *) &set->events[rc - WAIT_OBJECT_0 - 1];
> >
> > +loop:
> > +
>
> As far as I can tell, we'll never see WL_LATCH_SET or WL_POSTMASTER_DEATH. I
> think it'd look cleaner to move the body of if (cur_event->events & WL_SOCKET_MASK)
> into a separate function that we then also can call further down.

We could see them, AFAICS, and I don't see much advantage in making
that assumption?  Shouldn't we just shove it in a loop, just like the
other platforms' implementations?  Done in this version, which is best
viewed with git show --ignore-all-space.

> Seems like we could use returned_events to get nevents in the way you want it,
> without adding even more ++/-- to each of the different events?

Yeah.  This time I use reported_events.  I also fixed a maths failure:
I'd forgotten to use rc - WAIT_OBJECT_0, suggesting that CI never
reached the code.
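
For clarity, the arithmetic in question is just mapping the result of the
offset wait back to an absolute index, roughly (illustrative, not lifted
from the patch):

    offset = (rc - WAIT_OBJECT_0) + 1;
    rc = WaitForMultipleObjects(nhandles - offset, handles + offset, FALSE, 0);
    if (rc != WAIT_TIMEOUT && rc != WAIT_FAILED)
        signaled = offset + (rc - WAIT_OBJECT_0);    /* absolute index again */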

Attachment
