Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
> Now the documentation for WSAEventSelect says "The FD_WRITE network
> event is handled slightly differently. An FD_WRITE network event is
> recorded when a socket is first connected with a call to the connect,
> ConnectEx, WSAConnect, WSAConnectByList, or WSAConnectByName function or
> when a socket is accepted with accept, AcceptEx, or WSAAccept function
> and then after a send fails with WSAEWOULDBLOCK and buffer space becomes
> available. Therefore, an application can assume that sends are possible
> starting from the first FD_WRITE network event setting and lasting until
> a send returns WSAEWOULDBLOCK. After such a failure the application will
> find out that sends are again possible when an FD_WRITE network event is
> recorded and the associated event object is set."
> But while PQconnectPoll does connect() before setting connection status
> to CONNECTION_STARTED and returns PGRES_POLLING_WRITING so the FD_WRITE
> eventually happens, it does not do any writes in the code block that
> switches to CONNECTION_MADE and PGRES_POLLING_WRITING. That means
> FD_WRITE event is never recorded as per quoted documentation.
Ah-hah! Great detective work.
> Then what remains is why it works in libpq. If you look at
> pgwin32_select which is eventually called by libpq code, it actually
> handles the situation by trying empty WSASend on any socket that is
> supposed to wait for FD_WRITE and only calling the
> WaitForMultipleObjectsEx when WSASend failed with WSAEWOULDBLOCK, when
> the WSASend succeeds it immediately returns ok.
> So I did something similar in attached for WaitEventSetWaitBlock() and
> it indeed solves the issue my windows test machine. Now the
> coding/placement probably isn't the best (and there are no comments),
> but maybe somebody will find proper place for this now that we know the
> cause.
Yeah, I'm afraid we had better do something more or less like this.
It's interesting to speculate about whether WaitEventSet could keep
additional state that would avoid the need for a dummy send() every
time, but that seems like material for new development not a bug fix.
In any case, a quick review of the code suggests that there are no
performance-critical places where we wait for WL_SOCKET_WRITEABLE
unless we've already detected that we have to wait for the network;
so the dummy send() isn't going to do anything except consume a
few more CPU cycles before we get to the point of blocking. It may
not be worth worrying about.
I'll add some comments and push this. Does everyone agree that
this had better get back-patched, too?
regards, tom lane