Thread: Patch for Win32 blocking problem

Patch for Win32 blocking problem

From
Teodor Sigaev
Date:
This patch fixes a problem where a backend blocks in pgwin32_waitforsinglesocket()
when it tries to send something to the stats collector.
The patch does two things:

1) pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx is now called with a
finite timeout (100 ms) in the case of FD_WRITE on a UDP socket. If the timeout
expires, pgwin32_waitforsinglesocket() returns EINTR. Reason: as the tests show
(see below), the process may sleep forever in WaitForMultipleObjectsEx when the
timeout is infinite.

2) pgwin32_send(): add a loop around WSASend and pgwin32_waitforsinglesocket().
Reason: for a UDP socket, an 'ok' result from pgwin32_waitforsinglesocket()
does not guarantee that the socket is still free. It can become busy again, and
the following WSASend call will then fail with a WSAEWOULDBLOCK error.
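
The loop in 2) can be sketched roughly as follows. This is not the patch itself:
to keep it compilable outside Windows, fake_send() and fake_wait() are
hypothetical stand-ins for WSASend() and pgwin32_waitforsinglesocket(), and
struct fake_socket, enum send_status and send_with_retry() are names invented
for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for WSASend() results; not Winsock constants. */
enum send_status { SEND_OK, SEND_WOULDBLOCK, SEND_ERROR };

/* Scripted fake socket so the loop can be exercised without Winsock. */
struct fake_socket
{
    int busy_sends;     /* sends that report "would block" before one succeeds */
    int wait_times_out; /* nonzero: the wait hits its finite timeout */
    int send_calls;     /* number of send attempts made so far */
    int wait_calls;     /* number of waits made so far */
};

/* Stand-in for WSASend(): busy for the first busy_sends calls. */
enum send_status
fake_send(struct fake_socket *s, const void *buf, size_t len)
{
    (void) buf;
    (void) len;
    s->send_calls++;
    if (s->busy_sends > 0)
    {
        s->busy_sends--;
        return SEND_WOULDBLOCK;
    }
    return SEND_OK;
}

/*
 * Stand-in for pgwin32_waitforsinglesocket() with a finite timeout:
 * returns 1 when the socket looks writable, or 0 with errno = EINTR
 * when the timeout expires.
 */
int
fake_wait(struct fake_socket *s, int timeout_ms)
{
    (void) timeout_ms;
    s->wait_calls++;
    if (s->wait_times_out)
    {
        errno = EINTR;
        return 0;
    }
    return 1;
}

/*
 * Loop around send and wait. A successful wait is only a hint that the
 * socket is free, so the send is retried until it succeeds or fails hard.
 * Returns len on success, -1 on failure (errno is EINTR after a timeout).
 */
int
send_with_retry(struct fake_socket *s, const void *buf, size_t len)
{
    assert(s != NULL);
    for (;;)
    {
        enum send_status st = fake_send(s, buf, len);

        if (st == SEND_OK)
            return (int) len;
        if (st == SEND_ERROR)
        {
            errno = EIO;
            return -1;
        }
        /* Would block: wait up to 100 ms for the socket to free up. */
        if (!fake_wait(s, 100))
            return -1;
        /* Socket reported free, but it may be busy again: try once more. */
    }
}
```

With a socket that is busy for two sends, the loop makes three send attempts and
two waits before succeeding. If the wait times out instead, the caller sees -1
with errno set to EINTR and can decide whether to retry.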

Note that the situations above occur only under very high load and very rarely,
about once every several hours. Personally, I don't like the approach in 1),
but I can't find a better solution.

To reproduce the bug, I developed a test suite
(http://www.sigaev.ru/misc/wintest.tgz). The test runs one 'collector' and
several (32 by default) clients, which send a lot of packets to the collector.
The socket library is taken directly from pgsql. Installation & testing (under
MinGW):
% tar xzvf wintest.tgz
% cd wintest
% make
% ./serveres
The archive contains two versions of socket.c:
    socket.c.orig - as it is in pgsql
    socket.c      - already patched
fprintf() calls are added to pgwin32_waitforsinglesocket(), and with
socket.c.orig several clients never come out of the wait. It usually takes 1-3
minutes to reproduce. The test suite works harder than pgsql, and the block
occurs even on a uniprocessor box. It may be necessary to increase the number
of clients to reproduce the bug reliably.


Objections, comments, advice, suggestions?

I intend to commit the patch to all affected branches today or tomorrow if there
are no objections or better ideas.


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                     WWW: http://www.sigaev.ru/


Attachment

Re: Patch for Win32 blocking problem

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> Patch solves the problem with blocking backend in pgwin32_waitforsinglesocket()
> when it tries to send something to stat collector.

Adding the looping in pgwin32_send() seems clearly correct, since there
could be multiple processes trying to send to the collector at the same
time.  I find the proposed patch in pgwin32_waitforsinglesocket to be a
pretty ugly kluge though.  Are you sure it's needed given the other fix?
        regards, tom lane


Re: Patch for Win32 blocking problem

From
Teodor Sigaev
Date:
> time.  I find the proposed patch in pgwin32_waitforsinglesocket to be a
> pretty ugly kluge though.  Are you sure it's needed given the other fix?

The loop in pgwin32_send() doesn't prevent the infinite sleep in
WaitForMultipleObjectsEx in pgwin32_waitforsinglesocket. I'm not a Windows guru
at all, and I don't like that part of the patch either. I can't find a better
solution...

Maybe something like this (untested):

if ( isUDP && (what & FD_WRITE) )
for(;;) {
    r = WaitForMultipleObjects(100 ms);
    if ( r == WAIT_TIMEOUT ) {
        r = WSASend( zero-length packet ); /* see comments in pgwin32_select() */
        [ analyze result of WSASend:
            * if success then return 1
            * WSAEWOULDBLOCK - continue loop
            * SOCKET_ERROR - return 0
        ]
    } else
        break;
}

I'm not sure that is a cleaner way...
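
For illustration only, the control flow of that idea can be written as
compilable C. This is a sketch under assumptions, not the attached patch:
scripted_wait() and scripted_probe() are hypothetical stand-ins for the 100 ms
WaitForMultipleObjects call and the zero-length WSASend, and all type and
function names are invented so it can run without Winsock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; not Winsock or Win32 constants. */
enum wait_status { WAIT_EVENT, WAIT_TIMED_OUT };
enum probe_status { PROBE_OK, PROBE_WOULDBLOCK, PROBE_ERROR };
enum wait_outcome { OUTCOME_ERROR = 0, OUTCOME_WRITABLE = 1, OUTCOME_EVENT = 2 };

/* Scripted fake socket whose write notification never arrives. */
struct scripted_socket
{
    int busy_probes; /* probes that report "would block" before one succeeds */
    int waits;       /* number of timed waits performed */
    int probes;      /* number of zero-length sends attempted */
};

/* Stand-in for the 100 ms wait: models the lost event by always timing out. */
enum wait_status
scripted_wait(struct scripted_socket *s, int timeout_ms)
{
    (void) timeout_ms;
    s->waits++;
    return WAIT_TIMED_OUT;
}

/* Stand-in for a zero-length WSASend used to probe writability. */
enum probe_status
scripted_probe(struct scripted_socket *s)
{
    s->probes++;
    if (s->busy_probes > 0)
    {
        s->busy_probes--;
        return PROBE_WOULDBLOCK;
    }
    return PROBE_OK;
}

/*
 * On each timeout, probe the socket with an empty send:
 *   success     -> report writable (1)
 *   would block -> wait again
 *   hard error  -> report failure (0)
 * If the wait is signaled, leave the loop so the caller can handle the
 * event as usual (the "break" in the pseudocode).
 */
enum wait_outcome
wait_writable_udp(struct scripted_socket *s)
{
    assert(s != NULL);
    for (;;)
    {
        if (scripted_wait(s, 100) != WAIT_TIMED_OUT)
            return OUTCOME_EVENT;

        switch (scripted_probe(s))
        {
            case PROBE_OK:
                return OUTCOME_WRITABLE;
            case PROBE_ERROR:
                return OUTCOME_ERROR;
            case PROBE_WOULDBLOCK:
                break;          /* still busy: go back to waiting */
        }
    }
}
```

With a socket that stays busy for two probes, the loop performs three waits and
three probes and then reports the socket as writable, even though the event was
never signaled.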


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Patch for Win32 blocking problem

From
Teodor Sigaev
Date:
Attached patch implements that idea.

> Maybe something like this (untested):
>
> if ( isUDP && (what & FD_WRITE) )
> for(;;) {
>     r = WaitForMultipleObjects(100 ms);
>     if ( r == WAIT_TIMEOUT ) {
>         r = WSASend( zero-length packet ); /* see comments in pgwin32_select() */
>         [ analyze result of WSASend:
>             * if success then return 1
>             * WSAEWOULDBLOCK - continue loop
>             * SOCKET_ERROR - return 0
>         ]
>     } else
>         break;
> }

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment