Re: Possible explanation for Win32 stats regression test - Mailing list pgsql-hackers

From korry
Subject Re: Possible explanation for Win32 stats regression test
Date
Msg-id 1153154036.8500.12.camel@sakai.localdomain
In response to Re: Possible explanation for Win32 stats regression test failures  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Possible explanation for Win32 stats regression test  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> Ah-hah, I see it.  pgwin32_select() uses WaitForMultipleObjectsEx() with
> an event for the socket read-ready plus an event for signal arrival.
> It returns EINTR if the return code from WaitForMultipleObjectsEx shows
> the signal-arrival event as fired.  However, WaitForMultipleObjectsEx is
> defined to return the number of the *first* event in the list that is
> fired.  This means that if the socket comes read-ready at the same time
> the SIGALRM arrives, pgwin32_select() will ignore the signal, and it'll
> be processed by the subsequent pgwin32_recv().
>
> Now I don't know anything about the Windows scheduler, but I suppose it
> gives processes time quantums like everybody else does.  So "at the same
> time" really means "within the same scheduler clock tick", which is not
> so unlikely after all.  In short, before the just-committed patch, the
> Windows stats collector would fail if a stats message arrived during the
> same clock tick that its SIGALRM timeout expired.
>
> I think this explains not only the intermittent stats regression
> failures, but the reports we've heard from Merlin and others about the
> stats collector being unstable under load on Windows.  The heavier the
> load of stats messages, the more likely one is to arrive during the tick
> when the timeout expires.

There's a second problem in pgwin32_waitforsinglesocket() that may be
getting in your way.

Inside pgwin32_waitforsinglesocket(), we create a kernel synchronization
object (an Event) and associate that Event with the socket.  When the
TCP/IP stack detects interesting traffic on the socket, it signals the
Event object ("interesting" in this case is READ, WRITE, CLOSE, or ACCEPT,
depending on the caller), and that wakes up the call to
WaitForMultipleObjectsEx().

That all works fine, unless you have two or more sockets in the backend
(the important part is that src/include/port/win32.h #define's select()
and other socket-related functions - if you compile a piece of network
code that happens to #include port/win32.h, you'll get the pgwin32_xxx()
versions).

The problem is that, each time you go through
pgwin32_waitforsinglesocket(), you tie the *same* kernel object (waitevent
is static) to each socket.  If you have more than one socket, you'll tie
each socket to the same kernel event.  The kernel will signal that Event
whenever interesting traffic appears on *any* of the sockets.  The net
effect is that, if you are waiting for activity on socket A, any activity
on socket B will also awaken WaitForMultipleObjectsEx().  If you then try
to read from socket A, you'll get an "operation would block" error because
nothing happened on socket A.

The fix is pretty simple - just call WSAEventSelect( s, waitevent, 0 )
after WaitForMultipleObjectsEx() returns.  That disassociates the socket
from the Event (it will get re-associated the next time
pgwin32_waitforsinglesocket() is called).

I ran into this problem working on the PL/pgSQL debugger and I haven't
gotten around to posting a patch yet, sorry.

            -- Korry (korryd@enterprisedb.com)
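
The shared-event problem described in this message can be illustrated with a small portable model. This is only a sketch under stated assumptions: it does not call the Win32 or Winsock APIs, and every name in it (ToyEvent, ToySocket, toy_event_select, toy_deliver, toy_wait, spurious_wakeup) is hypothetical and not taken from the PostgreSQL source. An event is modelled as a plain flag, the wait as a non-blocking poll of that flag, and passing NULL to toy_event_select stands in for calling WSAEventSelect() with a network-event mask of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the Win32 API: an "event" is just a flag. */
typedef struct { bool signaled; } ToyEvent;

typedef struct {
    ToyEvent *event;    /* event currently tied to this socket, or NULL */
    bool      readable; /* true when data is pending on this socket */
} ToySocket;

/* Plays the role of the single static waitevent shared by all sockets. */
ToyEvent shared_event;

/* Models associating an event with a socket; ev == NULL models a mask
 * of 0, i.e. dropping the association.  If data is already pending, the
 * newly associated event is signaled right away. */
void toy_event_select(ToySocket *s, ToyEvent *ev)
{
    assert(s != NULL);
    s->event = ev;
    if (ev != NULL && s->readable)
        ev->signaled = true;
}

/* Models traffic arriving: signals whatever event the socket is tied to. */
void toy_deliver(ToySocket *s)
{
    assert(s != NULL);
    s->readable = true;
    if (s->event != NULL)
        s->event->signaled = true;
}

/* Models one pass through the wait routine for socket s.  Returns true
 * if the wait "woke up".  detach_after applies the proposed fix. */
bool toy_wait(ToySocket *s, bool detach_after)
{
    bool woke;

    toy_event_select(s, &shared_event);
    woke = shared_event.signaled;    /* poll standing in for the wait */
    shared_event.signaled = false;   /* behave like an auto-reset event */
    if (detach_after)
        toy_event_select(s, NULL);   /* the fix: drop the association */
    return woke;
}

/* Scenario: wait once on B, let traffic arrive on B, then wait on A.
 * Returns true if the wait on A woke up although A has nothing to read. */
bool spurious_wakeup(bool detach_after)
{
    ToySocket a = { NULL, false };
    ToySocket b = { NULL, false };
    bool woke_a;

    shared_event.signaled = false;
    (void) toy_wait(&b, detach_after);  /* B gets tied to shared_event */
    toy_deliver(&b);                    /* activity on B only */
    woke_a = toy_wait(&a, detach_after);
    return woke_a && !a.readable;
}
```

In this model, spurious_wakeup(false) reports the stray wakeup on socket A, because B is still tied to the shared event when its traffic arrives. spurious_wakeup(true) does not, because B was detached after its own wait. The data on B is not lost either: the next toy_wait() on B re-associates the event and sees the pending data.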
