Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0
Date
Msg-id CA+TgmoY72e1yYD+YnuWRaN7pTh4FCgXMPqkpNAWQp7yxek9JUQ@mail.gmail.com
Whole thread Raw
In response to Hang in pldebugger after git commit : 98a64d0  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
On Tue, Dec 13, 2016 at 11:19 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-12-12 16:46:38 +0900, Michael Paquier wrote:
>> OK, I think that I have spotted an issue after additional read of the
>> code. When a WSA event is used for read/write activity on a socket,
>> the same WSA event gets reused again and again. That's fine for
>> performance reasons
>
> It's actually also required to not loose events,
> i.e. correctness. Windows events are edge, not level triggered.

Windows might not lose the event, but PostgreSQL can certainly lose
the event.  In the Windows implementation of WaitEventSetBlock, we'll
ignore WL_SOCKET_READABLE if WL_LATCH_SET also fires.  Now, since the
event is edge-triggered[1], it never fires again and we hang.  Oops.
Even if we rewrote WaitEventSetBlock to always process all of the
events, there's nothing in the latch API that requires the caller to
read from the socket the first time WL_SOCKET_READABLE shows up.  The
caller is perfectly entitled to write:

if (rv & WL_LATCH_SET)  // do stuff
else if (rv & WL_SOCKET_READABLE)  // do other stuff

I suppose we could make a coding rule that the caller must not rely on
WL_SOCKET_READABLE being level-triggered, but I think that's likely to
lead to a stream of bugs that only show up on Windows, because most
people are used to the Linux semantics and will naturally write code
like the above which isn't robust in the face of an edge-triggered
event -- just as you yourself did in WaitEventSetBlock.

> The
> previous implementation wasn't correct.  So just resetting things ain't
> ok.

How was it incorrect?  The previous implementation *worked* and the
new one doesn't, at least in the case complained of here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms741576(v=vs.85).aspx



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Performance degradation in Bitmapscan (commit 75ae538bc3168bf44475240d4e0487ee2f3bb376)