Re: Some 9.5beta2 backend processes not terminating properly? - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Some 9.5beta2 backend processes not terminating properly?
Date
Msg-id CAA4eK1LFEwohKjuDAfAdxqztq4rTk4-PXp0q73rHUmU0xDrkaQ@mail.gmail.com
In response to Re: Some 9.5beta2 backend processes not terminating properly?  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: Some 9.5beta2 backend processes not terminating properly?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Jan 2, 2016 at 5:02 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 2016-01-02 12:05, Amit Kapila wrote:
I am also able to reproduce it now. The reason I couldn't before was
that I didn't have the latest .NET Framework and Visual Studio, which
are a must for recent versions of Npgsql.

One probable reason for the problem seems to be that on Windows we now
emulate non-blocking behaviour by setting pgwin32_noblock = true, which
makes pgwin32_recv() return EWOULDBLOCK, so we then wait using
WaitLatchOrSocket() instead of pgwin32_waitforsinglesocket().
There are some differences in the way the two APIs (WaitLatchOrSocket()
and pgwin32_waitforsinglesocket()) wait, and maybe that is the reason
for this behaviour.  One thing I have tried: if I don't set
pgwin32_noblock in secure_raw_read(), the problem doesn't occur, which
led to the above reasoning.  I am still investigating.


Well, without pgwin32_noblock = true we never enter the code block that calls WaitLatchOrSocket() and hangs; in my testing that block was always entered because of EWOULDBLOCK.
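
For readers following the exchange above, the general "non-blocking read, then wait for readability" pattern can be sketched as below. This is a minimal illustration using POSIX sockets and poll(), not the PostgreSQL or Windows code under discussion; set_nonblocking() and read_with_wait() are hypothetical helper names introduced only for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: try recv() first; if it reports that it would block,
 * wait until the socket is readable and retry.
 * Returns bytes read, 0 if the peer closed the connection, -1 on error
 * (errno is ETIMEDOUT if nothing arrived within timeout_ms).
 */
ssize_t read_with_wait(int fd, void *buf, size_t len, int timeout_ms)
{
    for (;;)
    {
        ssize_t n = recv(fd, buf, len, 0);

        if (n >= 0)
            return n;           /* data, or 0 meaning the peer closed */
        if (errno == EINTR)
            continue;           /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* real error */

        /* Nothing to read yet: wait for readability (or hangup). */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc == 0)
        {
            errno = ETIMEDOUT;
            return -1;
        }
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

With poll(), a closed peer keeps being reported on every call, so the retry loop reaches recv() and sees the 0 return. The question raised in this thread is what happens when the wait primitive does not behave that way for a closed socket.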


What I wanted to say is that socket closure is not handled the same
way in WaitLatchOrSocket() and pgwin32_waitforsinglesocket(), which is
how this problem can arise, and that seems to be the right direction
to pursue.  I have found that in WaitLatchOrSocket(), even when the
socket is closed, we record the result as WL_SOCKET_READABLE and then
try to wait again, whereas pgwin32_waitforsinglesocket() handles this
case properly.  If we remember the closed-socket event and then take
appropriate action, this problem won't happen.  The attached patch,
which is by no means a complete fix, shows what I mean.  With it, the
problem mentioned by Shay doesn't happen, although I get a LOG message
because proper handling of socket closure still needs to be done in
this path.  The patch is based on the code after commit
387da18874afa17156ee3af63766f17efb53c4b9.  I will do testing and
refine the fix based on HEAD later, as I am done for today.
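
The idea of remembering the closed-socket event can be illustrated with a small toy model. This is not the attached patch and not PostgreSQL code: SimSocket, wait_once(), and the EV_*/RES_* constants are hypothetical names, and the model assumes the close notification is delivered only once, so a later wait that has forgotten it would never wake up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event bits delivered by the simulated socket. */
enum { EV_READ = 1 << 0, EV_CLOSE = 1 << 1 };

/* Hypothetical result bits; RES_WOULD_HANG stands for "would block forever". */
enum { RES_WOULD_HANG = 0, RES_READABLE = 1 << 0, RES_CLOSED = 1 << 1 };

typedef struct
{
    const int  *events;         /* scripted events, one per wakeup */
    size_t      count;          /* number of scripted events */
    size_t      next;           /* index of the next event to deliver */
    bool        closed_seen;    /* remembered close notification */
} SimSocket;

/*
 * One wait on the simulated socket.
 * With remember_close = false, a close is reported only as "readable" and
 * then forgotten, so the following wait has nothing left to wake it.
 * With remember_close = true, the close is recorded and every later wait
 * returns immediately with RES_CLOSED set.
 */
int wait_once(SimSocket *s, bool remember_close)
{
    int ev;
    int res = 0;

    if (remember_close && s->closed_seen)
        return RES_READABLE | RES_CLOSED;   /* do not block again */

    if (s->next >= s->count)
        return RES_WOULD_HANG;              /* no event will ever arrive */

    ev = s->events[s->next++];
    if (ev & EV_READ)
        res |= RES_READABLE;
    if (ev & EV_CLOSE)
    {
        res |= RES_READABLE;
        if (remember_close)
        {
            s->closed_seen = true;
            res |= RES_CLOSED;
        }
    }
    return res;
}
```

Feeding the script { EV_READ, EV_CLOSE } through wait_once() three times shows the difference: the forgetful variant ends in RES_WOULD_HANG on the third call, while the remembering variant keeps reporting RES_CLOSED so the caller can act on the closure.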

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
