Re: Some 9.5beta2 backend processes not terminating properly? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Some 9.5beta2 backend processes not terminating properly?
Date
Msg-id 20151230173734.hx7jj2fnwyljfqek@alap3.anarazel.de
In response to Re: Some 9.5beta2 backend processes not terminating properly?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Some 9.5beta2 backend processes not terminating properly?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Some 9.5beta2 backend processes not terminating properly?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2015-12-30 12:30:43 -0500, Tom Lane wrote:
> Nor OS X.  Ugh.  My first thought was that ac1d7945f broke this, but
> that's only in HEAD not 9.5, so some earlier change must be responsible.

The backtrace in
http://archives.postgresql.org/message-id/CADT4RqBo79_0Vx%3D-%2By%3DnFv3zdnm_-CgGzbtSv9LhxrFEoYMVFg%40mail.gmail.com
seems to indicate that it's really WaitLatchOrSocket() not noticing the
socket is closed.

For a moment I had the theory that Port->sock might be invalid because
it somehow got closed. That'd then remove the socket from the waited-on
events, which would explain the behaviour. But afaics that's really only
possible via pq_init()'s on_proc_exit(socket_close, 0); and I can't see
how that could be reached.

FWIW, the

    if (sock == PGINVALID_SOCKET)
        wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);

block in both latch implementations looks like a problem waiting to happen.
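
To illustrate the concern, here is a minimal standalone sketch of what that masking does. The constant values and both function names are stand-ins made up for illustration, not PostgreSQL's actual definitions, and the stricter check is only one hypothetical alternative:

```c
#include <stdbool.h>

/* Stand-in values for illustration only; not copied from PostgreSQL headers. */
#define WL_LATCH_SET        (1 << 0)
#define WL_SOCKET_READABLE  (1 << 1)
#define WL_SOCKET_WRITEABLE (1 << 2)
#define PGINVALID_SOCKET    (-1)

/*
 * Mirrors the quoted block: if the socket is invalid, the socket events
 * are silently dropped from the set of events being waited on.
 */
int
effective_wake_events(int wakeEvents, int sock)
{
    if (sock == PGINVALID_SOCKET)
        wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
    return wakeEvents;
}

/*
 * Hypothetical stricter variant: report a caller that asks for socket
 * events on an invalid socket instead of hiding the mismatch.
 */
bool
wake_events_valid(int wakeEvents, int sock)
{
    bool wants_socket =
        (wakeEvents & (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE)) != 0;

    return !(wants_socket && sock == PGINVALID_SOCKET);
}
```

With the masking in place, a caller that passes WL_LATCH_SET | WL_SOCKET_READABLE together with an invalid socket ends up waiting on the latch alone, and nothing tells it that the socket condition can never fire.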


