Re: Dangling Client Backend Process - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Dangling Client Backend Process
Date
Msg-id 20151030144635.GA6064@alap3.anarazel.de
In response to Re: Dangling Client Backend Process  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Dangling Client Backend Process  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2015-10-30 09:48:33 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Hmm.  ProcessInterrupts() signals some FATAL errors while the
> > connection is idle, and rumor has it that that works: the client
> > doesn't immediately read the FATAL error, but the next time it sends a
> > query, it tries to read from the connection and sees the FATAL error
> > at that time.  I wonder why that's not working here.
>
> A likely theory is that the kernel is reporting failure to libpq's
> send() because the other side of the connection is already gone.
> This would be timing-dependent of course.

Looking at an strace, psql over a unix socket is actually receiving the
error message:
recvfrom(3, "E\0\0\0lSFATAL\0C57P01\0Mterminating "..., 16384, 0, NULL, NULL) = 109
but psql does print:
server closed the connection unexpectedly

it happens to work over localhost:
FATAL:  57P01: terminating connection due to unexpected postmaster exit
LOCATION:  secure_read, be-secure.c:170
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

the problem seems to be the loop eating all the remaining input:
void
pqHandleSendFailure(PGconn *conn)
{
    /*
     * Accept any available input data, ignoring errors.  Note that if
     * pqReadData decides the backend has closed the channel, it will close
     * our side of the socket --- that's just what we want here.
     */
    while (pqReadData(conn) > 0)
         /* loop until no more data readable */ ;

after the first pqReadData() there's no remaining input and thus the
second call to pqReadData()'s pqsecure_read reads 0 and this is hit:
    /*
     * OK, we are getting a zero read even though select() says ready. This
     * means the connection has been closed.  Cope.
     */

definitelyEOF:
    printfPQExpBuffer(&conn->errorMessage,
                      libpq_gettext(
                                "server closed the connection unexpectedly\n"
                   "\tThis probably means the server terminated abnormally\n"
                             "\tbefore or while processing the request.\n"));

adding a parseInput(conn) into the loop yields the expected
FATAL:  57P01: terminating connection due to unexpected postmaster exit

Is there really any reason not to do that?

Greetings,

Andres Freund


