Re: idle_in_transaction_timeout - Mailing list pgsql-hackers

From Andres Freund
Subject Re: idle_in_transaction_timeout
Date
Msg-id 20140630011944.GI26930@awork2.anarazel.de
Whole thread Raw
In response to Re: idle_in_transaction_timeout  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2014-06-29 19:51:05 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > Note that we're *inside* recv() here.
> 
> Well, actually, the recv() has probably failed with an EINTR at this
> point, or else it will when/if control returns.

Well, the point is that we'll be reentering ssl when sending the error
message, without having left it properly. I.e. we're already hitting the
problem you've described.

Sure, if we're not FATALing, it'll return EINTR after that. But that's
not really the point here.

I wonder if we should instead *not* set ImmediateInterruptOK = true and
do a CHECK_FOR_INTERRUPT in secure_read, after returning from
openssl. When the recv in my_sock_read() sets BIO_set_retry_read()
because the signal handler caused an EINTR to be returned openssl will
return control to the caller. Which then can do the
CHECK_FOR_INTERRUPT(), jumping out without having to deal with openssl.

Probably has some problems with annoying platforms like windows though
:(. Not sure how the signal emulation plays with recv() being
interrupted there.

> >> Uh, no, that would pretty much destroy the point of trying to report
> >> an error message at all.
> 
> > I only mean that we should do so in scenarios where we're currently
> > reading from the client. For die(), while we're reading from the client,
> > we're sending a message the client doesn't expect - and thus
> > unsurprisingly doesn't report.
> 
> This is nonsense.  The client will see the error as a response to its
> query.

Man. Don't be so quick to judge stuff you can't immediately follow or
find wrongly stated as 'nonsense'.

We're sending messages back to the client while the client isn't
expecting any from the server. E.g. evidenced by the fact that libpq's
pqParseInput3() doesn't treat error messages during that phase as
errors... It instead emits them via the notice hooks, expecting them to
be NOTICEs...
This e.g. means that there's no error message stored in
conn->errorMessage.

That happens to be somewhat ok in the case of FATALs because the
connection is killed afterwards so any confusion won't be long lived,
but you can't tell me that you'd e.g. find it acceptable to send an
ERROR there.

>  Of course, if it gets an EPIPE it might complain about that first
> -- but that would only be likely to happen with a multi-packet query
> string, at least over a TCP connection.
> 
> Experimenting with this on a RHEL6 box, I do see the "FATAL:  terminating
> connection due to administrator command" message if I SIGTERM a backend
> while idle and it's using a TCP connection; psql sees that as a response
> next time it issues a command.  I do get the EPIPE behavior over a
> Unix-socket connection, but I wonder if we shouldn't try to fix that.
> It would make sense to see if there's data available before complaining
> about the EPIPE.

Even over unix sockets the data seems to be read - courtesy of
pqHandleSendFailure():
sendto(3, "Q\0\0\0\16SELECT 1;\0", 15, MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe)
recvfrom(3, "E\0\0\0mSFATAL\0C57P01\0Mterminating "..., 16384, 0, NULL, NULL) = 110

The reason we don't print anything is that pqDropConnection(), which is
called by the second pqReadData() invocation in the loop, resets the
data positions:     conn->inStart = conn->inCursor = conn->inEnd = 0;

Moving the parseInput(conn) into the loop seems to "fix" it.


Haven't analyzed why, but if FATALs arrive during a query they're
printed twice. I've seen that before...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Deferring some AtStart* allocations?
Next
From: Andres Freund
Date:
Subject: Re: Deferring some AtStart* allocations?