Thread: Disconnecting and cancelling a statement

Disconnecting and cancelling a statement

From

Jeff Davis

Date:

06 September 2011, 23:07:30

I'm looking for a reliable way for a client to disconnect from a backend
such that any running query is terminated.

Right now, PostgreSQL doesn't seem to make an effort to detect a client
cancellation. For instance, if you do a "select pg_sleep(1000)" and then
kill -9 the client, the SELECT will remain running. That's not so much
of a problem for sleep, but if it's doing real work, then it's wasting a
lot of effort (and perhaps not terminating in any reasonable amount of
time).

And even if the client makes an effort to cancel and there are no major
network problems, then I still don't see a good method. Because the
cancellation request is sent out-of-band to the postmaster, then it's in
a race with the (asynchronous) query that you just sent. If the signal
reaches the backend before the query does, then the SIGINT becomes a
no-op (because it's still idle), and then the query arrives, and then
the client does PQfinish, the backend will still be alive doing a bunch
of needless work.

I have attached a simple C program that demonstrates the problem (must
be run from same host as PG because it uses SIGSTOP/SIGCONT to reproduce
race). After you run it, see how the "SELECT pg_sleep(1000)" is still
running, despite the client being disconnected.

There are two solutions that I see, neither of which look great:

1. Make a separate connection, and issue pg_terminate_backend() before
PQfinish. It works because a SIGTERM will not be a no-op on an idle
backend. This solution requires superuser privileges (not acceptable),
plus it's a little ugly.

2. Keep sending cancellation requests in a loop with a delay, consuming
input each time until PQisBusy() returns false. Obviously fairly ugly
and error prone, but is somewhat acceptable.

Any other ideas? There is no PQterminate, unforunately.
statement_timeout is not feasible, as the statement might legitimately
run for a very long time.

This is all compounded by the fact that terminating the backend directly
is no guarantee of proper shutdown, either:

http://archives.postgresql.org/pgsql-general/2009-03/msg00434.php

That means that there is no way to nicely and reliably shut down
postgresql from the client alone, nor from the server alone. The only
way is to send a SIGTERM to the backend *and* terminate the client
connection. Unless someone has a better idea?

Thoughts?

Regards,
    Jeff Davis

Attachment

client_disconnect.c

Re: Disconnecting and cancelling a statement

From

Craig Ringer

Date:

07 September 2011, 03:53:24

On 7/09/2011 10:00 AM, Jeff Davis wrote:
> I'm looking for a reliable way for a client to disconnect from a backend
> such that any running query is terminated.
>
> Right now, PostgreSQL doesn't seem to make an effort to detect a client
> cancellation. For instance, if you do a "select pg_sleep(1000)" and then
> kill -9 the client, the SELECT will remain running.

pg_sleep isn't a good test. In fact, Pg _does_ make an effort to detect
when a client dies, and will try to terminate the query. It does this
via explicit checks at various points, none of which are reached while
Pg is idling in a sleep() syscall. During more typical query processing
you'll usually find that a query gets terminated when the client dies.

Pg must find out when the client dies, though. If the client just goes
away - such as with a laptop on wifi that wanders out of range - it
won't know about it until it next attempts to send data to the client.
To address this, if you want reliable client dropout detection, you need
to enable tcp keepalives and set them to quite aggressive so the OS will
periodically test the connection for aliveness.
> And even if the client makes an effort to cancel and there are no major
> network problems, then I still don't see a good method. Because the
> cancellation request is sent out-of-band to the postmaster, then it's in
> a race with the (asynchronous) query that you just sent.

Yeah, this bit frustrates me too. Not only is the request async, but it
requires the establishment of a new full-featured database connection
first. That's a mess with pooling and sharding/clustering setups where
there's no guarantee the connection will go to the same host. It's also
expensive in terms of round trips, setup work on the server, and sheer
time taken. I found supporting query cancels to be a right PITA and was
never satisifed with the solution I ended up with in my (Java/PgJDBC) app.

I'd love to see Pg accept OOB cancel requests done via lightweight
connections that don't go through the whole setup process. If the server
sent a statement "cookie" when executing a statement, the client could
hang onto that and use it to issue a cancel for that statement and only
that statement by establishing a new connection to the server and
sending that cookie rather than the usual negotiation and auth process.
There'd be no need to go through full auth or even bother with SSL,
because it's a one-time random (or hash-based) code. Pooling systems
could send this to _all_ servers, or it could be prefixed with a server
identifier that helped poolers route it to the right server.

The same statement cookie could be used to support full connection-based
cancellation by passing it to a "pg_cancel_statement('blahcookieblah');"
function.

There's probably no safe way (short of a fully threaded backend - and I
did say "safe") to avoid the need for a new control connection and use a
single tcp connection for everything, though. Sending the cancel message
in-band via the regular connection will work for non-SSL connections,
since the server can periodically check for new input and process it
without risking blocking. This cannot work with SSL though, as there's
no way to tell if data waiting on the socket is a whole SSL message or
only a partial one that'll block waiting for new input when read. Even
were that problem worked around with a separate socket reader thread
that _can_ block, the backend still has to do periodic checks for input,
so it'd suffer from some of the same issues as the current approach.

> Any other ideas? There is no PQterminate, unforunately.
> statement_timeout is not feasible, as the statement might legitimately
> run for a very long time.

A real fix requires backend enhancements like unique statement
identifiers, IMO.

--
Craig Ringer

Re: Disconnecting and cancelling a statement

From

Jeff Davis

Date:

07 September 2011, 04:14:38

On Wed, 2011-09-07 at 14:46 +0800, Craig Ringer wrote:
> > Right now, PostgreSQL doesn't seem to make an effort to detect a client
> > cancellation. For instance, if you do a "select pg_sleep(1000)" and then
> > kill -9 the client, the SELECT will remain running.
>
> pg_sleep isn't a good test. In fact, Pg _does_ make an effort to detect
> when a client dies, and will try to terminate the query. It does this
> via explicit checks at various points, none of which are reached while
> Pg is idling in a sleep() syscall. During more typical query processing
> you'll usually find that a query gets terminated when the client dies.

pg_sleep is not merely a wrapper around the sleep system call, it does
call CHECK_FOR_INTERRUPTS() periodically. Also, you can see that
pg_sleep can be easily canceled if the signal arrives while the query is
actually running (try in psql, or try removing the SIGSTOP/SIGCONT
signals from the C code I attached to the first message).

Try with a large cartesian product and you should get the same problem.

> Pg must find out when the client dies, though. If the client just goes
> away - such as with a laptop on wifi that wanders out of range - it
> won't know about it until it next attempts to send data to the client.

How does it know, even on a good network connection, when the client
disconnects? I attached a reproducible case, so you should see what I'm
talking about.

> To address this, if you want reliable client dropout detection, you need
> to enable tcp keepalives and set them to quite aggressive so the OS will
> periodically test the connection for aliveness.

I'd be happy if it just detected a disconnect that the OS already knows
about, e.g. explicitly closing the socket.

> I'd love to see Pg accept OOB cancel requests done via lightweight
> connections that don't go through the whole setup process.

It does that for cancel (see PQcancel), but there is no equivalent for
termination.

>  If the server
> sent a statement "cookie" when executing a statement, the client could
> hang onto that and use it to issue a cancel for that statement and only
> that statement by establishing a new connection to the server and
> sending that cookie rather than the usual negotiation and auth process.
> There'd be no need to go through full auth or even bother with SSL,
> because it's a one-time random (or hash-based) code. Pooling systems
> could send this to _all_ servers, or it could be prefixed with a server
> identifier that helped poolers route it to the right server.

That's not too far from what's already done -- again, see the source for
PQcancel() and processCancelRequest().

Regards,
    Jeff Davis