Re: killing pg_dump leaves backend process - Mailing list pgsql-hackers

From Greg Stark
Subject Re: killing pg_dump leaves backend process
Date
Msg-id CAM-w4HP5=EEHtM6-FgV2=mCohYK6RJzgsvSAg7GM20KuNNKwFw@mail.gmail.com
In response to Re: killing pg_dump leaves backend process  (Christopher Browne <cbbrowne@gmail.com>)
List pgsql-hackers

I think this is utterly the wrong way to think about this.

TCP is designed to be robust against transient network outages; they are *not* supposed to cause disconnections. The purpose of keepalives is to distinguish still-valid live connections from stale ones whose remote end is no longer present.

Keepalives that trigger on a timescale of less than several times the MSL (maximum segment lifetime) are simply broken and make TCP unreliable. That means they cannot trigger in less than many minutes.

This case is one that should just work, and work immediately. From the user's point of view: when a client dies cleanly, the kernel on the client side is fully aware the connection is closed and the network is working fine, so the server should learn that the client has gone away *immediately*. There's no excuse for any polling or timeouts.
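Greg's point above can be demonstrated with a small sketch (Python is an illustrative stand-in for what the backend's socket sees; the socketpair is an assumption, not how Postgres actually wires its connections): when the peer closes cleanly, the kernel delivers EOF to the other side at once, with no keepalive timer involved.

```python
# Sketch: a clean close by one end of a TCP-like connection is visible
# to the other end immediately as EOF, without any keepalive or timeout.
import socket

a, b = socket.socketpair()
a.close()                 # the "client" dies cleanly; kernel sends FIN
data = b.recv(1024)       # returns immediately with EOF, no polling needed
print(data == b"")        # True: the "server" sees the disconnect at once
b.close()
```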

--
greg

On 10 Aug 2013 17:30, "Christopher Browne" <cbbrowne@gmail.com> wrote:
On Sat, Aug 10, 2013 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> I noticed pg_dump does not exit gracefully when killed.
>> start pg_dump
>> kill pg_dump by ctrl-c
>> ps x
>
>> 27246 ?        Ds    96:02 postgres: t-ishii dbt3 [local] COPY
>> 29920 ?        S      0:00 sshd: ishii@pts/5
>> 29921 pts/5    Ss     0:00 -bash
>> 30172 ?        Ss     0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting
>
>> As you can see, after killing pg_dump, a backend process is (LOCK
>> TABLE waiting) left behind. I think this could be easily fixed by
>> adding signal handler to pg_dump so that it catches the signal and
>> issues a query cancel request.
>
> If we think that's a problem (which I'm not convinced of) then pg_dump
> is the wrong place to fix it.  Any other client would behave the same
> if it were killed while waiting for some backend query.  So the right
> fix would involve figuring out a way for the backend to kill itself
> if the client connection goes away while it's waiting.
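The client-side fix Tatsuo proposed, a signal handler that issues a query-cancel request before exiting, might look roughly like this (a Python sketch, not pg_dump's actual C code; `cancel_request()` is a hypothetical stand-in for libpq's PQcancel):

```python
# Sketch of a client installing a SIGINT handler that cancels the
# in-flight backend query before the client goes away.
import os
import signal

cancelled = []

def cancel_request():
    # Hypothetical stand-in for libpq's PQcancel(): ask the backend
    # to abort the current query on the client's behalf.
    cancelled.append(True)

def handle_sigint(signum, frame):
    cancel_request()
    # A real client would exit here; omitted so the sketch is testable.

signal.signal(signal.SIGINT, handle_sigint)
os.kill(os.getpid(), signal.SIGINT)   # simulate Ctrl-C from the user
print(bool(cancelled))                # True: cancel was issued on interrupt
```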

This seems to me to be quite a bit like the TCP keepalive issue.

We noticed with Slony that if something ungraceful happens in the
networking layer (the specific thing noticed was someone shutting off
networking, e.g. "/etc/init.d/networking stop" before shutting down
Postgres+Slony), the usual timeouts are really rather excessive, on
the order of a couple hours.

Probably it would be desirable to reduce the timeout period, so that
the server could figure out that clients are incommunicado "reasonably
quickly."  It's conceivable that it would be apropos to diminish the
timeout values in postgresql.conf, or at least to recommend that users
consider doing so.
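The timeout knobs in question already exist as server GUCs (tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count); lowering them in postgresql.conf, as suggested above, would look like this (the values shown are illustrative, not a recommendation):

```
# postgresql.conf -- server-side TCP keepalive settings (0 = OS default)
tcp_keepalives_idle = 60        # seconds of idleness before first probe
tcp_keepalives_interval = 10    # seconds between unacknowledged probes
tcp_keepalives_count = 5        # probes lost before the connection drops
```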
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


