Dealing with network-dead clients - Mailing list pgsql-hackers

From Oliver Jowett
Subject Dealing with network-dead clients
Date
Msg-id 421002A5.6090701@opencloud.com
Responses Re: Dealing with network-dead clients  (Richard Huxton <dev@archonet.com>)
List pgsql-hackers
I'm currently trying to find a clean way to deal with network-dead 
clients that are in a transaction and holding locks, etc.

The normal "client closes socket" case works fine. The scenario I'm 
worried about is when the client machine falls off the network entirely 
for some reason (ethernet problem, kernel panic, machine catches 
fire, ...). From what I can see, if the connection is idle at that point, 
the server won't notice until TCP-level SO_KEEPALIVE probing kicks in, 
which by default takes over 2 hours on an idle connection. I'm looking 
for something more like a 30-60 second turnaround if the client is 
holding locks.

The options I can see are:

1) tweak TCP keepalive intervals down to a low value, system-wide
2) use (nonportable) setsockopt calls to tweak TCP keepalive settings on 
a per-socket basis.
3) implement an idle timeout on the server so that open transactions 
that are idle for longer than some period are automatically aborted.

(1) is very ugly because it is system-wide.
(2) is not portable.

Also I'm not sure how well extremely low keepalive settings behave.
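
For concreteness, here is a rough sketch of what option (2) could look 
like in C. It assumes a platform that exposes the TCP_KEEPIDLE, 
TCP_KEEPINTVL and TCP_KEEPCNT socket options (Linux does; others may 
not, hence the #ifdefs). The helper names set_keepalive and 
keepalive_detect_secs are made up for illustration and are not existing 
backend functions. The arithmetic helper shows where the "over 2 hours" 
comes from: with the usual Linux defaults of 7200 s idle, 75 s between 
probes and 9 probes, detection takes about 7200 + 9 * 75 = 7875 s.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Approximate time (seconds) before a dead peer is noticed on an idle
 * connection: the idle period before the first probe, plus one probe
 * interval for each unanswered probe.
 */
static int
keepalive_detect_secs(int idle, int interval, int count)
{
    return idle + interval * count;
}

/*
 * Hypothetical helper: enable keepalive on a single socket and, where
 * the platform provides the options, tighten its timing.
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
static int
set_keepalive(int fd, int idle, int interval, int count)
{
    int on = 1;

    assert(idle > 0 && interval > 0 && count > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
#ifdef TCP_KEEPIDLE
    /* seconds of idleness before the first probe is sent */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPINTVL
    /* seconds between successive probes */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPCNT
    /* unanswered probes before the connection is dropped */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
#endif
    (void) idle;
    (void) interval;
    (void) count;
    return 0;
}
```

Calling set_keepalive(fd, 30, 10, 3) on an accepted client socket would 
give roughly the 30-60 second turnaround mentioned above 
(30 + 3 * 10 = 60 s), on platforms where all three options exist.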

(3) seems like a proper solution. I've searched the archives a bit and 
transaction timeouts have been suggested before, but there seems to be 
some resistance to them.

I was thinking along the lines of a SIGALRM-driven timeout that starts 
at the top of the query-processing loop when in a transaction and is 
cancelled when client traffic is received. I'm not sure exactly what 
should happen when the timeout occurs, though. Should it kill the entire 
connection, or just roll back the current transaction? If the connection 
stays alive, the fun part seems to be in avoiding confusing the client 
about the current transaction state.

Any suggestions on what I should do here?

-O

