Re: zombie connections - Mailing list pgsql-hackers

From Tom Lane
Subject Re: zombie connections
Date
Msg-id 9611.1585929018@sss.pgh.pa.us
Whole thread Raw
In response to Re: zombie connections  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: zombie connections
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 3, 2020 at 10:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In general I think the threshold problem for a patch like this will be
>> "how do you keep the added overhead down".  As Robert noted upthread,
>> timeout.c is quite a bit shy of being able to handle timeouts that
>> persist across statements.  I don't think that there's any fundamental
>> reason it can't be improved, but it will need improvements.

> Why do we need that? If we're not executing a statement, we're
> probably trying to read() from the socket, and we'll notice if that
> returns 0 or -1. So it seems like we only need periodic checks while
> there's a statement  in progress.

Maybe you could build it that way, but I'm not sure it's a better way.

(1) You'll need to build a concept of a timeout that's not a statement
timeout, but nonetheless should be canceled exactly when the statement
timeout is (not before or after, unless you'd like to incur additional
setitimer() calls).   That's going to involve either timeout.c surgery
or fragile requirements on the callers.

(2) It only wins if a statement timeout is active, otherwise it makes
things worse, because then you need setitimer() at statement start
and end just to enable/disable the socket check timeout.  Whereas
if you just let a once-a-minute timeout continue to run, you don't
incur those kernel calls.

It's possible that we should run this timeout differently depending
on whether or not a statement timeout is active, though I'd prefer to
avoid such complexity if possible.  On the whole, if we have to
optimize just one of those cases, it should be the no-statement-timeout
case; with that timeout active, you're paying two setitimers per
statement anyway.

Anyway, the core problem with the originally-submitted patch was that
it was totally ignorant that timeout.c had restrictions it was breaking.
You can either fix the restrictions, or you can try to design around them,
but you've got to be aware of what that code can and can't do today.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: Should we add xid_current() or a int8->xid cast?
Next
From: Tom Lane
Date:
Subject: Re: adding partitioned tables to publications