Re: SIGQUIT handling, redux - Mailing list pgsql-hackers

From Tom Lane
Subject Re: SIGQUIT handling, redux
Msg-id 112673.1599683437@sss.pgh.pa.us
In response to Re: SIGQUIT handling, redux  (Andres Freund <andres@anarazel.de>)
Responses Re: SIGQUIT handling, redux  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2020-09-09 16:09:00 -0400, Tom Lane wrote:
>> We could call it startup_packet_die or something?

> Yea, I think that'd be good.

I'll make it so.

>> We see backends going through this code on a very regular basis in the
>> buildfarm, but complete hangs are rare as can be.  I think you
>> overestimate the severity of the problem.

> I don't think the BF exercises the problematic paths to a significant
> degree. It's mostly local socket connections, and where not it's
> localhost. There's no slow DNS, no more complicated authentication
> methods, no packet loss. How often do we ever actually end up even
> getting close to any of the paths but immediate shutdowns?

Since we're talking about quickdie(), immediate shutdown/crash restart
is exactly the case of concern, and the buildfarm exercises it all the
time.

> And in the
> SIGQUIT path, how often do we end up in the SIGKILL path, masking
> potential deadlocks?

True, we can't really tell that.  I wonder if we should make the
postmaster emit a log message when it times out and goes to SIGKILL.
After a few months we could scrape the buildfarm logs and get a
pretty good handle on it.
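
To make the idea concrete, here is a minimal sketch of "log when the
SIGQUIT grace period expires and we fall back to SIGKILL".  It is a
standalone Python illustration for POSIX systems, not postmaster code:
the function name stop_child, the grace_seconds parameter, its default
value, and the message wording are all assumptions made up for this
example.

```python
import logging
import signal
import subprocess

log = logging.getLogger("supervisor_sketch")


def stop_child(proc, grace_seconds=5.0):
    """Ask a child to quit with SIGQUIT; escalate to SIGKILL on timeout.

    Returns True if SIGKILL was needed, False if the child exited on
    its own within the grace period.  (Hypothetical helper; the name
    and the default timeout are illustrative only.)
    """
    proc.send_signal(signal.SIGQUIT)
    try:
        proc.wait(timeout=grace_seconds)
        return False
    except subprocess.TimeoutExpired:
        # This is the event worth recording: without a log line, the
        # SIGKILL silently hides whatever kept the child from exiting.
        log.warning(
            "child %d did not exit within %.1f s of SIGQUIT; sending SIGKILL",
            proc.pid,
            grace_seconds,
        )
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return True
```

Counting how often that warning shows up in collected logs would then
give a rough measure of how frequently the SIGKILL fallback is actually
reached.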

            regards, tom lane


