Andres Freund <andres@anarazel.de> writes:
> On 2020-09-09 16:09:00 -0400, Tom Lane wrote:
>> We could call it startup_packet_die or something?
> Yea, I think that'd be good.
I'll make it so.
>> We see backends going through this code on a very regular basis in the
>> buildfarm, but complete hangs are rare as can be. I think you
>> overestimate the severity of the problem.
> I don't think the BF exercises the problematic paths to a significant
> degree. It's mostly local socket connections, and where not it's
> localhost. There's no slow DNS, no more complicated authentication
> methods, no packet loss. How often do we ever actually end up even
> getting close to any of the paths but immediate shutdowns?
Since we're talking about quickdie(), immediate shutdown/crash restart
is exactly the case of concern, and the buildfarm exercises it all the
time.
> And in the
> SIGQUIT path, how often do we end up in the SIGKILL path, masking
> potential deadlocks?
True, we can't really tell that. I wonder if we should make the
postmaster emit a log message when it times out and goes to SIGKILL.
After a few months we could scrape the buildfarm logs and get a
pretty good handle on it.
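
For concreteness, here is a rough standalone sketch of that kind of
escalation logging. It is not postmaster code: the function name
quit_then_kill, the polling interval, and the message wording are all
made up for illustration, and the real thing would presumably go through
ereport() rather than fprintf().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): send SIGQUIT to a child
 * process, wait up to timeout_secs for it to exit, and if it has not,
 * write a log line and escalate to SIGKILL.
 *
 * Returns 0 if the child exited without needing SIGKILL,
 *         1 if SIGKILL had to be sent,
 *        -1 on error.
 */
static int
quit_then_kill(pid_t child, int timeout_secs, FILE *log)
{
	struct timespec poll_interval = {0, 10 * 1000 * 1000};	/* 10 ms */
	time_t		deadline;
	int			status;
	pid_t		r;

	assert(timeout_secs >= 0);

	if (kill(child, SIGQUIT) != 0)
		return -1;

	/* Poll for exit until the deadline passes. */
	deadline = time(NULL) + timeout_secs;
	for (;;)
	{
		r = waitpid(child, &status, WNOHANG);
		if (r == child)
			return 0;			/* exited after SIGQUIT alone */
		if (r < 0 && errno != EINTR)
			return -1;
		if (time(NULL) >= deadline)
			break;
		nanosleep(&poll_interval, NULL);
	}

	/* This is the event we would want to be able to find in the logs. */
	fprintf(log,
			"LOG: child %ld did not exit within %d s of SIGQUIT; sending SIGKILL\n",
			(long) child, timeout_secs);

	if (kill(child, SIGKILL) != 0)
		return -1;

	/* Reap the child so it does not linger as a zombie. */
	do
	{
		r = waitpid(child, &status, 0);
	} while (r < 0 && errno == EINTR);

	return (r == child) ? 1 : -1;
}
```

The point is only that the timeout branch leaves a greppable line behind;
counting those lines across buildfarm logs would tell us how often the
SIGKILL fallback is actually reached.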
regards, tom lane