Re: SIGQUIT handling, redux - Mailing list pgsql-hackers

From Robert Haas
Subject Re: SIGQUIT handling, redux
Msg-id CA+TgmoZc6QQoFeWJUj0c5e7bqhMaq=LnwTSFtUin-euz-u1HZw@mail.gmail.com
In response to Re: SIGQUIT handling, redux  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SIGQUIT handling, redux  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Sep 9, 2020 at 10:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> bgworker_die (SIGTERM)
>
> Calls ereport(FATAL).  This is surely not any safer than, say,
> quickdie().  No, it's worse, because at least that won't try
> to go out via proc_exit().

I think bgworker_die() is pretty much a terrible idea. Every
background worker I've written has actually needed to use
CHECK_FOR_INTERRUPTS(). I think that the only way this could actually
be safe is if you have a background worker that never uses ereport()
itself, so that the ereport() in the signal handler can't be
interrupting one that's already happening. This seems unlikely to be
the normal case, or anything close to it. Most background workers
probably are shared-memory connected and use a lot of PostgreSQL
infrastructure and thus ereport() all over the place.
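
To make the contrast concrete, here is a minimal standalone sketch of the
flag-and-check pattern that CHECK_FOR_INTERRUPTS() is built around. It is
plain POSIX C, not PostgreSQL code, and it is not a proposed fix for
bgworker_die(). Every name in it (shutdown_requested, handle_sigterm,
install_sigterm_handler, worker_loop) is made up for illustration. The
handler only sets a flag, so it cannot interrupt logging or error reporting
that is already in progress; the real work happens at a point the main loop
chooses.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() under strict -std= modes */
#endif

#include <assert.h>
#include <signal.h>
#include <string.h>

/* All names below are hypothetical; this is not PostgreSQL source. */

/* Set by the signal handler, read only at safe points in the main loop. */
static volatile sig_atomic_t shutdown_requested = 0;

/*
 * Signal handler: does nothing except set a flag.  No logging, no
 * allocation, no locks, so it is safe no matter what it interrupts.
 */
void
handle_sigterm(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Install the SIGTERM handler.  Returns 0 on success, -1 on failure. */
int
install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGTERM, &sa, NULL);
}

/*
 * Main loop: performs up to max_units units of work and returns how many
 * it completed.  The flag is checked only between units, which plays the
 * role of a CHECK_FOR_INTERRUPTS()-style safe point.
 */
int
worker_loop(int max_units)
{
    int done = 0;

    assert(max_units >= 0);
    while (done < max_units)
    {
        if (shutdown_requested)
            break;              /* exit from a known-safe state */
        done++;                 /* one unit of work; logging is safe here */
    }
    return done;
}
```

A caller would run install_sigterm_handler() once at startup and then
worker_loop(n). A SIGTERM that arrives in the middle of a unit of work is
noticed only at the next check, which is exactly the property that an
ereport(FATAL) issued directly from the handler lacks.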

Now what to do about it I don't know exactly, but it would be nice to
do something.

> StandbyDeadLockHandler (from SIGALRM)
> StandbyTimeoutHandler (ditto)
>
> Calls CancelDBBackends, which just for starters tries to acquire
> an LWLock.  I think the only reason we've gotten away with this
> for this long is the high probability that by the time either
> timeout fires, we're going to be blocked on a semaphore.

Yeah, I'm not sure these are so bad. In fact, in the deadlock case, I
believe the old coding was designed to make sure we *had to* be
blocked on a semaphore, but I'm not sure whether that's still true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


