Home > mailing lists

Re: Parallel query vs smart shutdown and Postmaster death - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: Parallel query vs smart shutdown and Postmaster death
Date	March 18, 2019 06:55:02
Msg-id	CA+hUKG+7m3hvhf8LBRftnpX6FmYa282XtgFhjw=aLd1RBxi-Bg@mail.gmail.com Whole thread Raw
In response to	Re: Parallel query vs smart shutdown and Postmaster death (Arseny Sher <a.sher@postgrespro.ru>)
Responses	Re: Parallel query vs smart shutdown and Postmaster death
List	pgsql-hackers

Tree view

On Sun, Mar 17, 2019 at 5:53 PM Arseny Sher <a.sher@postgrespro.ru> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>
> > 1.  Why does pmdie()'s SIGTERM case terminate parallel workers
> > immediately?  That breaks aborts running parallel queries, so they
> > don't get to end their work normally.
> > 2.  Why are new parallel workers not allowed to be started while in
> > this state?  That hangs future parallel queries forever, so they don't
> > get to end their work normally.
> > 3.  Suppose we fix the above cases; should we do it for parallel
> > workers only (somehow), or for all bgworkers?  It's hard to say since
> > I don't know what all bgworkers do.
>
> Attached patch fixes 1 and 2. As for the 3, the only other internal
> bgworkers currently are logical replication launcher and workers, which
> arguably should be killed immediately.

Hi Arseny,

Thanks for working on this!  Yes, it seems better to move forwards
rather than backwards, and fix this properly as you have done in this
patch.

Just a thought: instead of the new hand-coded loop you added in
pmdie(), do you think it would make sense to have a new argument
"exclude_class_mask" for SignalSomeChildren()?  If we did that, I
would consider renaming the existing parameter "target" to
"include_type_mask" to make it super clear what's going on.

> > Here's an initial sketch of a
> > patch like that: it gives you "ERROR:  parallel worker failed to
> > initialize" and "HINT:  More details may be available in the server
> > log." if you try to run a parallel query.
>
> Reporting bgworkers that postmaster is never going to start is probably
> worthwhile doing anyway to prevent potential further deadlocks like
> this. However, I think there is a problem in your patch: we might be in
> post PM_RUN states due to FatalError, not because of shutdown. In this
> case, we shouldn't refuse to run bgws in the future. I would also merge
> the check into bgworker_should_start_now.

Hmm, yeah, I haven't tested or double-checked in detail yet but I
think you're probably right about all of that.

-- 
Thomas Munro
https://enterprisedb.com

pgsql-hackers by date:

From: "Iwata, Aya"
Date: 18 March 2019, 06:03:04
Subject: RE: [PATCH] get rid of StdRdOptions, use individual binaryreloptions representation for each relation kind instead

From: Michael Paquier
Date: 18 March 2019, 07:02:35
Subject: Re: Offline enabling/disabling of data checksums

Re: Parallel query vs smart shutdown and Postmaster death - Mailing list pgsql-hackers

Previous

Next