Re: pgsql: Don't enter parallel mode when holding interrupts. - Mailing list pgsql-hackers

From Noah Misch
Subject Re: pgsql: Don't enter parallel mode when holding interrupts.
Date
Msg-id 20240920183931.f0.nmisch@google.com
In response to Re: pgsql: Don't enter parallel mode when holding interrupts.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Sep 19, 2024 at 09:25:05AM -0400, Robert Haas wrote:
> On Wed, Sep 18, 2024 at 3:27 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Wed, 2024-09-18 at 02:58 +0000, Noah Misch wrote:
> > > Don't enter parallel mode when holding interrupts.
> > >
> > > Doing so caused the leader to hang in wait_event=ParallelFinish, which
> > > required an immediate shutdown to resolve.  Back-patch to v12 (all
> > > supported versions).
> > >
> > > Francesco Degrassi
> > >
> > > Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
> >
> > Does that warrant mention on this page?
> > https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html
> 
> IMHO, no. This seems too low-level and too odd to mention.

Agreed.  If I were documenting it, I would document it with the material for
writing opclasses.  It's probably too esoteric to document even there.

> TBH, I'm kind of surprised to learn that it's possible to start
> executing a query while holding an LWLock. I see Tom is expressing
> some doubts on the original thread, too. I wonder if we should instead
> be erroring out if an LWLock is held at the start of query execution
> -- or even earlier, like when we try to call a plpgsql function while
> holding one. Leaving parallel query aside, what would prevent us from
> attempting to reacquire the exact same LWLock that we already hold and
> self-deadlocking? Or attempting to acquire some other LWLock and
> deadlocking that way? I don't really feel like this is a parallel
> query problem. I don't think we should be trying to run any
> user-defined code while holding an LWLock, unless that code is written
> in C (or C++, Rust, etc.). Trying to run procedural code at that point
> doesn't seem reasonable.

Nothing prevents those LWLock deadlocks.  If you think it's worth breaking the
things folks use today (see original thread) in order to prevent that, please
do share that on the original thread.  I'm fine either way.  I think given
infinite resources across both postgresql.org and all extension maintainers, I
would do what you're thinking in v18, while in back branches I would change
"erroring out" to "warn when assertions are enabled".  I also think it's a
low-priority bug, given the only known ways to reach it are C code or a custom
opclass.  Since resources aren't infinite, I'm inclined toward one of (a) stop
here or (b) all branches "warn when assertions are enabled" and maybe block
the plancache route discussed on the original thread.
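
To make the two policies above concrete, here is a toy sketch in Python.  It
is not PostgreSQL code: the Backend class, its methods, and the policy names
"error" and "warn" are all hypothetical, chosen only to illustrate "erroring
out" versus "warn when assertions are enabled".  A real LWLock reacquisition
would block forever; the toy raises instead so that the example terminates.

```python
# Toy model only: every name here is hypothetical, not a PostgreSQL API.
import warnings


class HeldLockError(RuntimeError):
    """Raised where a real backend would deadlock or refuse to proceed."""


class Backend:
    def __init__(self, assertions_enabled=False):
        self.held_locks = []                  # names of locks currently held
        self.assertions_enabled = assertions_enabled

    def acquire(self, name):
        # The lock is modelled as non-reentrant: taking one we already hold
        # can never succeed.  A real backend would hang here; we raise.
        if name in self.held_locks:
            raise HeldLockError(f"self-deadlock: {name!r} is already held")
        self.held_locks.append(name)

    def release(self, name):
        self.held_locks.remove(name)

    def start_query(self, policy):
        """Return True if query execution may proceed under `policy`."""
        if not self.held_locks:
            return True
        if policy == "error":
            # Stricter option: refuse to start executing at all.
            raise HeldLockError(f"query started while holding {self.held_locks}")
        if policy == "warn" and self.assertions_enabled:
            # Softer option: complain only in assertion-enabled builds.
            warnings.warn(f"query started while holding {self.held_locks}")
        return True
```

Under "warn", a backend without assertions behaves exactly as it does today,
which is why that option does not break existing callers in this model.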
