Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
Date
Msg-id Zejw-Qasf7b5Lxuq@paquier.xyz
Whole thread Raw
In response to Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build
List pgsql-bugs
On Wed, Mar 06, 2024 at 12:28:11PM -0500, Tom Lane wrote:
> As far as HEAD goes, I think we should revert this and then look
> to get [1] committed.
>
> I'm unsure what to do in the back branches,
> but given the lack of prior complaints I suspect we could get away
> with just reverting and leaving the issue unfixed.

Still the issue fixed here and the issue of the other thread are
different?  Relying on the relcache tricks the planner to spawn
workers for nothing because they would just fail if the internals of
the function are unsafe to run in the workers, and the function is
marked as such.  In some cases users like to use properties as hints
to give to the planner about what to use or not.  I'd like to believe
(and perhaps that's a cute thought) that functions are marked as
parallel-unsafe for a purpose, which is that they cannot run in
workers as they rely on a shared state.  Whether the internals of the
functions are implemented correctly is up to the user.  I mean, it's
perfectly possible to mark a function as immutable while its internals
are volatile, and I've seen people rely on that to push down clauses.

> (I wouldn't propose back-patching [1], it seems too risky.)

Agreed.  Just looked at it and the change of blockState can be
sometimes tricky to get right.  There are a bunch of internal
assumptions as far as I recall this area of the code.

> If we want to do something about the more generic issue that
> maybe const-folding will succeed in the leader process but
> fail in workers, then I think the way to do that is to adjust
> parallel index build so that the processed expression trees are
> passed to the workers from the leader. That is how it's done
> with regular query trees, and index build is evidently taking
> a not-very-safe shortcut by not doing it like that.  But I'm not
> excited enough about the mostly-theoretical hazard to put any
> work into that myself.

You mean in the serialization and deserialization logic used when
workers are spawned and then let the workers complain when they try to
use such a function in a new path?  That would be backpatchable, but
feels rather a bit invasive.

> Another line of thought is whether we should just forbid any
> parallel-unsafe functions in index expressions.  If we continue
> to allow them, I think we have to reject parallelizing any
> query or DDL operation that touches the associated table at all,
> because there's too much risk of a worker trying to evaluate
> such an expression somewhere along the line.  Again though,
> given the lack of complaints I'm not excited about imposing
> such a draconian policy.  (The fact that functions are marked
> parallel unsafe by default means that if we did, we'd almost
> certainly break cases that are working fine for users today.)

Neither am I to restrict that moving forward.  That could hurt
existing use cases that were allowed, still were able to work.

Saying that, seeing the lack of complaints and the fact that you're
not comfortable with that, I have no objections to revert the change
on all the branches.  Will do so in a bit.
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
Next
From: Tom Lane
Date:
Subject: Re: BUG #18314: PARALLEL UNSAFE function does not prevent parallel index build