Re: [bug?] Missed parallel safety checks, and wrong parallel safety - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Msg-id CAA4eK1LnZMZOM1uUFzH-RXdvkZLgTAuzwcdcasy7VfmD6=1bwQ@mail.gmail.com
In response to Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Dilip Kumar <dilipbalaut@gmail.com>)
RE: [bug?] Missed parallel safety checks, and wrong parallel safety  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Mon, Jul 26, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Jul 24, 2021 at 5:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > I think for the consistency argument how about allowing users to
> > specify a parallel-safety option for both partitioned and
> > non-partitioned relations but for non-partitioned relations if users
> > didn't specify, it would be computed automatically? If the user has
> > specified parallel-safety option for non-partitioned relation then we
> > would consider that instead of computing the value by ourselves.
>
> Having the option for both partitioned and non-partitioned tables
> doesn't seem like the worst idea ever, but I am also not entirely sure
> that I understand the point.
>

Consider the following ways to let the user specify the parallel-safety option:

(a)
CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ...
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ...

OR

(b)
CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true)
ALTER TABLE table_name SET (parallel_dml_enabled = true)

The question is what we should do if the user specifies the option for
a non-partitioned table. Do we just ignore it, or raise an error saying
that this is not a valid syntax/option for non-partitioned tables? I
find it slightly odd that this option would work for partitioned tables
but give an error for non-partitioned tables, though maybe we can
document that.

With the above syntax, even if the user doesn't specify the
parallelism option for a non-partitioned relation, we will determine it
automatically. In some situations, though, users might want to force
parallelism even when we wouldn't have chosen it automatically. Such a
user might hit an error due to some parallel-unsafe function, but OTOH
she might have ensured that it is safe to choose parallelism in her
particular case.
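
To make the proposed precedence concrete, here is a rough sketch in
Python (not PostgreSQL code). All names are hypothetical. It assumes
that the automatic computation picks the most restrictive marking among
the functions attached to the relation, and that a partitioned table
without an explicit setting defaults to UNSAFE; neither assumption is
settled in this thread.

```python
from enum import Enum
from typing import Iterable, Optional


class ParallelSafety(Enum):
    # Ordered from most to least permissive, mirroring the
    # SAFE / RESTRICTED / UNSAFE keywords of option (a).
    SAFE = 0
    RESTRICTED = 1
    UNSAFE = 2


def compute_safety(function_safeties: Iterable[ParallelSafety]) -> ParallelSafety:
    """Assumed automatic computation: the most restrictive marking wins."""
    worst = ParallelSafety.SAFE
    for safety in function_safeties:
        if safety.value > worst.value:
            worst = safety
    return worst


def effective_safety(
    user_setting: Optional[ParallelSafety],
    is_partitioned: bool,
    function_safeties: Iterable[ParallelSafety],
) -> ParallelSafety:
    """Resolve the parallel-safety used for DML on one relation."""
    if user_setting is not None:
        # An explicit user choice is honoured for both kinds of relation.
        return user_setting
    if is_partitioned:
        # Assumption: nothing is computed for partitioned tables, so
        # fall back to the conservative value.
        return ParallelSafety.UNSAFE
    # Non-partitioned relation without a setting: compute automatically.
    return compute_safety(function_safeties)
```

Under this sketch, a user who sets SAFE on a relation that uses an
unsafe function gets SAFE back, which is exactly the "force
parallelism" case above and carries the risk of a runtime error.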

> > Another reason for hesitation to do automatically for non-partitioned
> > relations was the new invalidation which will invalidate the cached
> > parallel-safety for all relations in relcache for a particular
> > database. As mentioned by Hou-San [1], it seems we need to do this
> > whenever any function's parallel-safety is changed. OTOH, changing
> > parallel-safety for a function is probably not that often to matter in
> > practice which is why I think you seem to be fine with this idea.
>
> Right. I think it should be quite rare, and invalidation events are
> also not crazy expensive. We can test what the worst case is, but if
> you have to sit there and run ALTER FUNCTION in a tight loop to see a
> measurable performance impact, it's not a real problem. There may be a
> code complexity argument against trying to figure it out
> automatically, perhaps, but I don't think there's a big performance
> issue.
>

True, there could be some code complexity, but I think we can assess
that once the patch is ready for review.
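
For illustration only, the invalidation discussed in the quoted text
could be modelled roughly as below, in Python rather than as relcache
code. The class and method names are hypothetical, and the sketch
assumes that a change to any function's parallel-safety discards the
cached value of every relation in the database, with each value
recomputed lazily on next use.

```python
from typing import Callable, Dict


class ParallelSafetyCache:
    """Toy per-database cache of each relation's parallel-safety."""

    def __init__(self, compute: Callable[[str], str]) -> None:
        self._compute = compute            # hypothetical safety computation
        self._cache: Dict[str, str] = {}
        self.recomputations = 0            # counts cache misses

    def get(self, relname: str) -> str:
        # Compute on first use, then serve the cached value.
        if relname not in self._cache:
            self._cache[relname] = self._compute(relname)
            self.recomputations += 1
        return self._cache[relname]

    def on_function_safety_changed(self) -> None:
        # We do not track which relations depend on which function in
        # this sketch, so every cached entry is discarded.
        self._cache.clear()
```

The cost of one such invalidation is a single recomputation per
relation on its next use, which is why it should only matter if
function markings are changed very frequently.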

> What bothers me is that if this is something people have to set
> manually then many people won't and will not get the benefit of the
> feature. And some of them will also set it incorrectly and have
> problems. So I am in favor of trying to determine it automatically
> where possible, to make it easy for people. However, other people may
> feel differently, and I'm not trying to say they're necessarily wrong.
> I'm just telling you what I think.
>

Thanks for all your suggestions and feedback.

-- 
With Regards,
Amit Kapila.


