Thread: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
Hello,

I think we've found a few existing problems with handling the parallel safety of functions while doing an experiment. Could I hear your opinions on what we should do? I'd be willing to create and submit a patch to fix them.

The experiment is to add a parallel safety check in FunctionCallInvoke() and run the regression test with force_parallel_mode=regress. The added check errors out with ereport(ERROR) when the about-to-be-called function is parallel unsafe and the process is currently in parallel mode. 6 test cases failed because the following parallel-unsafe functions were called:

dsnowball_init
balkifnull
int44out
text_w_default_out
widget_out

The first function is created in src/backend/snowball/snowball_create.sql for full text search. The remaining functions are created during the regression test run.

The relevant issues follow.

(1)
All the above functions are actually parallel safe looking at their implementations. It seems that their CREATE FUNCTION statements are just missing PARALLEL SAFE specifications, so I think I'll add them. dsnowball_lexize() may also be parallel safe.

(2)
I'm afraid the above phenomenon reveals that postgres overlooks parallel safety checks in some places. Specifically, we noticed the following:

* User-defined aggregate

CREATE AGGREGATE allows to specify parallel safety of the aggregate itself and the planner checks it, but the support function of the aggregate is not checked. OTOH, the document clearly says:

https://www.postgresql.org/docs/devel/xaggr.html

"Worth noting also is that for an aggregate to be executed in parallel, the aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on its support functions are not consulted."

https://www.postgresql.org/docs/devel/sql-createaggregate.html

"An aggregate will not be considered for parallelization if it is marked PARALLEL UNSAFE (which is the default!) or PARALLEL RESTRICTED. Note that the parallel-safety markings of the aggregate's support functions are not consulted by the planner, only the marking of the aggregate itself."

Can we check the parallel safety of aggregate support functions during statement execution and error out? Is there any reason not to do so?

* User-defined data type

The input, output, send, receive, and other functions of a UDT are not checked for parallel safety. Is there any good reason to not check them other than the concern about performance?

* Functions for full text search

Should CREATE TEXT SEARCH TEMPLATE ensure that the functions are parallel safe? (Those functions could be changed to parallel unsafe later with ALTER FUNCTION, though.)

(3) Built-in UDFs are not checked for parallel safety

The functions defined in fmgr_builtins[], which are derived from pg_proc.dat, are not checked. Most of them are marked parallel safe, but some are parallel unsafe or restricted.

Besides, changing their parallel safety with ALTER FUNCTION PARALLEL does not affect the selection of a query plan. This is because fmgr_builtins[] does not have a member for parallel safety.

Should we add a member for parallel safety in fmgr_builtins[], and disallow ALTER FUNCTION to change the parallel safety of built-in UDFs?

Regards
Takayuki Tsunakawa
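
(For illustration, the experimental check described above amounts to something like the sketch below. This is a minimal sketch only: fn_parallel is a hypothetical FmgrInfo field assumed to cache pg_proc.proparallel, and the error wording is invented.)

#include "postgres.h"
#include "access/xact.h"
#include "catalog/pg_proc.h"
#include "fmgr.h"
#include "utils/lsyscache.h"

/*
 * Sketch of the experimental check: error out when a parallel-unsafe
 * function is about to be invoked while in parallel mode.  FmgrInfo
 * has no fn_parallel field today; it is assumed here.
 */
static inline void
check_parallel_safety_before_call(FmgrInfo *flinfo)
{
    if (IsInParallelMode() &&
        flinfo->fn_parallel == PROPARALLEL_UNSAFE)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                 errmsg("parallel-unsafe function \"%s\" called in parallel mode",
                        get_func_name(flinfo->fn_oid))));
}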
On Tue, Apr 20, 2021 at 2:23 PM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:
> (2)
> I'm afraid the above phenomenon reveals that postgres overlooks parallel safety checks in some places. Specifically, we noticed the following:
>
> * User-defined aggregate
> CREATE AGGREGATE allows to specify parallel safety of the aggregate itself and the planner checks it, but the support function of the aggregate is not checked. OTOH, the document clearly says:
>
> https://www.postgresql.org/docs/devel/xaggr.html
>
> "Worth noting also is that for an aggregate to be executed in parallel, the aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on its support functions are not consulted."
>
> https://www.postgresql.org/docs/devel/sql-createaggregate.html
>
> "An aggregate will not be considered for parallelization if it is marked PARALLEL UNSAFE (which is the default!) or PARALLEL RESTRICTED. Note that the parallel-safety markings of the aggregate's support functions are not consulted by the planner, only the marking of the aggregate itself."

IMO, the reason for not checking the parallel safety of the support functions is that the functions themselves can call a whole lot of other functions (which can be nested as well), which might be quite hard to check at planning time. That is why the job of marking an aggregate as parallel safe is best left to the user. They have to mark the aggregate parallel unsafe if at least one support function is parallel unsafe, otherwise parallel safe.

> Can we check the parallel safety of aggregate support functions during statement execution and error out? Is there any reason not to do so?

And if we were to do the above, within the function execution API, we need to know where the function got called from(?). It is best left to the user to decide whether a function/aggregate is parallel safe or not. This is the main reason we have declarative constructs like parallel safe/unsafe/restricted. For core functions, we definitely should properly mark parallel safe/restricted/unsafe tags wherever possible.

Please correct me if I missed something.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> On Tue, Apr 20, 2021 at 2:23 PM tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
>> https://www.postgresql.org/docs/devel/xaggr.html
>>
>> "Worth noting also is that for an aggregate to be executed in parallel, the aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on its support functions are not consulted."

> IMO, the reason for not checking the parallel safety of the support
> functions is that the functions themselves can call a whole lot of other
> functions (which can be nested as well), which might be quite hard to check
> at planning time. That is why the job of marking an aggregate as
> parallel safe is best left to the user.

Yes. I think the documentation is perfectly clear that this is intentional; I don't see a need to change it.

>> Should we add a member for parallel safety in fmgr_builtins[], and disallow ALTER FUNCTION to change the parallel safety of builtin UDFs?

No. You'd have to be superuser anyway to do that, and we're not in the habit of trying to put training wheels on superusers.

Don't have an opinion about the other points yet.

        regards, tom lane

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Tom Lane <tgl@sss.pgh.pa.us>
> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> > IMO, the reason for not checking the parallel safety of the support
> > functions is that the functions themselves can call a whole lot of other
> > functions (which can be nested as well), which might be quite hard to check
> > at planning time. That is why the job of marking an aggregate as
> > parallel safe is best left to the user.
>
> Yes. I think the documentation is perfectly clear that this is
> intentional; I don't see a need to change it.

OK, that's what I expected. I understood from this that Postgres's stance toward parallel safety is that Postgres does its best effort to check parallel safety (as far as it doesn't hurt performance much, and perhaps the core code doesn't get very complex), and the user should be responsible for the actual parallel safety of ancillary objects (in this case, support functions for an aggregate) of the target object that he/she marked as parallel safe.

> >> Should we add a member for parallel safety in fmgr_builtins[], and disallow
> ALTER FUNCTION to change the parallel safety of builtin UDFs?
>
> No. You'd have to be superuser anyway to do that, and we're not in the
> habit of trying to put training wheels on superusers.

Understood. However, we may add the parallel safety member in fmgr_builtins[] in another thread for parallel INSERT SELECT. I'd appreciate your comment on this if you see any concern.

> Don't have an opinion about the other points yet.

I'd like to have your comments on them, too. But I understand you must be so busy at least until the beta release of PG14.

Regards
Takayuki Tsunakawa

"tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> From: Tom Lane <tgl@sss.pgh.pa.us>
>> No. You'd have to be superuser anyway to do that, and we're not in the
>> habit of trying to put training wheels on superusers.

> Understood. However, we may add the parallel safety member in fmgr_builtins[] in another thread for parallel INSERT SELECT. I'd appreciate your comment on this if you see any concern.

[ raised eyebrow... ] I find it very hard to understand why that would be necessary, or even a good idea. Not least because there's no spare room there; you'd have to incur a substantial enlargement of the array to add another flag. But also, that would indeed lock down the value of the parallel-safety flag, and that seems like a fairly bad idea.

        regards, tom lane

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Tom Lane <tgl@sss.pgh.pa.us>
> [ raised eyebrow... ] I find it very hard to understand why that would
> be necessary, or even a good idea. Not least because there's no spare
> room there; you'd have to incur a substantial enlargement of the
> array to add another flag. But also, that would indeed lock down
> the value of the parallel-safety flag, and that seems like a fairly
> bad idea.

You're right, FmgrBuiltin is already fully packed (24 bytes on 64-bit machines). Enlarging the frequently accessed fmgr_builtins array may have an unexpectedly large adverse effect on performance.

I wanted to check the parallel safety of functions, which various objects (data type, index, trigger, etc.) come down to, in FunctionCallInvoke() and a few other places. But maybe we skip the check for built-in functions. That's a matter of where we draw a line between where we check and where we don't.

Regards
Takayuki Tsunakawa
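
(For reference, the struct under discussion looks like this around the time of this thread, paraphrased from src/include/utils/fmgrtab.h; the 4+2+1+1 bytes of scalar fields plus two 8-byte pointers account for the 24 bytes.)

typedef struct
{
    Oid         foid;           /* OID of the function */
    short       nargs;          /* 0..FUNC_MAX_ARGS, or -1 if variable count */
    bool        strict;         /* T if function is "strict" */
    bool        retset;         /* T if function returns a set */
    const char *funcName;       /* C name of the function */
    PGFunction  func;           /* pointer to compiled function */
} FmgrBuiltin;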

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "houzj.fnst@fujitsu.com"
> I think we've found a few existing problems with handling the parallel safety of
> functions while doing an experiment. Could I hear your opinions on what we
> should do? I'd be willing to create and submit a patch to fix them.
>
> The experiment is to add a parallel safety check in FunctionCallInvoke() and run
> the regression test with force_parallel_mode=regress. The added check
> errors out with ereport(ERROR) when the about-to-be-called function is
> parallel unsafe and the process is currently in parallel mode. 6 test cases failed
> because the following parallel-unsafe functions were called:
>
> dsnowball_init
> balkifnull
> int44out
> text_w_default_out
> widget_out
>
> The first function is created in src/backend/snowball/snowball_create.sql for
> full text search. The remaining functions are created during the regression
> test run.
>
> (1)
> All the above functions are actually parallel safe looking at their
> implementations. It seems that their CREATE FUNCTION statements are just
> missing PARALLEL SAFE specifications, so I think I'll add them.
> dsnowball_lexize() may also be parallel safe.

I agree that it's better to mark the functions with the correct parallel safety label, especially the above functions, which will be executed in parallel mode. It will be friendly to developers and users working on something related to parallel testing.

So, I attached the patch to mark the above functions parallel safe.

Best regards,
houzj
On Wed, Apr 21, 2021 at 8:12 AM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:
>
> From: Tom Lane <tgl@sss.pgh.pa.us>
> > [ raised eyebrow... ] I find it very hard to understand why that would
> > be necessary, or even a good idea. Not least because there's no spare
> > room there; you'd have to incur a substantial enlargement of the
> > array to add another flag. But also, that would indeed lock down
> > the value of the parallel-safety flag, and that seems like a fairly
> > bad idea.
>
> You're right, FmgrBuiltin is already fully packed (24 bytes on 64-bit machines). Enlarging the frequently accessed fmgr_builtins array may have an unexpectedly large adverse effect on performance.
>
> I wanted to check the parallel safety of functions, which various objects (data type, index, trigger, etc.) come down to, in FunctionCallInvoke() and a few other places. But maybe we skip the check for built-in functions. That's a matter of where we draw a line between where we check and where we don't.
>

IIUC, the idea here is to check for parallel safety of functions at some place in the code during function invocation so that if we execute any parallel unsafe/restricted function via a parallel worker then we error out. If so, isn't it possible to deal with built-in and non-built-in functions in the same way? I think we want to have some safety checks for functions as we have for transaction id in AssignTransactionId(), command id in CommandCounterIncrement(), for write operations in heap_prepare_insert(), etc. Is that correct?

--
With Regards,
Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Wed, Apr 21, 2021 at 8:12 AM tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
>> From: Tom Lane <tgl@sss.pgh.pa.us>
>>> [ raised eyebrow... ] I find it very hard to understand why that would
>>> be necessary, or even a good idea.

> IIUC, the idea here is to check for parallel safety of functions at
> some place in the code during function invocation so that if we execute
> any parallel unsafe/restricted function via a parallel worker then we
> error out. If so, isn't it possible to deal with built-in and
> non-built-in functions in the same way?

Yeah, one of the reasons I doubt this is a great idea is that you'd still have to fetch the pg_proc row for non-built-in functions.

The obvious place to install such a check is fmgr_info(), which is fetching said row anyway for other purposes, so it's really hard to see how adding anything to FmgrBuiltin is going to help.

        regards, tom lane

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Tom Lane <tgl@sss.pgh.pa.us>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > IIUC, the idea here is to check for parallel safety of functions at
> > some place in the code during function invocation so that if we execute
> > any parallel unsafe/restricted function via a parallel worker then we
> > error out. If so, isn't it possible to deal with built-in and
> > non-built-in functions in the same way?
>
> Yeah, one of the reasons I doubt this is a great idea is that you'd
> still have to fetch the pg_proc row for non-built-in functions.
>
> The obvious place to install such a check is fmgr_info(), which is
> fetching said row anyway for other purposes, so it's really hard to
> see how adding anything to FmgrBuiltin is going to help.

Thank you, fmgr_info() looks like the best place to do the parallel safety check. Having a quick look at its callers, I didn't find any concerning place (of course, we can't be relieved until the regression test succeeds.) Also, with fmgr_info(), we don't have to find other places to add the check to deal with function calls in execExpr.c and execExprInterp.c. This is beautiful.

But the current fmgr_info() does not check the parallel safety of built-in functions. It does not have the information to do that. There are two options. Which do you think is better? I think 2.

1) fmgr_info() reads pg_proc like for non-built-in functions

This ruins the effort for the fast path for built-in functions. I can't imagine how large the adverse impact on performance would be, but I'm worried.

The benefit is that ALTER FUNCTION on built-in functions takes effect. But such operations are nonsensical, so I don't think we want to gain such a benefit.

2) Gen_fmgrtab.pl adds a member for proparallel in FmgrBuiltin

But we don't want to enlarge the FmgrBuiltin struct. So, change the existing bool members strict and retset into one member of type char, and represent the original values with some bit flags. Then we use that member for proparallel as well. (As a result, one byte is left for future use.)

I think we'll try 2). I'd be grateful if you could point out anything I need to be careful of.

Regards
Takayuki Tsunakawa
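
(A minimal sketch of what 2) could look like. The flag names and encoding below are assumptions for illustration, not from an actual patch.)

/* Pack the two existing bools plus a proparallel encoding into one byte. */
#define BUILTIN_STRICT              0x01    /* function is "strict" */
#define BUILTIN_RETSET              0x02    /* function returns a set */
#define BUILTIN_PARALLEL_MASK       0x0c    /* two bits encode proparallel */
#define BUILTIN_PARALLEL_SAFE       0x00
#define BUILTIN_PARALLEL_RESTRICTED 0x04
#define BUILTIN_PARALLEL_UNSAFE     0x08

typedef struct
{
    Oid         foid;           /* OID of the function */
    short       nargs;          /* 0..FUNC_MAX_ARGS, or -1 if variable count */
    char        flags;          /* BUILTIN_* bits; sizeof() stays at 24 */
    const char *funcName;       /* C name of the function */
    PGFunction  func;           /* pointer to compiled function */
} FmgrBuiltin;

(Gen_fmgrtab.pl would then derive the flags byte from pg_proc.dat's proisstrict, proretset and proparallel columns.)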

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Hou, Zhijie/侯 志杰 <houzj.fnst@fujitsu.com>
> I agree that it's better to mark the functions with the correct parallel safety label,
> especially the above functions, which will be executed in parallel mode.
> It will be friendly to developers and users working on something related to
> parallel testing.
>
> So, I attached the patch to mark the above functions parallel safe.

Thank you, the patch looks good. Please register it with the next CF if not yet.

Regards
Takayuki Tsunakawa

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "houzj.fnst@fujitsu.com"
> Thank you, fmgr_info() looks like the best place to do the parallel safety check.
> Having a quick look at its callers, I didn't find any concerning place (of course,
> we can't be relieved until the regression test succeeds.) Also, with fmgr_info(),
> we don't have to find other places to add the check to deal with function calls
> in execExpr.c and execExprInterp.c. This is beautiful.
>
> But the current fmgr_info() does not check the parallel safety of built-in
> functions. It does not have the information to do that. There are two options.
> Which do you think is better? I think 2.
>
> 1) fmgr_info() reads pg_proc like for non-built-in functions
> This ruins the effort for the fast path for built-in functions. I can't imagine
> how large the adverse impact on performance would be, but I'm worried.

For approach 1): I think it could result in infinite recursion.

For example: if we first access a built-in function A which has not been cached, we need to access pg_proc. When accessing pg_proc, we internally still need some built-in function B to scan it. At this point, if B is not cached, we still need to fetch function B's parallel flag by accessing pg_proc.proparallel. This can result in infinite recursion.

So, I think we can consider approach 2).

Best regards,
houzj

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Hou, Zhijie/侯 志杰 <houzj.fnst@fujitsu.com>
> For approach 1): I think it could result in infinite recursion.
>
> For example: if we first access a built-in function A which has not been cached,
> we need to access pg_proc. When accessing pg_proc, we internally still need
> some built-in function B to scan it. At this point, if B is not cached, we still
> need to fetch function B's parallel flag by accessing pg_proc.proparallel.
> This can result in infinite recursion.
>
> So, I think we can consider approach 2)

Hmm, that makes sense. That's a problem structure similar to that of relcache. So only one choice is left, unless there's another better idea.

Regards
Takayuki Tsunakawa
On Wed, Apr 21, 2021 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> > From: Tom Lane <tgl@sss.pgh.pa.us>
> >> No. You'd have to be superuser anyway to do that, and we're not in the
> >> habit of trying to put training wheels on superusers.
>
> > Understood. However, we may add the parallel safety member in fmgr_builtins[] in another thread for parallel INSERT SELECT. I'd appreciate your comment on this if you see any concern.
>
> [ raised eyebrow... ] I find it very hard to understand why that would
> be necessary, or even a good idea. Not least because there's no spare
> room there; you'd have to incur a substantial enlargement of the
> array to add another flag. But also, that would indeed lock down
> the value of the parallel-safety flag, and that seems like a fairly
> bad idea.
>

I'm curious. The FmgrBuiltin struct includes the "strict" flag, so that would "lock down the value" of the strict flag, wouldn't it?

Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Apr 21, 2021 at 7:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > On Wed, Apr 21, 2021 at 8:12 AM tsunakawa.takay@fujitsu.com
> > <tsunakawa.takay@fujitsu.com> wrote:
> >> From: Tom Lane <tgl@sss.pgh.pa.us>
> >>> [ raised eyebrow... ] I find it very hard to understand why that would
> >>> be necessary, or even a good idea.
>
> > IIUC, the idea here is to check for parallel safety of functions at
> > some place in the code during function invocation so that if we execute
> > any parallel unsafe/restricted function via a parallel worker then we
> > error out. If so, isn't it possible to deal with built-in and
> > non-built-in functions in the same way?
>
> Yeah, one of the reasons I doubt this is a great idea is that you'd
> still have to fetch the pg_proc row for non-built-in functions.
>

So, are you suggesting that we should fetch the pg_proc row for built-in functions as well for this purpose? If not, then how do we identify the parallel safety of built-in functions in fmgr_info()?

Another idea could be that we check the parallel safety of built-in functions based on some static information. As we know the func_ids of non-parallel-safe built-in functions, we can have a function fmgr_builtin_parallel_safe() which checks whether the func_id is among the predefined func_ids of non-parallel-safe built-in functions; if not, it returns true, otherwise false. Then, we can call this new function in fmgr_info() for built-in functions. Thoughts?

--
With Regards,
Amit Kapila.
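
(A sketch of what such a function might look like. The two entries shown are illustrative; the real list would be generated from pg_proc.dat's proparallel markings and kept sorted by OID.)

#include "postgres.h"
#include "utils/fmgroids.h"

/* func_ids of non-parallel-safe builtins; illustrative entries only */
static const Oid non_parallel_safe_builtins[] = {
    F_NEXTVAL,
    F_CURRVAL,
    /* ... generated, sorted ascending ... */
};

static bool
fmgr_builtin_parallel_safe(Oid func_id)
{
    int         low = 0;
    int         high = lengthof(non_parallel_safe_builtins) - 1;

    /* binary search; absence from the list means parallel safe */
    while (low <= high)
    {
        int         mid = (low + high) / 2;

        if (func_id == non_parallel_safe_builtins[mid])
            return false;
        if (func_id < non_parallel_safe_builtins[mid])
            high = mid - 1;
        else
            low = mid + 1;
    }
    return true;
}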
Greg Nancarrow <gregn4422@gmail.com> writes:
> I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> that would "lock down the value" of the strict flag, wouldn't it?

It does, but that's much more directly a property of the function's C code than parallel-safety is.

        regards, tom lane
On Fri, Apr 23, 2021 at 9:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Nancarrow <gregn4422@gmail.com> writes:
> > I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> > that would "lock down the value" of the strict flag, wouldn't it?
>
> It does, but that's much more directly a property of the function's
> C code than parallel-safety is.

I'm not sure I agree with that, but I think having the "strict" flag in FmgrBuiltin isn't that nice either.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 23, 2021 at 9:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Greg Nancarrow <gregn4422@gmail.com> writes:
>>> I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
>>> that would "lock down the value" of the strict flag, wouldn't it?

>> It does, but that's much more directly a property of the function's
>> C code than parallel-safety is.

> I'm not sure I agree with that, but I think having the "strict" flag
> in FmgrBuiltin isn't that nice either.

Yeah, if we could readily do without it, we probably would. But the function call mechanism itself is responsible for implementing strictness, so it *has* to have that flag available.

        regards, tom lane
On Fri, Apr 23, 2021 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Greg Nancarrow <gregn4422@gmail.com> writes:
> > I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> > that would "lock down the value" of the strict flag, wouldn't it?
>
> It does, but that's much more directly a property of the function's
> C code than parallel-safety is.
>

Isn't parallel safety also a C code property? I mean, unless someone changes the built-in function's code, changing that property would be dangerous. The other thing is, even if a user is allowed to change one function's property, how will they know which other functions are called by that function and whether they are parallel-safe or not? For example, if the user wants to change the parallel-safe property of a built-in function like brin_summarize_new_values, unless she changes its code and the functions called by it like brin_summarize_range, it would be dangerous. So, isn't it better to disallow changing parallel safety for built-in functions?

Also, if the strict property of built-in functions is fixed internally, why do we allow users to change it, and is that of any help?

--
With Regards,
Amit Kapila.
On Sat, Apr 24, 2021 at 12:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 23, 2021 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Greg Nancarrow <gregn4422@gmail.com> writes:
> > > I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> > > that would "lock down the value" of the strict flag, wouldn't it?
> >
> > It does, but that's much more directly a property of the function's
> > C code than parallel-safety is.
>
> Isn't parallel safety also a C code property? I mean, unless someone
> changes the built-in function's code, changing that property would be
> dangerous. The other thing is, even if a user is allowed to change one
> function's property, how will they know which other functions are
> called by that function and whether they are parallel-safe or not? For
> example, if the user wants to change the parallel-safe property of
> a built-in function like brin_summarize_new_values, unless she changes its
> code and the functions called by it like brin_summarize_range, it
> would be dangerous. So, isn't it better to disallow changing parallel
> safety for built-in functions?
>
> Also, if the strict property of built-in functions is fixed
> internally, why do we allow users to change it, and is that of any help?
>

Yes, I'd like to know too. I think it would make more sense to disallow changing properties like strict/parallel-safety on built-in functions.

Also, with sufficient privileges, a built-in function can be redefined, yet the original function (whose info is cached in FmgrBuiltins[], from build time) is always invoked, not the newly-defined version.

Regards,
Greg Nancarrow
Fujitsu Australia

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "houzj.fnst@fujitsu.com"
> >>> I'm curious. The FmgrBuiltin struct includes the "strict" flag, so
> >>> that would "lock down the value" of the strict flag, wouldn't it?
>
> >> It does, but that's much more directly a property of the function's C
> >> code than parallel-safety is.
>
> > I'm not sure I agree with that, but I think having the "strict" flag
> > in FmgrBuiltin isn't that nice either.
>
> Yeah, if we could readily do without it, we probably would. But the function
> call mechanism itself is responsible for implementing strictness, so it *has* to
> have that flag available.

So, if we do not want to lock down the parallel safety of built-in functions, it seems we can try to fetch proparallel from pg_proc for built-in functions in fmgr_info_cxt_security too. To avoid a recursive safety check when fetching proparallel from pg_proc, we can add a global variable to mark whether we are in a recursive state, and we skip the safety check while in that state. In this approach, parallel safety will not be locked down, and there are no new members in FmgrBuiltin.

Attaching the patch for this approach [0001-approach-1]. Thoughts?

I also attached a patch for another approach [0001-approach-2], which adds parallel safety to FmgrBuiltin. This approach seems faster, and we can combine some bool members into a bit flag to avoid enlarging the FmgrBuiltin array, though it will lock down the parallel safety of built-in functions.

Best regards,
houzj
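
(A sketch of the approach-1 recursion guard described above. Names are assumed for illustration and may differ from the attached patch; error recovery for the flag, which is where PG_TRY() comes in below, is glossed over.)

#include "postgres.h"
#include "access/parallel.h"
#include "access/xact.h"
#include "catalog/pg_proc.h"
#include "utils/lsyscache.h"

/* guard so the builtins used by the pg_proc lookup are not re-checked */
static bool in_parallel_safety_check = false;

static void
check_builtin_parallel_safety(Oid func_id)
{
    char        proparallel;

    if (!IsInParallelMode() || in_parallel_safety_check)
        return;

    /*
     * The syscache lookup below may itself invoke built-in functions;
     * the flag suppresses the check for those inner calls.  If the
     * lookup errors out the flag stays set, so a real patch would need
     * PG_TRY() or a reset at transaction abort.
     */
    in_parallel_safety_check = true;
    proparallel = func_parallel(func_id);
    in_parallel_safety_check = false;

    if (proparallel == PROPARALLEL_UNSAFE ||
        (proparallel == PROPARALLEL_RESTRICTED && IsParallelWorker()))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                 errmsg("parallel-unsafe function \"%s\" called in parallel mode",
                        get_func_name(func_id))));
}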
On Fri, Apr 23, 2021 at 10:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Isn't parallel safety also a C code property?

In my opinion, yes.

> So, isn't it better to disallow changing parallel
> safety for built-in functions?

Superusers can do a lot of DML operations on the system catalogs that are manifestly unsafe. I think we should really consider locking that down somehow, but I doubt it makes sense to treat this case separately from all the others. What do you think will happen if you change proargtypes?

> Also, if the strict property of built-in functions is fixed
> internally, why do we allow users to change it, and is that of any help?

One real application of allowing these sorts of changes is letting users correct things that were done wrong originally without waiting for a new major release.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Apr 28, 2021 at 9:42 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> So, if we do not want to lock down the parallel safety of built-in functions,
> it seems we can try to fetch proparallel from pg_proc for built-in functions
> in fmgr_info_cxt_security too. To avoid a recursive safety check when fetching
> proparallel from pg_proc, we can add a global variable to mark whether we are
> in a recursive state, and we skip the safety check while in that state. In this
> approach, parallel safety will not be locked down, and there are no new members
> in FmgrBuiltin.
>
> Attaching the patch for this approach [0001-approach-1]. Thoughts?

This seems to be full of complicated if-tests that don't seem necessary and aren't explained by the comments. Also, introducing a system cache lookup here seems completely unacceptable from a reliability point of view, and I bet it's not too good for performance, either.

> I also attached a patch for another approach [0001-approach-2], which adds
> parallel safety to FmgrBuiltin. This approach seems faster, and we can combine
> some bool members into a bit flag to avoid enlarging the FmgrBuiltin array,
> though it will lock down the parallel safety of built-in functions.

This doesn't seem like a good idea either.

I really don't understand what problem any of this is intended to solve. Bharath's analysis above seems right on point to me. I think if anybody is writing a patch that requires that this be changed in this way, that person is probably doing something wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, May 5, 2021 at 5:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Apr 23, 2021 at 10:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Isn't parallel safety also a C code property?
>
> > Also, if the strict property of built-in functions is fixed
> > internally, why do we allow users to change it, and is that of any help?
>
> One real application of allowing these sorts of changes is letting
> users correct things that were done wrong originally without waiting
> for a new major release.
>

Problem is, for built-in functions, the changes are allowed, but for some properties (like strict) the allowed changes don't actually take effect (this is what Amit was referring to - so why allow those changes?).

It's because some of the function properties are cached in FmgrBuiltins[] (for a "fast-path" lookup for built-ins), according to their state at build time (from pg_proc.dat), but ALTER FUNCTION is just changing them in the system catalogs. Also, with sufficient privileges, a built-in function can be redefined, yet the original function (whose info is cached in FmgrBuiltins[]) is always invoked, not the newly-defined version.

Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, May 4, 2021 at 11:47 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> Problem is, for built-in functions, the changes are allowed, but for
> some properties (like strict) the allowed changes don't actually take
> effect (this is what Amit was referring to - so why allow those
> changes?).
> It's because some of the function properties are cached in
> FmgrBuiltins[] (for a "fast-path" lookup for built-ins), according to
> their state at build time (from pg_proc.dat), but ALTER FUNCTION is
> just changing them in the system catalogs. Also, with sufficient
> privileges, a built-in function can be redefined, yet the original
> function (whose info is cached in FmgrBuiltins[]) is always invoked,
> not the newly-defined version.

I agree. I think that's not ideal. I think we should consider putting some more restrictions on updating the system catalogs, and I also think that if we can get strict out of needing to be part of FmgrBuiltins[] that would be good. But what I don't agree with is the idea that since strict already has this problem, it's OK to do the same thing with parallel-safety. That seems to me to be making a bad situation worse, and I can't see what problem it actually solves.

--
Robert Haas
EDB: http://www.enterprisedb.com

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Robert Haas <robertmhaas@gmail.com>
> On Tue, May 4, 2021 at 11:47 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > Problem is, for built-in functions, the changes are allowed, but for
> > some properties (like strict) the allowed changes don't actually take
> > effect (this is what Amit was referring to - so why allow those
> > changes?).
> > It's because some of the function properties are cached in
> > FmgrBuiltins[] (for a "fast-path" lookup for built-ins), according to
> > their state at build time (from pg_proc.dat), but ALTER FUNCTION is
> > just changing them in the system catalogs. Also, with sufficient
> > privileges, a built-in function can be redefined, yet the original
> > function (whose info is cached in FmgrBuiltins[]) is always invoked,
> > not the newly-defined version.
>
> I agree. I think that's not ideal. I think we should consider putting
> some more restrictions on updating the system catalogs, and I also
> think that if we can get strict out of needing to be part of
> FmgrBuiltins[] that would be good. But what I don't agree with is the
> idea that since strict already has this problem, it's OK to do the
> same thing with parallel-safety. That seems to me to be making a bad
> situation worse, and I can't see what problem it actually solves.

Let me divide the points:

(1) Is it better to get hardcoded function properties out of fmgr_builtins[]?

There's little worth in doing so or thinking about it. It's no business for users to change system objects, in this case system functions.

Also, hardcoding is a worthwhile strategy for good performance or other inevitable reasons. Postgres is using it, as in the system catalog relcache below.

[relcache.c]
/*
 * hardcoded tuple descriptors, contents generated by genbki.pl
 */
static const FormData_pg_attribute Desc_pg_class[Natts_pg_class] = {Schema_pg_class};
static const FormData_pg_attribute Desc_pg_attribute[Natts_pg_attribute] = {Schema_pg_attribute};
...

(2) Should it be disallowed for users to change system function properties with ALTER FUNCTION?

Maybe yes, but it's not an important issue for achieving parallel INSERT SELECT at the moment. So, I think this can be discussed in an independent separate thread.

As a reminder, Postgres has safeguards against modifying system objects as follows.

test=# drop table^C
test=# drop function pg_wal_replay_pause();
ERROR:  cannot drop function pg_wal_replay_pause() because it is required by the database system
test=# drop table pg_largeobject;
ERROR:  permission denied: "pg_largeobject" is a system catalog

OTOH, Postgres doesn't disallow changing the system table column values directly, such as UPDATE pg_proc SET .... But it's warned in the manual that such operations are dangerous. So, we don't have to care about it.

Chapter 52. System Catalogs
https://www.postgresql.org/docs/devel/catalogs.html

"You can drop and recreate the tables, add columns, insert and update values, and severely mess up your system that way. Normally, one should not change the system catalogs by hand, there are normally SQL commands to do that. (For example, CREATE DATABASE inserts a row into the pg_database catalog — and actually creates the database on disk.) There are some exceptions for particularly esoteric operations, but many of those have been made available as SQL commands over time, and so the need for direct manipulation of the system catalogs is ever decreasing."

(3) Why do we want to have parallel-safety in fmgr_builtins[]?

As proposed in this thread and/or "Parallel INSERT SELECT take 2", we thought of detecting parallel unsafe function execution during SQL statement execution, instead of imposing much overhead to check parallel safety during query planning. Specifically, we add a parallel safety check in fmgr_info() and/or FunctionCallInvoke().

(Alternatively, I think we can conclude that we assume parallel unsafe built-in functions won't be used in parallel DML. In that case, we don't change FmgrBuiltin and we just skip the parallel safety check for built-in functions when the function is called. Would you buy this?)

Regards
Takayuki Tsunakawa
On Wed, May 5, 2021 at 7:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, May 4, 2021 at 11:47 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > Problem is, for built-in functions, the changes are allowed, but for
> > some properties (like strict) the allowed changes don't actually take
> > effect (this is what Amit was referring to - so why allow those
> > changes?).
> > It's because some of the function properties are cached in
> > FmgrBuiltins[] (for a "fast-path" lookup for built-ins), according to
> > their state at build time (from pg_proc.dat), but ALTER FUNCTION is
> > just changing them in the system catalogs. Also, with sufficient
> > privileges, a built-in function can be redefined, yet the original
> > function (whose info is cached in FmgrBuiltins[]) is always invoked,
> > not the newly-defined version.
>
> I agree. I think that's not ideal. I think we should consider putting
> some more restrictions on updating the system catalogs, and I also
> think that if we can get strict out of needing to be part of
> FmgrBuiltins[] that would be good. But what I don't agree with is the
> idea that since strict already has this problem, it's OK to do the
> same thing with parallel-safety. That seems to me to be making a bad
> situation worse, and I can't see what problem it actually solves.
>

The idea here is to check for parallel safety of functions at some place in the code during function invocation so that if we execute any parallel unsafe/restricted function via a parallel worker then we error out. I think that is a good safety net, especially if we can do it with some simple check. Now, we already have pg_proc information in fmgr_info_cxt_security for non-built-in functions, so we can check that and error out if the unsafe function is invoked in parallel mode. It has been observed that we were calling some unsafe functions in parallel mode in the regression tests, which is caught by such a check.

I think here the main challenge is to do a similar check for built-in functions, and one of the ideas to do that was to extend FmgrBuiltins to cache that information. I see why that idea is not good, and maybe we can see if there is some other place where we already fetch pg_proc for built-in functions and whether we can have such a check at that place. If that is not feasible then we can probably have such a check just for non-built-in functions, as that seems straightforward.

--
With Regards,
Amit Kapila.
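
(For the non-built-in path, such a check is cheap because fmgr_info_cxt_security() already has the pg_proc tuple in hand. A sketch, with placement and message wording assumed:)

/*
 * Inside fmgr_info_cxt_security(), the tuple is already fetched:
 *     procedureTuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(functionId));
 *     procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple);
 * so the check can reuse it without any extra lookup.
 */
if (IsInParallelMode() &&
    (procedureStruct->proparallel == PROPARALLEL_UNSAFE ||
     (procedureStruct->proparallel == PROPARALLEL_RESTRICTED &&
      IsParallelWorker())))
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
             errmsg("parallel-unsafe function \"%s\" called in parallel mode",
                    NameStr(procedureStruct->proname))));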

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Robert Haas <robertmhaas@gmail.com>
> On Wed, Apr 28, 2021 at 9:42 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > So, if we do not want to lock down the parallel safety of built-in functions,
> > it seems we can try to fetch proparallel from pg_proc for built-in functions
> > in fmgr_info_cxt_security too. To avoid a recursive safety check when fetching
> > proparallel from pg_proc, we can add a global variable to mark whether we are
> > in a recursive state, and we skip the safety check while in that state. In this
> > approach, parallel safety will not be locked down, and there are no new members
> > in FmgrBuiltin.
> >
> > Attaching the patch for this approach [0001-approach-1]. Thoughts?
>
> This seems to be full of complicated if-tests that don't seem
> necessary and aren't explained by the comments. Also, introducing a
> system cache lookup here seems completely unacceptable from a
> reliability point of view, and I bet it's not too good for
> performance, either.

Agreed. Also, PG_TRY() would be relatively heavyweight here. I'm inclined to avoid this approach.

> > I also attached a patch for another approach [0001-approach-2], which adds
> > parallel safety to FmgrBuiltin. This approach seems faster, and we can combine
> > some bool members into a bit flag to avoid enlarging the FmgrBuiltin array,
> > though it will lock down the parallel safety of built-in functions.
>
> This doesn't seem like a good idea either.

This looks good to me. What makes you think so?

That said, I actually think we want to avoid even this change. That is, I'm wondering if we can skip the parallel safety check of built-in functions.

Can anyone think of the need to check the parallel safety of built-in functions in the context of parallel INSERT SELECT? The planner already checks (or can check) the parallel safety of the SELECT part with max_parallel_hazard(). Regarding the INSERT part, we're trying to rely on the parallel safety of the target table that the user specified with CREATE/ALTER TABLE. I don't see where we need to check the parallel safety of built-in functions.

Regards
Takayuki Tsunakawa
On Wed, May 5, 2021 at 10:54 PM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:
> (1) Is it better to get hardcoded function properties out of fmgr_builtins[]?
> There's little worth in doing so or thinking about it. It's no business for users to change system objects, in this case system functions.

I don't entirely agree with this. Whether or not users have any business changing system functions, it's better to have one source of truth than two. Now that being said, this is not a super-important problem for us to go solve, and hard-coding a certain amount of stuff is probably necessary to allow the system to bootstrap itself. So for me it's one of those things that is in a grey area: if someone showed up with a patch to make it better, I'd be happy. But I probably wouldn't spend much time on writing such a patch unless it solved some other problem that I cared about.

> (3) Why do we want to have parallel-safety in fmgr_builtins[]?
> As proposed in this thread and/or "Parallel INSERT SELECT take 2", we thought of detecting parallel unsafe function execution during SQL statement execution, instead of imposing much overhead to check parallel safety during query planning. Specifically, we add a parallel safety check in fmgr_info() and/or FunctionCallInvoke().

I haven't read that thread, but I don't understand how that can work. The reason we need to detect it at plan time is because we might need to use a different plan. At execution time it's too late for that.

Also, it seems potentially quite expensive. A query may be planned once and executed many times. Also, a single query execution may call the same SQL function many times. I think we don't want to incur the overhead of an extra syscache lookup every time anyone calls any function. A very simple expression like a+b+c+d+e involves four function calls, and + need not be a built-in, if the data type is user-defined. And that might be happening for every row in a table with millions of rows.

> (Alternatively, I think we can conclude that we assume parallel unsafe built-in functions won't be used in parallel DML. In that case, we don't change FmgrBuiltin and we just skip the parallel safety check for built-in functions when the function is called. Would you buy this?)

I don't really understand this idea. There's no such thing as parallel DML, is there? There's just DML, which we must decide whether can be done in parallel or not based on, among other things, the parallel-safety markings of the functions it contains. Maybe I am not understanding you correctly, but it seems like you're suggesting that in some cases we can just assume that the user hasn't done something parallel-unsafe without making any attempt to check it. I don't think I could support that.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, May 6, 2021 at 3:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> The idea here is to check for parallel safety of functions at
> some place in the code during function invocation so that if we execute
> any parallel unsafe/restricted function via a parallel worker then we
> error out. I think that is a good safety net, especially if we can do
> it with some simple check. Now, we already have pg_proc information in
> fmgr_info_cxt_security for non-built-in functions, so we can check
> that and error out if the unsafe function is invoked in parallel mode.
> It has been observed that we were calling some unsafe functions in
> parallel mode in the regression tests, which is caught by such a check.

I see your point, but I am not convinced. As I said to Tsunakawa-san, doing the check here seems expensive. Also, I had the idea in mind that parallel-safety should work like volatility. We don't check at runtime whether a volatile function is being called in a context where volatile functions are not supposed to be used. If for example you try to call a volatile function directly from an index expression I believe you will get an error. But if the index expression calls an immutable function and then that function internally calls something volatile, you don't get an error. Now it might not be a good idea: you could end up with a broken index. But that's your fault for mislabeling the function you used.

Sometimes this is actually quite useful. You might know that, while the function is in general volatile, it is immutable in the particular way that you are using it. Or, perhaps, you are using the volatile function incidentally and it doesn't affect the output of your function at all. Or, maybe you actually want to build an index that might break, and then it's up to you to rebuild the index if and when that is required. Users do this kind of thing all the time, I think, and would be unhappy if we started checking it more rigorously than we do today.

Now, I don't see why the same idea can't or shouldn't apply to parallel-safety. If you call a parallel-unsafe function in a parallel context, it's pretty likely that you are going to get an error, and so you might not want to do it. If the function is written in C, it could even cause horrible things to happen so that you crash the whole backend or something, but I tried to set things up so that for built-in functions you'll just get an error. But on the other hand, maybe the parallel-unsafe function you are calling is not parallel-unsafe in all cases. If you want to create a wrapper function that is labelled parallel-safe and try to make it so that it only calls the parallel-unsafe function in the cases where there's no safety problem, that's up to you!

It's possible that I had the wrong idea here, so maybe the question deserves more thought, but I wanted to explain what my thought process was.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, May 6, 2021 at 5:26 PM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:
>
> Can anyone think of the need to check the parallel safety of built-in functions in the context of parallel INSERT SELECT? The planner already checks (or can check) the parallel safety of the SELECT part with max_parallel_hazard(). Regarding the INSERT part, we're trying to rely on the parallel safety of the target table that the user specified with CREATE/ALTER TABLE. I don't see where we need to check the parallel safety of built-in functions.
>

Yes, I can certainly think of a reason to do this. The idea, for the approach being discussed, is to allow the user to declare parallel-safety on a table, but then to catch any possible violations of this at runtime (as opposed to adding additional parallel-safety checks at planning time). So for INSERT with parallel SELECT, for example (which runs in parallel mode), the execution of index expressions, column-default expressions, check constraints etc. may end up invoking functions (built-in or otherwise) that are NOT parallel-safe - so we could choose to error out in this case when these violations are detected.

As far as I can see, this checking of function parallel-safety can be done with little overhead to the current code - it already gets proc information from the system cache for non-built-in functions, and for built-in functions it could store the parallel-safety status in FmgrBuiltin and simply get it from there (I don't think we should be allowing changes to built-in function properties - currently it is allowed, but it doesn't work properly).

The other option is to just blindly trust the parallel-safety declaration on tables, and whatever happens at runtime happens.

Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, May 6, 2021 at 4:35 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, May 6, 2021 at 3:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > The idea here is to check for parallel safety of functions at
> > some place in the code during function invocation so that if we execute
> > any parallel unsafe/restricted function via a parallel worker then we
> > error out. I think that is a good safety net, especially if we can do
> > it with some simple check. Now, we already have pg_proc information in
> > fmgr_info_cxt_security for non-built-in functions, so we can check
> > that and error out if the unsafe function is invoked in parallel mode.
> > It has been observed that we were calling some unsafe functions in
> > parallel mode in the regression tests, which is caught by such a check.
>
> I see your point, but I am not convinced. As I said to Tsunakawa-san,
> doing the check here seems expensive.

If I read your email correctly, then you are saying it is expensive based on the idea that we need to perform an extra syscache lookup, but actually for non-built-in functions we already have the parallel-safety information, so such a check should not incur a significant cost.

> Also, I had the idea in mind
> that parallel-safety should work like volatility. We don't check at
> runtime whether a volatile function is being called in a context where
> volatile functions are not supposed to be used. If for example you try
> to call a volatile function directly from an index expression I
> believe you will get an error. But if the index expression calls an
> immutable function and then that function internally calls something
> volatile, you don't get an error. Now it might not be a good idea: you
> could end up with a broken index. But that's your fault for
> mislabeling the function you used.
>
> Sometimes this is actually quite useful. You might know that, while
> the function is in general volatile, it is immutable in the particular
> way that you are using it. Or, perhaps, you are using the volatile
> function incidentally and it doesn't affect the output of your
> function at all. Or, maybe you actually want to build an index that
> might break, and then it's up to you to rebuild the index if and when
> that is required. Users do this kind of thing all the time, I think,
> and would be unhappy if we started checking it more rigorously than we
> do today.
>
> Now, I don't see why the same idea can't or shouldn't apply to
> parallel-safety. If you call a parallel-unsafe function in a parallel
> context, it's pretty likely that you are going to get an error, and so
> you might not want to do it. If the function is written in C, it could
> even cause horrible things to happen so that you crash the whole
> backend or something, but I tried to set things up so that for
> built-in functions you'll just get an error. But on the other hand,
> maybe the parallel-unsafe function you are calling is not
> parallel-unsafe in all cases. If you want to create a wrapper function
> that is labelled parallel-safe and try to make it so that it only calls
> the parallel-unsafe function in the cases where there's no safety
> problem, that's up to you!
>

I think it is difficult to say for what purpose a parallel-unsafe function got called in a parallel context, so if we give an error in cases where it could otherwise lead to a crash or cause other horrible things, users will probably appreciate us. OTOH, if the parallel-safety labeling is wrong (a parallel-safe function is marked parallel-unsafe) and we gave an error in such a case, the user can always change the parallel-safety attribute by using ALTER FUNCTION.

Now, if adding such a check is costly or needs some major re-design then it might not be worth it, whereas I don't think that is the case for non-built-in function invocation.

--
With Regards,
Amit Kapila.

RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From: "tsunakawa.takay@fujitsu.com"
From: Robert Haas <robertmhaas@gmail.com> > On Wed, May 5, 2021 at 10:54 PM tsunakawa.takay@fujitsu.com > <tsunakawa.takay@fujitsu.com> wrote: > > As proposed in this thread and/or "Parallel INSERT SELECT take 2", we > thought of detecting parallel unsafe function execution during SQL statement > execution, instead of imposing much overhead to check parallel safety during > query planning. Specifically, we add parallel safety check in fmgr_info() > and/or FunctionCallInvoke(). > > I haven't read that thread, but I don't understand how that can work. > The reason we need to detect it at plan time is because we might need > to use a different plan. At execution time it's too late for that. (I forgot to say this in my previous email. Robert-san, thank you very much for taking time to look at this and giving feedback. It was sad that we had to revert our parallel INSERT SELECT for redesign at the very end of the last CF. We needadvice and suggestions from knowledgeable and thoughtful people like Tom-san, Andres-san and you in early stages to notrepeat the tragedy.) I'd really like you to have a look at the first mail in [1], and to get your feedback like "this part should be like ...instead" and "this part would probably work, I think." Without feedback from leading developers, I'm somewhat at a lossif and how we can proceed with the proposed approach. To put it shortly, we found that it can take non-negligible time for the planner to check the parallel safety of the targettable of INSERT SELECT when it has many (hundreds or thousands of) partitions. The check also added much complicatedcode, too. So, we got inclined to take Tom-san's suggestion -- let the user specify the parallel safety of thetarget table with CREATE/ALTER TABLE and the planner just decides a query plan based on it. Caching the results of parallelsafety checks in relcache or a new shared hash table didn't seem to work well to me, or it should be beyond my brainat least. We may think that it's okay to just believe the user-specified parallel safety. But I thought we could step up and do ourbest to check the parallel safety during statement execution, if it's not very invasive in terms of performance and codecomplexity. The aforementioned idea is that if the parallel processes find the called functions parallel unsafe, theyerror out. All ancillary objects of the target table, data types, constraints, indexes, triggers, etc., come down tosome UDF, so it should be enough to check the parallel safety when the UDF is called. > Also, it seems potentially quite expensive. A query may be planned > once and executed many times. Also, a single query execution may call > the same SQL function many times. I think we don't want to incur the > overhead of an extra syscache lookup every time anyone calls any > function. A very simple expression like a+b+c+d+e involves four > function calls, and + need not be a built-in, if the data type is > user-defined. And that might be happening for every row in a table > with millions of rows. We (optimistically) expect that the overhead won't be serious, because the parallel safety information is already at handin the FmgrInfo struct when the function is called. We don't have to look up the syscache every time the function iscalled. Of course, adding even a single if statement may lead to a disaster in a critical path, so we need to assess the performance. I'd also appreciate if you could suggest some good workload we should experiment in the thread above. 
[1] Parallel INSERT SELECT take 2
https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709@TYAPR01MB2990.jpnprd01.prod.outlook.com

Regards
Takayuki Tsunakawa
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
> > Sometimes this is actually quite useful. You might know that, while
> > the function is in general volatile, it is immutable in the particular
> > way that you are using it. Or, perhaps, you are using the volatile
> > function incidentally and it doesn't affect the output of your
> > function at all. Or, maybe you actually want to build an index that
> > might break, and then it's up to you to rebuild the index if and when
> > that is required. Users do this kind of thing all the time, I think,
> > and would be unhappy if we started checking it more rigorously than we
> > do today.
> >
> > Now, I don't see why the same idea can't or shouldn't apply to
> > parallel-safety. If you call a parallel-unsafe function in a parallel
> > context, it's pretty likely that you are going to get an error, and so
> > you might not want to do it. If the function is written in C, it could
> > even cause horrible things to happen so that you crash the whole
> > backend or something, but I tried to set things up so that for
> > built-in functions you'll just get an error. But on the other hand,
> > maybe the parallel-unsafe function you are calling is not
> > parallel-unsafe in all cases. If you want to create a wrapper function
> > that is labelled parallel-safe and try to make sure that it only calls
> > the parallel-unsafe function in the cases where there's no safety
> > problem, that's up to you!
>
> I think it is difficult to say for what purpose a parallel-unsafe
> function got called in a parallel context, so if we give an error in
> cases where it could otherwise lead to a crash or cause other horrible
> things, users will probably appreciate us. OTOH, if the parallel-safety
> labeling is wrong (a parallel-safe function is marked parallel-unsafe)
> and we gave an error in such a case, the user can always change the
> parallel-safety attribute by using Alter Function.
> Now, if adding such a check is costly or needs some major re-design
> then probably it might not be worth it, whereas I don't think that is
> the case for non-built-in function invocation.

In the meantime, just in case someone wants to take a look at the patch for the safety check: I split the patch into 0001 (parallel safety check for user-defined functions), 0003 (parallel safety check for built-in functions), and the fix for test cases.

IMO, with such a check giving an error when a parallel-unsafe function is detected in parallel mode, it will be easier for users to discover potential threats (parallel-unsafe functions) in parallel mode. I think users are likely to invoke a parallel-unsafe function inside a parallel-safe function unintentionally, and such a check can help them detect the problem more easily.

However, the strict check limits some usages (intentional wrapper functions), as Robert-san said. To mitigate the effect of that limit, I was thinking: can we do the safety check conditionally, such as only checking the top-level function invocation, and/or introduce a GUC option to control whether to do the strict parallel safety check?
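For example, the conditional form could look roughly like this (the GUC name is hypothetical, and it reuses the hypothetical FmgrInfo.fn_parallel field sketched earlier in the thread):

/* hypothetical GUC: when off, skip the strict run-time check */
bool        strict_parallel_safety_check = true;

/* in the function-call path, e.g. fmgr_info()/FunctionCallInvoke() */
if (strict_parallel_safety_check &&
    IsInParallelMode() &&
    flinfo->fn_parallel == PROPARALLEL_UNSAFE)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
             errmsg("parallel-unsafe function called in parallel mode")));

Thoughts?

Best regards,
houzj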
Attachment
On Tue, May 11, 2021 at 12:28 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> In the meantime, just in case someone wants to take a look at the patch
> for the safety check.
>

I am not sure there is yet a consensus on exactly which cases need to be dealt with. Let me try to summarize the discussion and see if that helps. As per my understanding, the main reasons for this work are:

a. Ensure parallel-unsafe functions don't get executed in parallel mode. We do have checks to ensure that we don't select parallel mode for most cases where a parallel-unsafe function is used, but we don't have checks for input/output funcs, aggregate funcs, etc. This proposal is to detect such cases during function invocation and return an error. I think if, for some cases like aggregates or other types of functions, we allow selecting parallelism relying on the user, it is not a bad idea to detect and return an error if some parallel-unsafe function is executed in parallel mode.

b. Detect wrong parallel-safety markings. Say the user has declared some function as parallel-safe but it invokes another parallel-unsafe function.

c. The other motive is that this work can help us to enable parallelism for inserts (and maybe update/delete in the future). As is being discussed in another thread [1], we are considering allowing parallel inserts on a table based on user input and then detecting at runtime whether the insert invokes any parallel-unsafe expression. The idea is that the user will be able to specify whether a write operation is allowed in parallel on a specified relation, and we allow selecting parallelism for such writes based on that, while doing the checks for Select as we do now. There are other options, like determining the parallel-safety of writes in the planner and only then allowing parallelism, but those all seem costly. Now, I think it is not compulsory to have such checks for this particular reason, as we are relying on user input, but it will be good if we have them.

I think the last purpose (c) is still debatable even though we couldn't come up with anything better till now, but even if we leave that aside for now, I think the other reasons are good enough to have some form of checks.

Now, the proposal being discussed is to add a parallel-safety check in fmgr_info, which seems to be invoked during all function executions. We need to have access to the proparallel attribute of the function to check the parallel-safety, and that is readily available in fmgr_info for non-built-in functions because we already get the pg_proc information from the sys cache. So, I guess there is no harm in checking it when the information is readily available. However, for built-in functions that information is not readily available, as we get the required function information from FmgrBuiltin (which doesn't have parallel-safety information). For built-in functions, the following options have been discussed:

a. Extend FmgrBuiltin without increasing its size to include parallel information.

b. Enquire the pg_proc cache to get the information. Accessing this for each invocation of a built-in could be costly. We can probably incur this cost only when the built-in is invoked in parallel mode.

c. Don't add the check for built-ins. I think if we can't think of any other better way to have checks for built-ins and don't like any of (a) or (b), then there is no harm in (c). This will at least allow us to have a parallel-safety check for user-defined functions.
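To make option (b) a bit more concrete, the lookup could be deferred until we are actually in parallel mode, something like the below (just a sketch of a hypothetical helper, not part of the posted patch):

static char
builtin_func_parallel(Oid funcid)
{
    HeapTuple   tp;
    char        proparallel;

    /*
     * Fetch proparallel from the syscache.  The caller would do this
     * only when IsInParallelMode(), to avoid paying the lookup cost on
     * every built-in call.
     */
    tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
    if (!HeapTupleIsValid(tp))
        elog(ERROR, "cache lookup failed for function %u", funcid);
    proparallel = ((Form_pg_proc) GETSTRUCT(tp))->proparallel;
    ReleaseSysCache(tp);
    return proparallel;
}

Thoughts?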
[1] - https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709@TYAPR01MB2990.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
On Fri, Jun 4, 2021 at 6:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Thoughts?

As far as I can see, trying to error out at function call time if the function is parallel-unsafe doesn't fix any problem we have, and just makes the design of this part of the system less consistent with what we've done elsewhere. For example, if you create a stable function that internally calls a volatile function, you don't get an error. You can use your stable function in an index definition if you wish. That may break, but if so, that's your problem. Also, when it breaks, it probably won't blow up the entire world; you'll just have a messed-up index.

Currently, the parallel-safety stuff works the same way. If we notice that something is marked parallel-unsafe, we'll skip parallelism. But you can lie to us and claim that things are safe when they're not, and if you do, it may break, but that's your problem. Most likely your query will just error out, and there will be no worse consequences than that, though if your parallel-unsafe function is written in C, it could do horrible things like crash, which is unavoidable because C code can do anything.

Now, the reason for all of this work, as I understand it, is because we want to enable parallel inserts, and the problem there is that a parallel insert could involve a lot of different things: it might need to compute expressions, or fire triggers, or check constraints, and any of those things could be parallel-unsafe. If we enable parallelism and then find out that we need to do one of those things, we have a problem. Something probably will error out. The thing is, with this proposal, that issue is not solved. Something will definitely error out. You'll probably get the error in a different place, but nobody fires off an INSERT hoping to get one error message rather than another. What they want is for it to work. So I'm kind of confused how we ended up going in this direction, which seems to me at least to be a tangent from the real issue, and somewhat at odds with the way the rest of PostgreSQL is designed.

It seems to me that we could simply add a flag to each relation saying whether or not we think that INSERT operations - or perhaps DML operations generally - are believed to be parallel-safe for that relation. Like the marking on functions, it would be the user's responsibility to get that marking correct. If they don't, they might call a parallel-unsafe function in parallel mode, and that will probably error out. But that's no worse than what we already have in existing cases, so I don't see why it requires doing what's proposed here first.

Now, it does have the disadvantage of being not very convenient for users, who, I'm sure, would prefer that the system figure out for them automatically whether or not parallel inserts are likely to be safe, rather than making them declare it, especially since presumably the default declaration would have to be "unsafe," as it is for functions. But I don't have a better idea right now.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 7, 2021 at 7:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jun 4, 2021 at 6:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Thoughts?
>
> As far as I can see, trying to error out at function call time if the
> function is parallel-unsafe doesn't fix any problem we have, and just
> makes the design of this part of the system less consistent with what
> we've done elsewhere. For example, if you create a stable function
> that internally calls a volatile function, you don't get an error. You
> can use your stable function in an index definition if you wish. That
> may break, but if so, that's your problem. Also, when it breaks, it
> probably won't blow up the entire world; you'll just have a messed-up
> index. Currently, the parallel-safety stuff works the same way. If we
> notice that something is marked parallel-unsafe, we'll skip
> parallelism.
>

This is not true in all cases, which is one of the reasons for this thread. For example, we don't skip parallelism when I/O functions are parallel-unsafe, as is shown in the following case:

postgres=# CREATE FUNCTION text_w_default_in(cstring) RETURNS text_w_default
             AS 'textin' LANGUAGE internal STRICT IMMUTABLE;
NOTICE:  type "text_w_default" is not yet defined
DETAIL:  Creating a shell type definition.
CREATE FUNCTION
postgres=# CREATE FUNCTION text_w_default_out(text_w_default) RETURNS cstring
             AS 'textout' LANGUAGE internal STRICT IMMUTABLE;
NOTICE:  argument type text_w_default is only a shell
CREATE FUNCTION
postgres=# CREATE TYPE text_w_default (
             internallength = variable,
             input = text_w_default_in,
             output = text_w_default_out,
             alignment = int4,
             default = 'zippo');
CREATE TYPE
postgres=# CREATE TABLE default_test (f1 text_w_default, f2 int);
CREATE TABLE
postgres=# INSERT INTO default_test DEFAULT VALUES;
INSERT 0 1
postgres=# SELECT * FROM default_test;
ERROR:  parallel-safety execution violation of function "text_w_default_out" (u)

Note that the error is raised after applying the patch; without the patch, the above won't show any error (the error message could be improved here). Such cases can lead to unpredictable behavior without the patch because we won't be able to detect the execution of parallel-unsafe functions. There are similar examples in the regression tests. Now, one way to deal with similar cases could be to document them and say we don't consider parallel-safety in such cases, and the other way is to detect such cases and error out. Yet another way could be to somehow check these cases as well before enabling parallelism, but I thought these cases fall in a similar category to an aggregate's support functions.

> But you can lie to us and claim that things are safe when
> they're not, and if you do, it may break, but that's your problem.
> Most likely your query will just error out, and there will be no
> worse consequences than that, though if your parallel-unsafe function
> is written in C, it could do horrible things like crash, which is
> unavoidable because C code can do anything.
>

That is true, but I was worried about cases where users didn't lie to us but we still allowed those to choose parallelism.

> Now, the reason for all of this work, as I understand it, is because
> we want to enable parallel inserts, and the problem there is that a
> parallel insert could involve a lot of different things: it might need
> to compute expressions, or fire triggers, or check constraints, and
> any of those things could be parallel-unsafe.
> If we enable parallelism
> and then find out that we need to do one of those things, we have a
> problem. Something probably will error out. The thing is, with this
> proposal, that issue is not solved. Something will definitely error
> out. You'll probably get the error in a different place, but nobody
> fires off an INSERT hoping to get one error message rather than
> another. What they want is for it to work. So I'm kind of confused how
> we ended up going in this direction which seems to me at least to be a
> tangent from the real issue, and somewhat at odds with the way the
> rest of PostgreSQL is designed.
>
> It seems to me that we could simply add a flag to each relation saying
> whether or not we think that INSERT operations - or perhaps DML
> operations generally - are believed to be parallel-safe for that
> relation.
>

This is exactly the direction we are trying to pursue. The proposal [1] has semantics like:

CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };

This property is recorded in pg_class's relparallel column as 'u', 'r', or 's', just like pg_proc's proparallel. The default is UNSAFE. This might require some bike-shedding to decide how exactly we want to expose it to the user, but I think it is along the lines of what you have described here.

> Like the marking on functions, it would be the user's
> responsibility to get that marking correct. If they don't, they might
> call a parallel-unsafe function in parallel mode, and that will
> probably error out. But that's no worse than what we already have in
> existing cases, so I don't see why it requires doing what's proposed
> here first.
>

I agree it is not necessarily required if we give the responsibility to the user, but this might give a better user experience. OTOH, without this as well, as you said, it won't be any worse than the current behavior. But that was not the sole motivation of this proposal, as explained by example earlier in this email.

> Now, it does have the disadvantage of being not very
> convenient for users, who, I'm sure, would prefer that the system
> figure out for them automatically whether or not parallel inserts are
> likely to be safe, rather than making them declare it, especially
> since presumably the default declaration would have to be "unsafe," as
> it is for functions.
>

To improve the user experience in this regard, the proposal [1] provides a function pg_get_parallel_safety(oid) with which users can determine whether it is safe to enable parallelism. Of course, after the user has checked with that function, one could still add some unsafe constraints to the table by altering it, but it will still be an aid in enabling parallelism on a relation.
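To show how the planner side would consume the proposed marking, the check could be as simple as the following (sketch only; get_rel_parallel_dml() is a hypothetical lsyscache-style accessor for the proposed relparallel column):

    /* In the planner, when considering parallelism for INSERT: */
    if (parse->commandType == CMD_INSERT)
    {
        RangeTblEntry *rte = rt_fetch(parse->resultRelation, parse->rtable);

        /* 'u'/'r' would disable or restrict parallelism, as for functions */
        if (get_rel_parallel_dml(rte->relid) != 's')
            glob->parallelModeOK = false;
    }

[1] - https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709@TYAPR01MB2990.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.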
On Mon, Jun 7, 2021 at 11:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Note the error is raised after applying the patch, without the patch, > the above won't show any error (error message could be improved here). > Such cases can lead to unpredictable behavior without a patch because > we won't be able to detect the execution of parallel-unsafe functions. > There are similar examples from regression tests. Now, one way to deal > with similar cases could be that we document them and say we don't > consider parallel-safety in such cases and the other way is to detect > such cases and error out. Yet another way could be that we somehow try > to check these cases as well before enabling parallelism but I thought > these cases fall in the similar category as aggregate's support > functions. I'm not very excited about the idea of checking type input and type output functions. It's hard to imagine someone wanting to do something parallel-unsafe in such a function, unless they're just trying to prove a point. So I don't think checking it would be a good investment of CPU cycles. If we do anything at all, I'd vote for just documenting that such functions should be parallel-safe and that their parallel-safety marks are not checked when they are used as type input/output functions. Perhaps we ought to document the same thing with regard to opclass support functions, another place where it's hard to imagine a realistic use case for doing something parallel-unsafe. In the case of aggregates, I see the issues slightly differently. I don't know that it's super-likely that someone would want to create a parallel-unsafe aggregate function, but I think there should be a way to do it, just in case. However, if somebody wants that, they can just mark the aggregate itself unsafe. There's no benefit for the user to marking the aggregate safe and the support functions unsafe and hoping that the system figures it out somehow. In my opinion, you're basically taking too pure a view of this. We're not trying to create a system that does such a good job checking parallel safety markings that nobody can possibly find a thing that isn't checked no matter how hard they poke around the dark corners of the system. Or at least we shouldn't be trying to do that. We should be trying to create a system that works well in practice, and gives people the flexibility to easily avoid parallelism when they have a query that is parallel-unsafe, while still getting the benefit of parallelism the rest of the time. I don't know what all the cases you've uncovered are, and maybe there's something in there that I'd be more excited about changing if I knew what it was, but the particular problems you're mentioning here seem more theoretical than real to me. -- Robert Haas EDB: http://www.enterprisedb.com
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, June 8, 2021 10:51 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jun 7, 2021 at 11:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Note the error is raised after applying the patch, without the patch,
> > the above won't show any error (error message could be improved here).
> > Such cases can lead to unpredictable behavior without a patch because
> > we won't be able to detect the execution of parallel-unsafe functions.
> > There are similar examples from regression tests. Now, one way to deal
> > with similar cases could be that we document them and say we don't
> > consider parallel-safety in such cases and the other way is to detect
> > such cases and error out. Yet another way could be that we somehow try
> > to check these cases as well before enabling parallelism but I thought
> > these cases fall in the similar category as aggregate's support
> > functions.
>
> I'm not very excited about the idea of checking type input and type output
> functions. It's hard to imagine someone wanting to do something
> parallel-unsafe in such a function, unless they're just trying to prove a point. So
> I don't think checking it would be a good investment of CPU cycles. If we do
> anything at all, I'd vote for just documenting that such functions should be
> parallel-safe and that their parallel-safety marks are not checked when they are
> used as type input/output functions. Perhaps we ought to document the same
> thing with regard to opclass support functions, another place where it's hard to
> imagine a realistic use case for doing something parallel-unsafe.
>
> In the case of aggregates, I see the issues slightly differently. I don't know that
> it's super-likely that someone would want to create a parallel-unsafe
> aggregate function, but I think there should be a way to do it, just in case.
> However, if somebody wants that, they can just mark the aggregate itself
> unsafe. There's no benefit for the user to marking the aggregate safe and the
> support functions unsafe and hoping that the system figures it out somehow.
>
> In my opinion, you're basically taking too pure a view of this. We're not trying to
> create a system that does such a good job checking parallel safety markings
> that nobody can possibly find a thing that isn't checked no matter how hard
> they poke around the dark corners of the system. Or at least we shouldn't be
> trying to do that. We should be trying to create a system that works well in
> practice, and gives people the flexibility to easily avoid parallelism when they
> have a query that is parallel-unsafe, while still getting the benefit of parallelism
> the rest of the time.
>
> I don't know what all the cases you've uncovered are, and maybe there's
> something in there that I'd be more excited about changing if I knew what it
> was, but the particular problems you're mentioning here seem more
> theoretical than real to me.

I think another case where a parallel-unsafe function could be invoked in parallel mode is the TEXT SEARCH TEMPLATE's init_function or lexize_function, because currently the planner does not check the safety of these functions. Please see the example below [1].

I am not sure whether users would use a parallel-unsafe function as an init_function or lexize_function, but if they do, it could cause unexpected results. Does it make sense to add some check for the init_function or lexize_function, or to document this together with the type input/output and opclass support functions?
[1]----------------------------EXAMPLE------------------------------------

CREATE FUNCTION dsnowball_init(INTERNAL)
    RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_init'
LANGUAGE C STRICT;

CREATE FUNCTION dsnowball_lexize(INTERNAL, INTERNAL, INTERNAL, INTERNAL)
    RETURNS INTERNAL AS '$libdir/dict_snowball', 'dsnowball_lexize'
LANGUAGE C STRICT;

CREATE TEXT SEARCH TEMPLATE snowball
    (INIT = dsnowball_init, LEXIZE = dsnowball_lexize);

COMMENT ON TEXT SEARCH TEMPLATE snowball IS 'snowball stemmer';

create table pendtest (ts tsvector);
create index pendtest_idx on pendtest using gin(ts);
insert into pendtest select (to_tsvector('Lore ipsum')) from generate_series(1,10000000,1);
analyze;
set enable_bitmapscan = off;

postgres=# explain select * from pendtest where to_tsquery('345&qwerty') @@ ts;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Gather  (cost=1000.00..1168292.86 rows=250 width=31)
   Workers Planned: 2
   ->  Parallel Seq Scan on pendtest  (cost=0.00..1167267.86 rows=104 width=31)
         Filter: (to_tsquery('345&qwerty'::text) @@ ts)

-- In the example above, dsnowball_init() and dsnowball_lexize() will be
-- executed in parallel mode.
----------------------------EXAMPLE------------------------------------

Best regards,
houzj
"houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com> writes: > On Tuesday, June 8, 2021 10:51 PM Robert Haas <robertmhaas@gmail.com> > wrote: >> In my opinion, you're basically taking too pure a view of this. We're >> not trying to create a system that does such a good job checking >> parallel safety markings that nobody can possibly find a thing that >> isn't checked no matter how hard they poke around the dark corners of >> the system. Or at least we shouldn't be trying to do that. > I think another case that parallel unsafe function could be invoked in > parallel mode is the TEXT SEARCH TEMPLATE's init_function or > lexize_function. Another point worth making in this connection is what I cited earlier today in ba2c6d6ce: : ... We could imagine prohibiting SCROLL when : the query contains volatile functions, but that would be : expensive to enforce. Moreover, it could break applications : that work just fine, if they have functions that are in fact : stable but the user neglected to mark them so. So settle for : documenting the hazard. If you break an application that used to work, because the developer was careless about marking a function PARALLEL SAFE even though it actually is, I do not think you have made any friends or improved anyone's life. In fact, you could easily make things far worse, by encouraging people to mark things PARALLEL SAFE that are not. (We just had a thread about somebody marking a function immutable because they wanted effect X of that, and then whining because they also got effect Y.) There are specific cases where there's a good reason to worry. For example, if we assume blindly that domain_in() is parallel safe, we will have cause to regret that. But I don't find that to be a reason why we need to lock down everything everywhere. We need to understand the tradeoffs involved in what we check, and apply checks that are likely to avoid problems, while not being too nanny-ish. regards, tom lane
On Wed, Jun 9, 2021 at 2:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > There are specific cases where there's a good reason to worry. > For example, if we assume blindly that domain_in() is parallel > safe, we will have cause to regret that. But I don't find that > to be a reason why we need to lock down everything everywhere. > We need to understand the tradeoffs involved in what we check, > and apply checks that are likely to avoid problems, while not > being too nanny-ish. Yeah, that's exactly how I feel about it, too. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jun 9, 2021 at 9:47 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jun 9, 2021 at 2:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > There are specific cases where there's a good reason to worry.
> > For example, if we assume blindly that domain_in() is parallel
> > safe, we will have cause to regret that. But I don't find that
> > to be a reason why we need to lock down everything everywhere.
> > We need to understand the tradeoffs involved in what we check,
> > and apply checks that are likely to avoid problems, while not
> > being too nanny-ish.
>
> Yeah, that's exactly how I feel about it, too.
>

Fair enough. So, I think there is a consensus to drop this patch, and if one wants, we can document these cases. Also, we don't need it to enable parallelism for Inserts, where we are trying to pursue the approach of having a flag in pg_class which allows users to specify whether writes are allowed on a specified relation.

--
With Regards,
Amit Kapila.
On Thu, Jun 10, 2021 at 12:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > Fair enough. So, I think there is a consensus to drop this patch and > if one wants then we can document these cases. Also, we don't want it > to enable parallelism for Inserts where we are trying to pursue the > approach to have a flag in pg_class which allows users to specify > whether writes are allowed on a specified relation. +1. The question that's still on my mind a little bit is whether there's a reasonable alternative to forcing users to set a flag manually. It seems less convenient than having to do the same thing for a function, because most users probably only create functions occasionally, but creating tables seems like it's likely to be a more common operation. Plus, a function is basically a program, so it sort of feels reasonable that you might need to give the system some hints about what the program does, but that doesn't apply to a table. Now, if we forget about partitioned tables here for a moment, I don't really see why we couldn't do this computation based on the relcache entry, and then just cache the flag there? I think anything that would change the state for a plain old table would also cause some invalidation that we could notice. And I don't think that the cost of walking over triggers, constraints, etc. and computing the value we need on demand would be exorbitant. For a partitioned table, things are a lot more difficult. For one thing, the cost of computation can be a lot higher; there might be a thousand or more partitions. For another thing, computing the value could have scary side effects, like opening all the partitions, which would also mean taking locks on them and building expensive relcache entries. For a third thing, we'd have no way of knowing whether the value was still current, because an event that produces an invalidation for a partition doesn't necessarily produce any invalidation for the partitioned table. So one idea is maybe we only need an explicit flag for partitioned tables, and regular tables we can just work it out automatically. Another idea is maybe we try to solve the problems somehow so that it can also work with partitioned tables. I don't really have a great idea right at the moment, but maybe it's worth devoting some more thought to the problem. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Jun 10, 2021 at 10:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jun 10, 2021 at 12:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Fair enough. So, I think there is a consensus to drop this patch, and
> > if one wants, we can document these cases. Also, we don't need it to
> > enable parallelism for Inserts, where we are trying to pursue the
> > approach of having a flag in pg_class which allows users to specify
> > whether writes are allowed on a specified relation.
>
> +1. The question that's still on my mind a little bit is whether
> there's a reasonable alternative to forcing users to set a flag
> manually. It seems less convenient than having to do the same thing
> for a function, because most users probably only create functions
> occasionally, but creating tables seems like it's likely to be a more
> common operation. Plus, a function is basically a program, so it sort
> of feels reasonable that you might need to give the system some hints
> about what the program does, but that doesn't apply to a table.
>
> Now, if we forget about partitioned tables here for a moment, I don't
> really see why we couldn't do this computation based on the relcache
> entry, and then just cache the flag there?
>

Do we invalidate the relcache entry if someone changes, say, a trigger or some index AM function property via Alter Function (in our case from safe to unsafe, or vice versa)? Tsunakawa-San has mentioned this in his email [1] as the reason why we can't rely on caching this property in the relcache entry. I also don't see anything in AlterFunction which would suggest that we invalidate the relation with which the function might be associated via a trigger.

The other idea in this regard was to validate the parallel safety during DDL instead of relying completely on the user, but that also seems to have similar hazards, as pointed out by Tom in his email [2].

I think it would be good if there is a way we can do this without asking for user input, but if not, then we can try to provide parallel-safety info about a relation, which will slightly ease the user's job. Such a function would check the relation (and its partitions) to see whether any parallel-unsafe clause exists and return that information to the user accordingly. Now, again, if the user changes the parallel-safety property later, we won't be able to automatically reflect the same for the rel.

[1] - https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709@TYAPR01MB2990.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/1030301.1616560249%40sss.pgh.pa.us

--
With Regards,
Amit Kapila.
On Fri, Jun 11, 2021 at 12:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Do we invalidate the relcache entry if someone changes, say, a trigger
> or some index AM function property via Alter Function (in our case
> from safe to unsafe, or vice versa)? Tsunakawa-San has mentioned this
> in his email [1] as the reason why we can't rely on caching this
> property in the relcache entry. I also don't see anything in
> AlterFunction which would suggest that we invalidate the relation with
> which the function might be associated via a trigger.

Hmm. I am not sure that index AM functions really need to be checked, but triggers certainly do. I think you are correct that an ALTER FUNCTION wouldn't invalidate the relcache entry, which is I guess pretty much the same problem Tom was pointing out in the thread to which you linked.

But ... thinking out of the box as Tom suggests, what if we came up with some new kind of invalidation message that is only sent when a function's parallel-safety marking is changed? And every backend in the same database then needs to re-evaluate the parallel-safety of every relation for which it has cached a value. Such recomputations might be expensive, but they would probably also occur very infrequently. And you might even be able to make it a bit more fine-grained if it's worth the effort to worry about that: say that in addition to caching the parallel-safety of the relation, we also cache a list of the pg_proc OIDs upon which that determination depends. Then when we hear that the flag's been changed for OID 123456, we only need to invalidate the cached value for relations that depended on that pg_proc entry.

There are ways that a relation could become parallel-unsafe without changing the parallel-safety marking of any function, but perhaps all of the other ways involve a relcache invalidation?

Just brainstorming here. I might be off-track.
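To sketch the shape of it in code (entirely hypothetical: the closest existing mechanism is a PROCOID syscache callback, which fires on any pg_proc change, whereas what I'm describing is a new, rarer message, but the mechanics would look similar):

static void
parallel_safety_inval_callback(Datum arg, int cacheid, uint32 hashvalue)
{
    /*
     * Degrade every cached relation's parallel-safety state to
     * "unknown", forcing lazy recomputation on next use.  A
     * finer-grained version could compare hashvalue against a cached
     * list of pg_proc OIDs the relation's value depends on.
     */
    ResetAllParallelSafetyFlags();      /* hypothetical */
}

/* during backend cache initialization: */
CacheRegisterSyscacheCallback(PROCOID, parallel_safety_inval_callback,
                              (Datum) 0);

--
Robert Haas
EDB: http://www.enterprisedb.com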
On Sat, Jun 12, 2021 at 1:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jun 11, 2021 at 12:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Do we invalidate the relcache entry if someone changes, say, a trigger
> > or some index AM function property via Alter Function (in our case
> > from safe to unsafe, or vice versa)? Tsunakawa-San has mentioned this
> > in his email [1] as the reason why we can't rely on caching this
> > property in the relcache entry. I also don't see anything in
> > AlterFunction which would suggest that we invalidate the relation with
> > which the function might be associated via a trigger.
>
> Hmm. I am not sure that index AM functions really need to be checked,
> but triggers certainly do.
>

Why do you think we don't need to check index AM functions? Say we have an index expression that uses a function, and its parallel safety is changed; that probably also impacts whether we can do an insert in parallel, because otherwise we will end up executing some parallel-unsafe function in parallel mode during index insertion.

> I think you are correct that an ALTER
> FUNCTION wouldn't invalidate the relcache entry, which is I guess
> pretty much the same problem Tom was pointing out in the thread to
> which you linked.
>
> But ... thinking out of the box as Tom suggests, what if we came up
> with some new kind of invalidation message that is only sent when a
> function's parallel-safety marking is changed? And every backend in
> the same database then needs to re-evaluate the parallel-safety of
> every relation for which it has cached a value. Such recomputations
> might be expensive, but they would probably also occur very
> infrequently. And you might even be able to make it a bit more
> fine-grained if it's worth the effort to worry about that: say that in
> addition to caching the parallel-safety of the relation, we also cache
> a list of the pg_proc OIDs upon which that determination depends. Then
> when we hear that the flag's been changed for OID 123456, we only need
> to invalidate the cached value for relations that depended on that
> pg_proc entry.
>

Yeah, this could be one idea, but I think even if we use the pg_proc OID, we still need to check all the relcache entries to find which ones contain the invalidated OID, and that could be expensive. I wonder whether we can't directly find the relation involved and register an invalidation for it? We are able to find the relation with which a trigger function is associated during drop function via findDependentObjects, by scanning pg_depend. Assuming we are able to find the relation for the trigger function by scanning pg_depend, what kinds of problems do we envision in registering the invalidation for it?

I think we probably need to worry about the additional cost to find dependent objects, and about whether there are any race conditions in doing so, as pointed out by Tom in his email [1]. The concern related to cost could be addressed by your idea of registering such an invalidation only when the user changes the parallel safety of the function, which we don't expect to be a frequent operation. Now, the race condition I could think of here is that by the time we change the parallel-safety of a function (say making it unsafe), some other sessions might have already started processing an insert on a relation with which that function is associated via a trigger or check constraint, in which case there could be a problem.
I think to avoid that we need to acquire an Exclusive lock on the relation, as we do in Rename Policy kind of operations.

> There are ways that a relation could become
> parallel-unsafe without changing the parallel-safety marking of any
> function, but perhaps all of the other ways involve a relcache
> invalidation?
>

Probably, but I guess we can investigate/test those cases as well once we find/agree on the solution for the functions stuff.
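For reference, by "registering the invalidation" I mean roughly the following (a sketch only; it ignores the locking and race-condition issues under discussion, includes are omitted, and get_trigger_rel() is a hypothetical helper that maps a pg_trigger OID to its owning relation):

static void
invalidate_rels_using_function(Oid funcoid)
{
    Relation    depRel;
    ScanKeyData key[2];
    SysScanDesc scan;
    HeapTuple   tup;

    /* scan pg_depend for objects depending on this function */
    depRel = table_open(DependRelationId, AccessShareLock);
    ScanKeyInit(&key[0],
                Anum_pg_depend_refclassid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(ProcedureRelationId));
    ScanKeyInit(&key[1],
                Anum_pg_depend_refobjid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(funcoid));
    scan = systable_beginscan(depRel, DependReferenceIndexId, true,
                              NULL, 2, key);
    while (HeapTupleIsValid(tup = systable_getnext(scan)))
    {
        Form_pg_depend dep = (Form_pg_depend) GETSTRUCT(tup);

        /* e.g., the function is used by a trigger on some relation */
        if (dep->classid == TriggerRelationId)
            CacheInvalidateRelcacheByRelid(get_trigger_rel(dep->objid));
    }
    systable_endscan(scan);
    table_close(depRel, AccessShareLock);
}

[1] - https://www.postgresql.org/message-id/1030301.1616560249%40sss.pgh.pa.us

--
With Regards,
Amit Kapila.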
Amit Kapila <amit.kapila16@gmail.com> writes: > Why do you think we don't need to check index AM functions? Primarily because index AMs and opclasses can only be defined by superusers, and the superuser is assumed to know what she's doing. More generally, we've never made any provisions for the properties of index AMs or opclasses to change on-the-fly. I doubt that doing so could possibly be justified on a cost-benefit basis. regards, tom lane
On Mon, Jun 14, 2021 at 2:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > Why do you think we don't need to check index AM functions? Say we > have an index expression that uses function and if its parallel safety > is changed then probably that also impacts whether we can do insert in > parallel. Because otherwise, we will end up executing some parallel > unsafe function in parallel mode during index insertion. I'm not saying that we don't need to check index expressions. I agree that we need to check those. The index AM functions are things like btint4cmp(). I don't think that a function like that should ever be parallel-unsafe. > Yeah, this could be one idea but I think even if we use pg_proc OID, > we still need to check all the rel cache entries to find which one > contains the invalidated OID and that could be expensive. I wonder > can't we directly find the relation involved and register invalidation > for the same? We are able to find the relation to which trigger > function is associated during drop function via findDependentObjects > by scanning pg_depend. Assuming, we are able to find the relation for > trigger function by scanning pg_depend, what kinds of problems do we > envision in registering the invalidation for the same? I don't think that finding the relation involved and registering an invalidation for the same will work properly. Suppose there is a concurrently-running transaction which has created a new table and attached a trigger function to it. You can't see any of the catalog entries for that relation yet, so you can't invalidate it, but invalidation needs to happen. Even if you used some snapshot that can see those catalog entries before they are committed, I doubt it fixes the race condition. You can't hold any lock on that relation, because the creating transaction holds AccessExclusiveLock, but the whole invalidation mechanism is built around the assumption that the sender puts messages into the shared queue first and then releases locks, while the receiver first acquires a conflicting lock and then processes messages from the queue. Without locks, that synchronization algorithm can't work reliably. As a consequence of all that, I believe that, not just in this particular case but in general, the invalidation message needs to describe the thing that has actually changed, rather than any derived property. We can make invalidations that say "some function's parallel-safety flag has changed" or "this particular function's parallel-safety flag has changed" or "this particular function has changed in some way" (this one, we have already), but anything that involves trying to figure out what the consequences of such a change might be and saying "hey, you, please update XYZ because I changed something somewhere that could affect that" is not going to be correct. > I think we probably need to worry about the additional cost to find > dependent objects and if there are any race conditions in doing so as > pointed out by Tom in his email [1]. The concern related to cost could > be addressed by your idea of registering such an invalidation only > when the user changes the parallel safety of the function which we > don't expect to be a frequent operation. 
> Now, the race condition I could think of here is that by the time we
> change the parallel-safety of a function (say making it unsafe), some
> other sessions might have already started processing an insert on a
> relation with which that function is associated via a trigger or check
> constraint, in which case there could be a problem. I think to avoid
> that we need to acquire an Exclusive lock on the relation, as we do in
> Rename Policy kind of operations.

Well, the big issue here is that we don't actually lock functions while they are in use. So there's absolutely nothing that prevents a function from being altered in any arbitrary way, or even dropped, while code that uses it is running. I don't really know what happens in practice if you do that sort of thing: can you get the same query to run with one function definition for the first part of execution and some other definition for the rest of execution? I tend to doubt it, because I suspect we cache the function definition at some point. If that's the case, caching the parallel-safety marking at the same point seems OK too, or at least no worse than what we're doing already. But on the other hand if it is possible for a query's notion of the function definition to shift while the query is in flight, then this is just another example of that and no worse than any other. Instead of changing the parallel-safety flag, somebody could redefine the function so that it divides by zero or produces a syntax error and kaboom, running queries break. Either way, I don't see what the big deal is. As long as we make the handling of parallel-safety consistent with other ways the function could be concurrently redefined, it won't suck any more than the current system already does, or in any fundamentally new ways.

Even if this line of thinking is correct, there's a big issue for partitioning hierarchies because there you need to know stuff about relations that you don't have any other reason to open. I'm just arguing that if there's no partitioning, the problem seems reasonably solvable. Either you changed something about the relation, in which case you've got to lock it and issue invalidations, or you've changed something about the function, which could be handled via a new type of invalidation. I don't really see why the cost would be particularly bad. Suppose that for every relation, you have a flag which is either PARALLEL_DML_SAFE, PARALLEL_DML_RESTRICTED, PARALLEL_DML_UNSAFE, or PARALLEL_DML_SAFETY_UNKNOWN. When someone sends a message saying "some existing function's parallel-safety changed!" you reset that flag for every relation in the relcache to PARALLEL_DML_SAFETY_UNKNOWN. Then if somebody does DML on that relation and we want to consider parallelism, it's got to recompute that flag. None of that sounds horribly expensive.

I mean, it could be somewhat annoying if you have 100k relations open and sit around all day flipping parallel-safety markings on and off and then doing a single-row insert after each flip, but if that's the only scenario where we incur significant extra overhead from this kind of design, it seems clearly better than forcing users to set a flag manually. Maybe it isn't, but I don't really see what the other problem would be right now. Except, of course, for partitioning, which I'm not quite sure what to do about.
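In code terms, I'm imagining something like this (hypothetical names; nothing below exists today):

typedef enum ParallelDmlSafety
{
    PARALLEL_DML_SAFETY_UNKNOWN = 0,    /* not computed yet / invalidated */
    PARALLEL_DML_SAFE,
    PARALLEL_DML_RESTRICTED,
    PARALLEL_DML_UNSAFE
} ParallelDmlSafety;

/* new field in RelationData: ParallelDmlSafety rd_paralleldml; */

static ParallelDmlSafety
relation_parallel_dml_safety(Relation rel)
{
    if (rel->rd_paralleldml == PARALLEL_DML_SAFETY_UNKNOWN)
    {
        /*
         * Walk the relation's triggers, constraints, index expressions,
         * and column defaults, combining the proparallel markings of
         * every function found.  Expensive, but only redone after an
         * invalidation.
         */
        rel->rd_paralleldml = compute_parallel_dml_safety(rel); /* hypothetical */
    }
    return rel->rd_paralleldml;
}

--
Robert Haas
EDB: http://www.enterprisedb.com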
On Mon, Jun 14, 2021 at 9:08 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Jun 14, 2021 at 2:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Why do you think we don't need to check index AM functions? Say we
> > have an index expression that uses a function, and its parallel safety
> > is changed; that probably also impacts whether we can do an insert in
> > parallel, because otherwise we will end up executing some
> > parallel-unsafe function in parallel mode during index insertion.
>
> I'm not saying that we don't need to check index expressions. I agree
> that we need to check those. The index AM functions are things like
> btint4cmp(). I don't think that a function like that should ever be
> parallel-unsafe.
>

Okay, but if we go with your suggested model where, whenever there is a change in the parallel-safety of any function, we need to send the new invalidation, then I think it won't matter whether the function is associated with an index expression, a check constraint in the table, or is used in any other way.

> Even if this line of thinking is correct, there's a big issue for
> partitioning hierarchies because there you need to know stuff about
> relations that you don't have any other reason to open. I'm just
> arguing that if there's no partitioning, the problem seems reasonably
> solvable. Either you changed something about the relation, in which
> case you've got to lock it and issue invalidations, or you've changed
> something about the function, which could be handled via a new type of
> invalidation. I don't really see why the cost would be particularly
> bad. Suppose that for every relation, you have a flag which is either
> PARALLEL_DML_SAFE, PARALLEL_DML_RESTRICTED, PARALLEL_DML_UNSAFE, or
> PARALLEL_DML_SAFETY_UNKNOWN. When someone sends a message saying "some
> existing function's parallel-safety changed!" you reset that flag for
> every relation in the relcache to PARALLEL_DML_SAFETY_UNKNOWN. Then if
> somebody does DML on that relation and we want to consider
> parallelism, it's got to recompute that flag. None of that sounds
> horribly expensive.
>

Sounds reasonable. I will think more on this and see if anything else comes to mind apart from what you have mentioned.

> I mean, it could be somewhat annoying if you have 100k relations open
> and sit around all day flipping parallel-safety markings on and off
> and then doing a single-row insert after each flip, but if that's the
> only scenario where we incur significant extra overhead from this kind
> of design, it seems clearly better than forcing users to set a flag
> manually. Maybe it isn't, but I don't really see what the other
> problem would be right now. Except, of course, for partitioning, which
> I'm not quite sure what to do about.
>

Yeah, dealing with partitioned tables is tricky. I think if we don't want to check upfront the parallel safety of all the partitions, then the other option, as discussed, could be to ask the user to specify the parallel safety of partitioned tables. We can additionally check the parallel safety of partitions when we are trying to insert into a particular partition, and error out if we detect any parallel-unsafe clause while in parallel mode. So, in this case, we won't be completely relying on the user. Users can either change the parallel-safe option of the table or remove/change the parallel-unsafe clause after the error.
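In code terms, that additional runtime check could be as simple as the following, at the point where a tuple is routed into a partition (reusing the hypothetical relation_parallel_dml_safety() from your sketch; partRel stands for the already-opened partition):

    if (IsInParallelMode() &&
        relation_parallel_dml_safety(partRel) == PARALLEL_DML_UNSAFE)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                 errmsg("cannot insert into relation \"%s\" in parallel mode",
                        RelationGetRelationName(partRel))));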
The new invalidation message as we are discussing would invalidate the parallel-safety for individual partitions, but not for the root partition (the partitioned table itself). For the root partition, we will rely on the information specified by the user.

I am not sure if we have a simple way to check the parallel safety of partitioned tables. In some way, we need to rely on the user, either (a) by providing an option to specify whether parallel Inserts (and/or other DMLs) can be performed, or (b) by providing a GUC and/or rel option which indicates that we can check the parallel-safety of all the partitions. Yet another option, which I don't like, could be to parallelize inserts only on non-partitioned tables.

--
With Regards,
Amit Kapila.
On Tue, Jun 15, 2021 at 7:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > Okay, but I think if we go with your suggested model where whenever > there is a change in parallel-safety of any function, we need to send > the new invalidation then I think it won't matter whether the function > is associated with index expression, check constraint in the table, or > is used in any other way. No, it will still matter, because I'm proposing that the parallel-safety of functions that we only access via operator classes will not even be checked. Also, if we decided to make the system more fine-grained - e.g. by invalidating on the specific OID of the function that was changed rather than doing something that is database-wide or global - then it matters even more. > Yeah, dealing with partitioned tables is tricky. I think if we don't > want to check upfront the parallel safety of all the partitions then > the other option as discussed could be to ask the user to specify the > parallel safety of partitioned tables. Just to be clear here, I don't think it really matters what we *want* to do. I don't think it's reasonably *possible* to check all the partitions, because we don't hold locks on them. When we're assessing a bunch of stuff related to an individual relation, we have a lock on it. I think - though we should double-check tablecmds.c - that this is enough to prevent all of the dependent objects - triggers, constraints, etc. - from changing. So the stuff we care about is stable. But the situation with a partitioned table is different. In that case, we can't even examine that stuff without locking all the partitions. And even if we do lock all the partitions, the stuff could change immediately afterward and we wouldn't know. So I think it would be difficult to make it correct. Now, maybe it could be done, and I think that's worth a little more thought. For example, perhaps whenever we invalidate a relation, we could also somehow send some new, special kind of invalidation for its parent saying, essentially, "hey, one of your children has changed in a way you might care about." But that's not good enough, because it only goes up one level. The grandparent would still be unaware that a change it potentially cares about has occurred someplace down in the partitioning hierarchy. That seems hard to patch up, again because of the locking rules. The child can know the OID of its parent without locking the parent, but it can't know the OID of its grandparent without locking its parent. Walking up the whole partitioning hierarchy might be an issue for a number of reasons, including possible deadlocks, and possible race conditions where we don't emit all of the right invalidations in the face of concurrent changes. So I don't quite see a way around this part of the problem, but I may well be missing something. In fact I hope I am missing something, because solving this problem would be really nice. > We can additionally check the > parallel safety of partitions when we are trying to insert into a > particular partition and error out if we detect any parallel-unsafe > clause and we are in parallel-mode. So, in this case, we won't be > completely relying on the users. Users can either change the parallel > safe option of the table or remove/change the parallel-unsafe clause > after error. The new invalidation message as we are discussing would > invalidate the parallel-safety for individual partitions but not the > root partition (partitioned table itself). 
> For the root partition, we will rely on the information specified by
> the user.

Yeah, that may be the best we can do. Just to be clear, I think we would want to check whether the relation is still parallel-safe at the start of the operation, but not have a run-time check at each function call.

> I am not sure if we have a simple way to check the parallel safety of
> partitioned tables. In some way, we need to rely on the user, either
> (a) by providing an option to specify whether parallel Inserts (and/or
> other DMLs) can be performed, or (b) by providing a GUC and/or rel
> option which indicates that we can check the parallel-safety of all
> the partitions. Yet another option, which I don't like, could be to
> parallelize inserts only on non-partitioned tables.

If we figure out a way to check the partitions automatically that actually works, we don't need a switch for it; we can (and should) just do it that way all the time. But if we can't come up with a correct algorithm for that, then we'll need to add some kind of option where the user declares whether it's OK.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 14, 2021 at 9:08 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Jun 14, 2021 at 2:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Yeah, this could be one idea, but I think even if we use the pg_proc
> > OID, we still need to check all the relcache entries to find which
> > ones contain the invalidated OID, and that could be expensive. I
> > wonder whether we can't directly find the relation involved and
> > register an invalidation for it? We are able to find the relation
> > with which a trigger function is associated during drop function via
> > findDependentObjects, by scanning pg_depend. Assuming we are able to
> > find the relation for the trigger function by scanning pg_depend,
> > what kinds of problems do we envision in registering the invalidation
> > for it?
>
> I don't think that finding the relation involved and registering an
> invalidation for the same will work properly. Suppose there is a
> concurrently-running transaction which has created a new table and
> attached a trigger function to it. You can't see any of the catalog
> entries for that relation yet, so you can't invalidate it, but
> invalidation needs to happen. Even if you used some snapshot that can
> see those catalog entries before they are committed, I doubt it fixes
> the race condition. You can't hold any lock on that relation, because
> the creating transaction holds AccessExclusiveLock, but the whole
> invalidation mechanism is built around the assumption that the sender
> puts messages into the shared queue first and then releases locks,
> while the receiver first acquires a conflicting lock and then
> processes messages from the queue.
>

Won't such messages be processed at start transaction time (AtStart_Cache->AcceptInvalidationMessages)?

> Without locks, that synchronization
> algorithm can't work reliably. As a consequence of all that, I believe
> that, not just in this particular case but in general, the
> invalidation message needs to describe the thing that has actually
> changed, rather than any derived property. We can make invalidations
> that say "some function's parallel-safety flag has changed" or "this
> particular function's parallel-safety flag has changed" or "this
> particular function has changed in some way" (this one, we have
> already), but anything that involves trying to figure out what the
> consequences of such a change might be and saying "hey, you, please
> update XYZ because I changed something somewhere that could affect
> that" is not going to be correct.
>
> > I think we probably need to worry about the additional cost to find
> > dependent objects, and about whether there are any race conditions in
> > doing so, as pointed out by Tom in his email [1]. The concern related
> > to cost could be addressed by your idea of registering such an
> > invalidation only when the user changes the parallel safety of the
> > function, which we don't expect to be a frequent operation. Now, the
> > race condition I could think of here is that by the time we change
> > the parallel-safety of a function (say making it unsafe), some other
> > sessions might have already started processing an insert on a
> > relation with which that function is associated via a trigger or
> > check constraint, in which case there could be a problem. I think to
> > avoid that we need to acquire an Exclusive lock on the relation, as
> > we do in Rename Policy kind of operations.
>
> Well, the big issue here is that we don't actually lock functions
> while they are in use.
> So there's absolutely nothing that prevents a > function from being altered in any arbitrary way, or even dropped, > while code that uses it is running. I don't really know what happens > in practice if you do that sort of thing: can you get the same query > to run with one function definition for the first part of execution > and some other definition for the rest of execution? I tend to doubt > it, because I suspect we cache the function definition at some point. > It is possible that in the same statement execution a different function definition can be executed. Say, in session-1 we are inserting three rows; on first-row execution, definition-1 of a function in an index expression gets executed. Now, from session-2, we change the definition of the function to definition-2. Now, in session-1, on second-row insertion, while executing definition-1 of the function, we insert into another table, which will accept the invalidation message registered in session-2. Now, on third-row insertion, the new definition (definition-2) of the function will be executed. > If that's the case, caching the parallel-safety marking at the same > point seems OK too, or at least no worse than what we're doing > already. But on the other hand if it is possible for a query's notion > of the function definition to shift while the query is in flight, then > this is just another example of that and no worse than any other. > Instead of changing the parallel-safety flag, somebody could redefine > the function so that it divides by zero or produces a syntax error and > kaboom, running queries break. Either way, I don't see what the big > deal is. As long as we make the handling of parallel-safety consistent > with other ways the function could be concurrently redefined, it won't > suck any more than the current system already does, or in any > fundamentally new ways. > Okay, so, in this scheme, we have allowed changing the function definition during statement execution but even though the rel's parallel-safe property gets modified (say to parallel-unsafe), we will still proceed in parallel-mode as if it's not changed. I guess this may not be a big deal as we can anyway allow breaking the running statement by changing its definition and users may be okay if the parallel statement errors out or behaves in an unpredictable way in such corner cases. -- With Regards, Amit Kapila.
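To make the three-row scenario above concrete, here is a minimal sketch (all object names are hypothetical; note that a pure function like this one would normally keep using its cached definition for the whole statement — in Amit's scenario the function additionally writes to another table, which is what gives the session a chance to accept the invalidation mid-statement):

    -- session-1: a function used in an index expression
    CREATE FUNCTION f(a int) RETURNS int
        AS $$ SELECT a + 1 $$ LANGUAGE sql IMMUTABLE PARALLEL SAFE;
    CREATE TABLE t (a int);
    CREATE INDEX t_f_idx ON t (f(a));
    -- a multi-row insert evaluates f() once per row
    INSERT INTO t VALUES (1), (2), (3);

    -- session-2, concurrently: redefine the function while the insert runs
    CREATE OR REPLACE FUNCTION f(a int) RETURNS int
        AS $$ SELECT a + 2 $$ LANGUAGE sql IMMUTABLE PARALLEL SAFE;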
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jun 15, 2021 at 7:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Yeah, dealing with partitioned tables is tricky. I think if we don't > > want to check upfront the parallel safety of all the partitions then > > the other option as discussed could be to ask the user to specify the > > parallel safety of partitioned tables. > > Just to be clear here, I don't think it really matters what we *want* to do. I don't > think it's reasonably *possible* to check all the partitions, because we don't > hold locks on them. When we're assessing a bunch of stuff related to an > individual relation, we have a lock on it. I think - though we should > double-check tablecmds.c - that this is enough to prevent all of the dependent > objects - triggers, constraints, etc. - from changing. So the stuff we care about > is stable. But the situation with a partitioned table is different. In that case, we > can't even examine that stuff without locking all the partitions. And even if we > do lock all the partitions, the stuff could change immediately afterward and we > wouldn't know. So I think it would be difficult to make it correct. > > Now, maybe it could be done, and I think that's worth a little more thought. For > example, perhaps whenever we invalidate a relation, we could also somehow > send some new, special kind of invalidation for its parent saying, essentially, > "hey, one of your children has changed in a way you might care about." But > that's not good enough, because it only goes up one level. The grandparent > would still be unaware that a change it potentially cares about has occurred > someplace down in the partitioning hierarchy. That seems hard to patch up, > again because of the locking rules. The child can know the OID of its parent > without locking the parent, but it can't know the OID of its grandparent without > locking its parent. Walking up the whole partitioning hierarchy might be an > issue for a number of reasons, including possible deadlocks, and possible race > conditions where we don't emit all of the right invalidations in the face of > concurrent changes. So I don't quite see a way around this part of the problem, > but I may well be missing something. In fact I hope I am missing something, > because solving this problem would be really nice. I think the check of partitions could be even more complicated if we need to check the parallel safety of partition key expressions. If a user directly inserts into a partition, then we need to invoke ExecPartitionCheck, which will execute all of its parent's and grandparent's partition key expressions. It means if we change a parent table's partition key expression (by 1) changing a function in the expression or 2) attaching the parent table as a partition of another parent table), then we need to invalidate all its children's relcache entries. BTW, currently, if a user attaches a partitioned table 'A' as a partition of another partitioned table 'B', the children of 'A' will not be invalidated. Best regards, houzj
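As a concrete illustration of that point (a sketch; all names are hypothetical), a direct insert into a leaf still evaluates the partition constraints derived from every ancestor's partition key expression:

    CREATE FUNCTION part_key(v int) RETURNS int
        AS $$ SELECT v % 10 $$ LANGUAGE sql IMMUTABLE PARALLEL SAFE;
    CREATE TABLE gp (a int) PARTITION BY RANGE (part_key(a));
    CREATE TABLE p PARTITION OF gp FOR VALUES FROM (0) TO (5)
        PARTITION BY RANGE (a);
    CREATE TABLE leaf PARTITION OF p FOR VALUES FROM (0) TO (100);
    -- ExecPartitionCheck on this direct insert runs part_key(42) as part of
    -- the constraint inherited from gp, in addition to p's range check on a
    INSERT INTO leaf VALUES (42);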
On Tue, Jun 15, 2021 at 7:31 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Jun 15, 2021 at 7:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Okay, but I think if we go with your suggested model where whenever > > there is a change in parallel-safety of any function, we need to send > > the new invalidation then I think it won't matter whether the function > > is associated with index expression, check constraint in the table, or > > is used in any other way. > > No, it will still matter, because I'm proposing that the > parallel-safety of functions that we only access via operator classes > will not even be checked. > I am not very clear on what exactly you have in your mind in this regard. I understand that while computing parallel-safety for a rel we don't need to consider functions that we only access via operator class but how do we distinguish such functions during Alter Function? Is there a simple way to deduce that this is an operator class function so that we don't register invalidation for it? Shall we check it via pg_depend? > > > We can additionally check the > > parallel safety of partitions when we are trying to insert into a > > particular partition and error out if we detect any parallel-unsafe > > clause and we are in parallel-mode. So, in this case, we won't be > > completely relying on the users. Users can either change the parallel > > safe option of the table or remove/change the parallel-unsafe clause > > after error. The new invalidation message as we are discussing would > > invalidate the parallel-safety for individual partitions but not the > > root partition (partitioned table itself). For root partition, we will > > rely on information specified by the user. > > Yeah, that may be the best we can do. Just to be clear, I think we > would want to check whether the relation is still parallel-safe at the > start of the operation, but not have a run-time check at each function > call. > Agreed, that is what I also had in mind. > > I am not sure if we have a simple way to check the parallel safety of > > partitioned tables. In some way, we need to rely on user either (a) by > > providing an option to specify whether parallel Inserts (and/or other > > DMLs) can be performed, or (b) by providing a guc and/or rel option > > which indicate that we can check the parallel-safety of all the > > partitions. Yet another option that I don't like could be to > > parallelize inserts on non-partitioned tables. > > If we figure out a way to check the partitions automatically that > actually works, we don't need a switch for it; we can (and should) > just do it that way all the time. But if we can't come up with a > correct algorithm for that, then we'll need to add some kind of option > where the user declares whether it's OK. > Yeah, so let us think for some more time and see if we can come up with something better for partitions, otherwise, we can sort out things further in this direction. -- With Regards, Amit Kapila.
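One cheap form of that check, sketched at the SQL level (the real check would presumably be done in C during ALTER FUNCTION), is simply to ask whether the function appears as an operator class/family support procedure at all:

    -- true if the function being altered is an opclass support procedure
    -- ('btint4cmp' stands in for the function in question)
    SELECT EXISTS (
        SELECT 1
          FROM pg_amproc
         WHERE amproc = 'btint4cmp'::regproc
    );

For user-defined operator classes the same information is also recorded in pg_depend (entries with classid = 'pg_amproc'::regclass referencing the function), which matches the pg_depend idea above.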
On Tue, Jun 15, 2021 at 8:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jun 14, 2021 at 9:08 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Mon, Jun 14, 2021 at 2:32 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Yeah, this could be one idea but I think even if we use pg_proc OID, > > > we still need to check all the rel cache entries to find which one > > > contains the invalidated OID and that could be expensive. I wonder > > > can't we directly find the relation involved and register invalidation > > > for the same? We are able to find the relation to which trigger > > > function is associated during drop function via findDependentObjects > > > by scanning pg_depend. Assuming, we are able to find the relation for > > > trigger function by scanning pg_depend, what kinds of problems do we > > > envision in registering the invalidation for the same? > > > > I don't think that finding the relation involved and registering an > > invalidation for the same will work properly. Suppose there is a > > concurrently-running transaction which has created a new table and > > attached a trigger function to it. You can't see any of the catalog > > entries for that relation yet, so you can't invalidate it, but > > invalidation needs to happen. Even if you used some snapshot that can > > see those catalog entries before they are committed, I doubt it fixes > > the race condition. You can't hold any lock on that relation, because > > the creating transaction holds AccessExclusiveLock, but the whole > > invalidation mechanism is built around the assumption that the sender > > puts messages into the shared queue first and then releases locks, > > while the receiver first acquires a conflicting lock and then > > processes messages from the queue. > > > > Won't such messages be processed at transaction start time > (AtStart_Cache->AcceptInvalidationMessages)? > Even if we accept the invalidation at transaction start time, we need to accept and execute it after taking a lock on the relation to ensure that the relation doesn't change afterward. I think what I mentioned didn't break this assumption because after finding a relation we will take a lock on it before registering the invalidation, so in the above scenario, it should wait before registering the invalidation. -- With Regards, Amit Kapila.
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Jun 15, 2021 at 7:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Okay, but I think if we go with your suggested model where whenever > > there is a change in parallel-safety of any function, we need to send > > the new invalidation then I think it won't matter whether the function > > is associated with index expression, check constraint in the table, or > > is used in any other way. > > No, it will still matter, because I'm proposing that the parallel-safety of > functions that we only access via operator classes will not even be checked. > Also, if we decided to make the system more fine-grained - e.g. by invalidating > on the specific OID of the function that was changed rather than doing > something that is database-wide or global - then it matters even more. > > > Yeah, dealing with partitioned tables is tricky. I think if we don't > > want to check upfront the parallel safety of all the partitions then > > the other option as discussed could be to ask the user to specify the > > parallel safety of partitioned tables. > > Just to be clear here, I don't think it really matters what we *want* to do. I don't > think it's reasonably *possible* to check all the partitions, because we don't > hold locks on them. When we're assessing a bunch of stuff related to an > individual relation, we have a lock on it. I think - though we should > double-check tablecmds.c - that this is enough to prevent all of the dependent > objects - triggers, constraints, etc. - from changing. So the stuff we care about > is stable. But the situation with a partitioned table is different. In that case, we > can't even examine that stuff without locking all the partitions. And even if we > do lock all the partitions, the stuff could change immediately afterward and we > wouldn't know. So I think it would be difficult to > > make it correct. > > Now, maybe it could be done, and I think that's worth a little more thought. For > example, perhaps whenever we invalidate a relation, we could also somehow > send some new, special kind of invalidation for its parent saying, essentially, > "hey, one of your children has changed in a way you might care about." But > that's not good enough, because it only goes up one level. The grandparent > would still be unaware that a change it potentially cares about has occurred > someplace down in the partitioning hierarchy. That seems hard to patch up, > again because of the locking rules. The child can know the OID of its parent > without locking the parent, but it can't know the OID of its grandparent without > locking its parent. Walking up the whole partitioning hierarchy might be an > issue for a number of reasons, including possible deadlocks, and possible race > conditions where we don't emit all of the right invalidations in the face of > concurrent changes. So I don't quite see a way around this part of the problem, > but I may well be missing something. In fact I hope I am missing something, > because solving this problem would be really nice. For partitions, I think postgres already has the logic for recursively finding the parent table [1]. Can we copy that logic to send several invalidation messages that invalidate the parent's and grandparent's... relcache entries if a partition's parallel safety changes? Although it means we need more locks (on its parents) when the parallel safety changes, it seems that's not a frequent scenario and looks acceptable.
[1] In generate_partition_qual():

    parentrelid = get_partition_parent(RelationGetRelid(rel), true);
    parent = relation_open(parentrelid, AccessShareLock);
    ...
    /* Add the parent's quals to the list (if any) */
    if (parent->rd_rel->relispartition)
        result = list_concat(generate_partition_qual(parent), my_qual);

Besides, I have a possibly crazy idea: maybe it's not necessary to invalidate the relcache when the parallel safety of a function is changed. I took a look at how postgres currently behaves, and found that even if a user changes a function (CREATE OR REPLACE/ALTER FUNCTION) which is used in other objects (like a constraint, index expression, or partition key expression), the data in the relation won't be rechecked. And as the docs say [2], it is *not recommended* to change a function which is already used in some other objects. The recommended way to handle such a change is to drop the object, adjust the function definition, and re-add the object. Maybe we only need to care about a parallel safety change when creating or dropping an object (constraint, index, partition, or trigger). And we can check the parallel safety when inserting into a particular table; if we find functions not allowed in parallel mode, which means someone changed a function's parallel safety, then we can invalidate the relcache and error out.

[2] https://www.postgresql.org/docs/14/ddl-constraints.html

Best regards, houzj
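For reference, the recommended sequence from [2] looks like this (a sketch; the table, constraint, and function names are hypothetical):

    -- instead of redefining chk_fn() in place while a constraint uses it:
    ALTER TABLE t DROP CONSTRAINT t_positive;
    CREATE OR REPLACE FUNCTION chk_fn(v int) RETURNS boolean
        AS $$ SELECT v > 0 $$ LANGUAGE sql IMMUTABLE;
    -- re-adding the CHECK re-validates the existing rows, which an
    -- in-place redefinition of chk_fn() would not do
    ALTER TABLE t ADD CONSTRAINT t_positive CHECK (chk_fn(a));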
On Tue, Jun 15, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I don't think that finding the relation involved and registering an > > invalidation for the same will work properly. Suppose there is a > > concurrently-running transaction which has created a new table and > > attached a trigger function to it. You can't see any of the catalog > > entries for that relation yet, so you can't invalidate it, but > > invalidation needs to happen. Even if you used some snapshot that can > > see those catalog entries before they are committed, I doubt it fixes > > the race condition. You can't hold any lock on that relation, because > > the creating transaction holds AccessExclusiveLock, but the whole > > invalidation mechanism is built around the assumption that the sender > > puts messages into the shared queue first and then releases locks, > > while the receiver first acquires a conflicting lock and then > > processes messages from the queue. > > Won't such messages be processed at transaction start time > (AtStart_Cache->AcceptInvalidationMessages)? Only if they show up in the queue before that. But there's nothing forcing that to happen. You don't seem to understand how important heavyweight locking is to the whole shared invalidation message system.... > Okay, so, in this scheme, we have allowed changing the function > definition during statement execution but even though the rel's > parallel-safe property gets modified (say to parallel-unsafe), we will > still proceed in parallel-mode as if it's not changed. I guess this > may not be a big deal as we can anyway allow breaking the running > statement by changing its definition and users may be okay if the > parallel statement errors out or behaves in an unpredictable way in > such corner cases. Yeah, I mean, it's no different than leaving the parallel-safety marking exactly as it was and changing the body of the function to call some other function marked parallel-unsafe. I don't think we've gotten any complaints about that, because I don't think it would normally have any really bad consequences; most likely you'd just get an error saying that something-or-other isn't allowed in parallel mode. If it does have bad consequences, then I guess we'll have to fix it when we find out about it, but in the meantime there's no reason to hold the parallel-safety flag to a stricter standard than the function body. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jun 16, 2021 at 9:22 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Jun 15, 2021 at 10:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > I don't think that finding the relation involved and registering an > > > invalidation for the same will work properly. Suppose there is a > > > concurrently-running transaction which has created a new table and > > > attached a trigger function to it. You can't see any of the catalog > > > entries for that relation yet, so you can't invalidate it, but > > > invalidation needs to happen. Even if you used some snapshot that can > > > see those catalog entries before they are committed, I doubt it fixes > > > the race condition. You can't hold any lock on that relation, because > > > the creating transaction holds AccessExclusiveLock, but the whole > > > invalidation mechanism is built around the assumption that the sender > > > puts messages into the shared queue first and then releases locks, > > > while the receiver first acquires a conflicting lock and then > > > processes messages from the queue. > > > > Won't such messages be processed at transaction start time > > (AtStart_Cache->AcceptInvalidationMessages)? > > Only if they show up in the queue before that. But there's nothing > forcing that to happen. You don't seem to understand how important > heavyweight locking is to the whole shared invalidation message > system.... > I have responded about the heavy-weight locking stuff in my next email [1], and why I think the approach I mentioned will work. I don't deny that I might be missing something here. [1] - https://www.postgresql.org/message-id/CAA4eK1%2BT2CWqp40YqYttDA1Skk7wK6yDrkCD5GZ80QGr5ze-6g%40mail.gmail.com -- With Regards, Amit Kapila.
On Thu, Jun 17, 2021 at 4:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > I have responded about heavy-weight locking stuff in my next email [1] > and why I think the approach I mentioned will work. I don't deny that > I might be missing something here. > > [1] - https://www.postgresql.org/message-id/CAA4eK1%2BT2CWqp40YqYttDA1Skk7wK6yDrkCD5GZ80QGr5ze-6g%40mail.gmail.com I mean I saw that but I don't see how it addresses the visibility issue. There could be a relation that is not visible to your snapshot and upon which AccessExclusiveLock is held which needs to be invalidated. You can't lock it because it's AccessExclusiveLock'd already. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Jun 17, 2021 at 8:29 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Jun 17, 2021 at 4:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I have responded about heavy-weight locking stuff in my next email [1] > > and why I think the approach I mentioned will work. I don't deny that > > I might be missing something here. > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1%2BT2CWqp40YqYttDA1Skk7wK6yDrkCD5GZ80QGr5ze-6g%40mail.gmail.com > > I mean I saw that but I don't see how it addresses the visibility > issue. > I thought if we scan a system catalog using DirtySnapshot, then we should be able to find such a relation. But, if the system catalog is updated after our scan then surely we won't be able to see it and in that case, we won't be able to send invalidation. Now, say if the rel is not visible to us because of the snapshot we used or due to some race condition, then we won't be able to send the invalidation, but why should we consider that worse than the case where we miss such invalidations (invalidations due to a change of the parallel-safe property) when the insertion into the relation is in progress? > There could be a relation that is not visible to your snapshot > and upon which AccessExclusiveLock is held which needs to be > invalidated. You can't lock it because it's AccessExclusiveLock'd > already. > Yeah, the session in which we are doing Alter Function won't be able to lock it but it will wait for the AccessExclusiveLock on the rel to be released because it will also try to acquire it before sending invalidation. -- With Regards, Amit Kapila.
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, June 16, 2021 11:27 AM houzj.fnst@fujitsu.com wrote: > On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote: > > Just to be clear here, I don't think it really matters what we *want* > > to do. I don't think it's reasonably *possible* to check all the > > partitions, because we don't hold locks on them. When we're assessing > > a bunch of stuff related to an individual relation, we have a lock on > > it. I think - though we should double-check tablecmds.c - that this is > > enough to prevent all of the dependent objects - triggers, > > constraints, etc. - from changing. So the stuff we care about is > > stable. But the situation with a partitioned table is different. In > > that case, we can't even examine that stuff without locking all the > > partitions. And even if we do lock all the partitions, the stuff could change > immediately afterward and we wouldn't know. So I think it would be difficult to > > make it correct. > > Now, maybe it could be done, and I think that's worth a little more > > thought. For example, perhaps whenever we invalidate a relation, we > > could also somehow send some new, special kind of invalidation for its > > parent saying, essentially, "hey, one of your children has changed in > > a way you might care about." But that's not good enough, because it > > only goes up one level. The grandparent would still be unaware that a > > change it potentially cares about has occurred someplace down in the > > partitioning hierarchy. That seems hard to patch up, again because of > > the locking rules. The child can know the OID of its parent without > > locking the parent, but it can't know the OID of its grandparent > > without locking its parent. Walking up the whole partitioning > > hierarchy might be an issue for a number of reasons, including > > possible deadlocks, and possible race conditions where we don't emit > > all of the right invalidations in the face of concurrent changes. So I > > don't quite see a way around this part of the problem, but I may well be > missing something. In fact I hope I am missing something, because solving this > problem would be really nice. > > I think the check of partitions could be even more complicated if we need to > check the parallel safety of partition key expressions. If a user directly inserts into a > partition, then we need to invoke ExecPartitionCheck which will execute all of its > parent's and grandparent's partition key expressions. It means if we change a > parent table's partition key expression (by 1) changing a function in the expression or 2) > attaching the parent table as a partition of another parent table), then we need to > invalidate all its children's relcache entries. > > BTW, currently, if a user attaches a partitioned table 'A' as a partition of another > partitioned table 'B', the children of 'A' will not be invalidated. To be honest, I didn't find a cheap way to invalidate a partitioned table's parallel safety automatically. For one thing, we need to recurse higher in the partition tree to invalidate all the parent tables' relcache entries (and perhaps all their children's relcache entries) not only when a function's parallel safety is altered, but also for DDLs which invalidate a partition's relcache, such as CREATE/DROP INDEX/TRIGGER/CONSTRAINT. That seems too expensive. For another, even if we could invalidate a partitioned table's parallel safety automatically, we would still need to lock all the partitions when checking the table's parallel safety, because a partition's parallel safety could change after the check.
So, IMO, at least for partitioned tables, an explicit flag looks more acceptable. For regular tables, it seems we can work it out automatically, although I am not sure whether anyone thinks that looks a bit inconsistent. Best regards, houzj
On Mon, Jun 21, 2021 at 12:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > I thought if we scan a system catalog using DirtySnapshot, then we > should be able to find such a relation. But, if the system catalog is > updated after our scan then surely we won't be able to see it and in > that case, we won't be able to send invalidation. Now, say if the rel > is not visible to us because of the snapshot we used or due to some > race condition then we won't be able to send the invalidation but why > we want to consider it worse than the case where we miss such > invalidations (invalidations due to change of parallel-safe property) > when the insertion into relation is in-progress. A concurrent change is something quite different than a change that happened some time in the past. We all know that DROP TABLE blocks if it is run while the table is in use, and everybody considers that acceptable, but if DROP TABLE were to block because the table was in use at some previous time, everybody would complain, and rightly so. The same principle applies here. It's not possible to react to a change that happens in the middle of the query. Somebody could argue that we ought to lock all the functions we're using against concurrent changes so that attempts to change their properties block on a lock rather than succeeding. But given that that's not how it works, we can hardly go back in time and switch to a non-parallel plan after we've already decided on a parallel one. On the other hand, we should be able to notice a change that has *already completed* at the time we do planning. I don't see how we can blame failure to do that on anything other than bad coding. > Yeah, the session in which we are doing Alter Function won't be able > to lock it but it will wait for the AccessExclusiveLock on the rel to > be released because it will also try to acquire it before sending > invalidation. I think users would not be very happy with such behavior. Users accept that if they try to access a relation, they might end up needing to wait for a lock on it, but what you are proposing here might make a session block waiting for a lock on a relation which it never attempted to access. I think this whole line of attack is a complete dead-end. We can invent new types of invalidations if we want, but they need to be sent based on which objects actually got changed, not based on what we think might be affected indirectly as a result of those changes. It's reasonable to regard something like a trigger or constraint as a property of the table because it is really a dependent object. It is associated with precisely one table when it is created and the association can never be changed. On the other hand, functions clearly have their own existence. They can be created and dropped independently of any table and the tables with which they are associated can change at any time. In that kind of situation, invalidating the table based on changes to the function is riddled with problems which I am pretty convinced we're never going to be able to solve. I'm not 100% sure what we ought to do here, but I'm pretty sure that looking up the tables that happen to be associated with the function in the session that is modifying the function is not it. -- Robert Haas EDB: http://www.enterprisedb.com
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Monday, June 21, 2021 11:23 PM Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jun 21, 2021 at 12:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Yeah, the session in which we are doing Alter Function won't be able > > to lock it but it will wait for the AccessExclusiveLock on the rel to > > be released because it will also try to acquire it before sending > > invalidation. > > I think users would not be very happy with such behavior. Users accept that if > they try to access a relation, they might end up needing to wait for a lock on it, > but what you are proposing here might make a session block waiting for a lock > on a relation which it never attempted to access. > > I think this whole line of attack is a complete dead-end. We can invent new > types of invalidations if we want, but they need to be sent based on which > objects actually got changed, not based on what we think might be affected > indirectly as a result of those changes. It's reasonable to regard something like > a trigger or constraint as a property of the table because it is really a > dependent object. It is associated with precisely one table when it is created > and the association can never be changed. On the other hand, functions clearly > have their own existence. They can be created and dropped independently of > any table and the tables with which they are associated can change at any time. > In that kind of situation, invalidating the table based on changes to the function > is riddled with problems which I am pretty convinced we're never going to be > able to solve. I'm not 100% sure what we ought to do here, but I'm pretty sure > that looking up the tables that happen to be associated with the function in the > session that is modifying the function is not it. I agree that we should send an invalidation message like "function OID's parallel safety has changed". And when each session accepts this invalidation message, it needs to invalidate the related tables. Based on previous mails, we only want to invalidate the tables that use this function in an index expression/trigger/constraint. The problem is how to get all the related tables. Robert-san suggested caching a list of pg_proc OIDs, which means we need to rebuild the list every time the relcache is invalidated. The cost to do that could be high, especially for extracting pg_proc OIDs from index expressions, because we need to invoke index_open(index, lock) to get the index expression. Or maybe we can let each session use pg_depend to get the related tables and invalidate them after accepting the new type of invalidation message (a sketch of such a lookup is below). Best regards, houzj
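At the SQL level, that per-session pg_depend lookup could look something like this (a sketch only; 12345 stands for the altered function's OID, and real code would use the C dependency-walking APIs instead):

    SELECT i.indrelid::regclass AS rel
      FROM pg_depend d
      JOIN pg_index i ON i.indexrelid = d.objid
     WHERE d.classid = 'pg_class'::regclass
       AND d.refclassid = 'pg_proc'::regclass
       AND d.refobjid = 12345            -- function used in an index expression
    UNION
    SELECT t.tgrelid::regclass
      FROM pg_depend d
      JOIN pg_trigger t ON t.oid = d.objid
     WHERE d.classid = 'pg_trigger'::regclass
       AND d.refclassid = 'pg_proc'::regclass
       AND d.refobjid = 12345            -- function used by a trigger
    UNION
    SELECT c.conrelid::regclass
      FROM pg_depend d
      JOIN pg_constraint c ON c.oid = d.objid
     WHERE d.classid = 'pg_constraint'::regclass
       AND d.refclassid = 'pg_proc'::regclass
       AND d.refobjid = 12345            -- function used in a CHECK constraint
       AND c.conrelid <> 0;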
On Wed, Jun 16, 2021 at 8:57 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > I think the check of partition could be even more complicated if we need to > check the parallel safety of partition key expression. If user directly insert into > a partition, then we need invoke ExecPartitionCheck which will execute all it's > parent's and grandparent's partition key expressions. It means if we change a > parent table's partition key expression(by 1) change function in expr or 2) attach > the parent table as partition of another parent table), then we need to invalidate > all its child's relcache. > I think we already invalidate the child entries when we add/drop constraints on a parent table. See ATAddCheckConstraint, ATExecDropConstraint. If I am not missing anything then this case shouldn't be a problem. Do you have something else in mind? -- With Regards, Amit Kapila.
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Tuesday, June 22, 2021 8:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Wed, Jun 16, 2021 at 8:57 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > I think the check of partitions could be even more complicated if we > > need to check the parallel safety of partition key expressions. If a user > > directly inserts into a partition, then we need to invoke > > ExecPartitionCheck which will execute all of its parent's and > > grandparent's partition key expressions. It means if we change a > > parent table's partition key expression (by 1) changing a function in the expression > > or 2) attaching the parent table as a partition of another parent table), then we > need to invalidate all its children's relcache entries. > > > > I think we already invalidate the child entries when we add/drop constraints on > a parent table. See ATAddCheckConstraint, ATExecDropConstraint. If I am not > missing anything then this case shouldn't be a problem. Do you have > something else in mind? Currently, attaching/detaching a partition doesn't invalidate the child entries recursively, except when detaching a partition concurrently, which will add a constraint to all the children. Do you mean we can add logic to invalidate the child entries recursively when attaching/detaching a partition? Best regards, houzj
On Wed, Jun 23, 2021 at 6:35 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, June 22, 2021 8:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 16, 2021 at 8:57 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > > > I think the check of partitions could be even more complicated if we > > > need to check the parallel safety of partition key expressions. If a user > > > directly inserts into a partition, then we need to invoke > > > ExecPartitionCheck which will execute all of its parent's and > > > grandparent's partition key expressions. It means if we change a > > > parent table's partition key expression (by 1) changing a function in the expression > > > or 2) attaching the parent table as a partition of another parent table), then we > > need to invalidate all its children's relcache entries. > > > > > > > I think we already invalidate the child entries when we add/drop constraints on > > a parent table. See ATAddCheckConstraint, ATExecDropConstraint. If I am not > > missing anything then this case shouldn't be a problem. Do you have > > something else in mind? > > Currently, attaching/detaching a partition doesn't invalidate the child entries > recursively, except when detaching a partition concurrently, which will add a > constraint to all the children. Do you mean we can add logic to > invalidate the child entries recursively when attaching/detaching a partition? > I was talking about adding/dropping CHECK or other constraints on partitioned tables via Alter Table. I think if attach/detach leads to a change in the constraints of child tables, then either it should invalidate the child rels to avoid problems in existing sessions, or, if it is not doing so for a reason, then probably it might not matter. I see that you have started a separate thread [1] to confirm the behavior of attach/detach partition and we might want to decide based on the conclusion of that thread. [1] - https://www.postgresql.org/message-id/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
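If it does matter, enumerating the child entries that would need invalidation is at least straightforward, e.g. with the existing helper ('parent_tbl' is a placeholder):

    SELECT relid, level
      FROM pg_partition_tree('parent_tbl');   -- the parent itself is level 0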
On Wed, Jun 16, 2021 at 6:10 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > Now, maybe it could be done, and I think that's worth a little more thought. For > > example, perhaps whenever we invalidate a relation, we could also somehow > > send some new, special kind of invalidation for its parent saying, essentially, > > "hey, one of your children has changed in a way you might care about." But > > that's not good enough, because it only goes up one level. The grandparent > > would still be unaware that a change it potentially cares about has occurred > > someplace down in the partitioning hierarchy. That seems hard to patch up, > > again because of the locking rules. The child can know the OID of its parent > > without locking the parent, but it can't know the OID of its grandparent without > > locking its parent. Walking up the whole partitioning hierarchy might be an > > issue for a number of reasons, including possible deadlocks, and possible race > > conditions where we don't emit all of the right invalidations in the face of > > concurrent changes. So I don't quite see a way around this part of the problem, > > but I may well be missing something. In fact I hope I am missing something, > > because solving this problem would be really nice. > > For partitions, I think postgres already has the logic for recursively finding > the parent table [1]. Can we copy that logic to send several invalidation messages that > invalidate the parent's and grandparent's... relcache entries if a partition's parallel safety changes? > Although it means we need more locks (on its parents) when the parallel safety > changes, it seems that's not a frequent scenario and looks acceptable. > > [1] In generate_partition_qual() > parentrelid = get_partition_parent(RelationGetRelid(rel), true); > parent = relation_open(parentrelid, AccessShareLock); > ... > /* Add the parent's quals to the list (if any) */ > if (parent->rd_rel->relispartition) > result = list_concat(generate_partition_qual(parent), my_qual); > As shown by me in another email [1], such a coding pattern can lead to deadlock. It is because in some DDL operations we walk the partition hierarchy from top to bottom, and if we also walk from bottom to top, that can lead to deadlock. I think this is a dangerous coding pattern and we shouldn't try to replicate it. [1] - https://www.postgresql.org/message-id/CAA4eK1LsFpjK5gL%2B0HEvoqB2DJVOi19vGAWbZBEx8fACOi5%2B_A%40mail.gmail.com -- With Regards, Amit Kapila.
On Wed, Jun 23, 2021 at 8:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jun 16, 2021 at 6:10 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > Now, maybe it could be done, and I think that's worth a little more thought. For
> > > example, perhaps whenever we invalidate a relation, we could also somehow
> > > send some new, special kind of invalidation for its parent saying, essentially,
> > > "hey, one of your children has changed in a way you might care about." But
> > > that's not good enough, because it only goes up one level. The grandparent
> > > would still be unaware that a change it potentially cares about has occurred
> > > someplace down in the partitioning hierarchy. That seems hard to patch up,
> > > again because of the locking rules. The child can know the OID of its parent
> > > without locking the parent, but it can't know the OID of its grandparent without
> > > locking its parent. Walking up the whole partitioning hierarchy might be an
> > > issue for a number of reasons, including possible deadlocks, and possible race
> > > conditions where we don't emit all of the right invalidations in the face of
> > > concurrent changes. So I don't quite see a way around this part of the problem,
> > > but I may well be missing something. In fact I hope I am missing something,
> > > because solving this problem would be really nice.
> >
> > For partitions, I think postgres already has the logic for recursively finding
> > the parent table [1]. Can we copy that logic to send several invalidation messages that
> > invalidate the parent's and grandparent's... relcache entries if a partition's parallel safety changes?
> > Although it means we need more locks (on its parents) when the parallel safety
> > changes, it seems that's not a frequent scenario and looks acceptable.
> >
> > [1] In generate_partition_qual()
> > parentrelid = get_partition_parent(RelationGetRelid(rel), true);
> > parent = relation_open(parentrelid, AccessShareLock);
> > ...
> > /* Add the parent's quals to the list (if any) */
> > if (parent->rd_rel->relispartition)
> > result = list_concat(generate_partition_qual(parent), my_qual);
> >
> As shown by me in another email [1], such a coding pattern can lead to
> deadlock. It is because in some DDL operations we walk the partition
> hierarchy from top to bottom, and if we also walk from bottom to top,
> that can lead to deadlock. I think this is a dangerous coding pattern
> and we shouldn't try to replicate it.
> [1] - https://www.postgresql.org/message-id/CAA4eK1LsFpjK5gL%2B0HEvoqB2DJVOi19vGAWbZBEx8fACOi5%2B_A%40mail.gmail.com
> --
> With Regards,
> Amit Kapila.
Hi,
How about walking the partition hierarchy bottom up, recording the parents but not taking the locks.
Once the top-most parent is found, take the locks in reverse order (top down)?
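At the SQL level, such a walk would look something like this sketch ('leaf_part' is a placeholder; the real walk would read pg_inherits in C without locks, then lock the recorded OIDs starting from the top):

    WITH RECURSIVE ancestors AS (
        SELECT inhparent AS relid, 1 AS dist
          FROM pg_inherits
         WHERE inhrelid = 'leaf_part'::regclass
        UNION ALL
        SELECT i.inhparent, a.dist + 1
          FROM pg_inherits i
          JOIN ancestors a ON i.inhrelid = a.relid
    )
    SELECT relid::regclass AS ancestor
      FROM ancestors
     ORDER BY dist DESC;   -- top-most parent first, i.e. the locking order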
Cheers
On Thu, Jun 24, 2021 at 1:38 PM Zhihong Yu <zyu@yugabyte.com> wrote: > > How about walking the partition hierarchy bottom up, recording the parents but not taking the locks. > Once top-most parent is found, take the locks in reverse order (top down) ? > Is it safe to walk up the partition hierarchy (to record the parents for the eventual locking in reverse order) without taking locks? Regards, Greg Nancarrow Fujitsu Australia
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Thursday, June 24, 2021 11:44 AM Zhihong Yu <zyu@yugabyte.com> wrote: > Hi, > How about walking the partition hierarchy bottom up, recording the parents but not taking the locks. > Once the top-most parent is found, take the locks in reverse order (top down)? IMO, when we directly INSERT INTO a partition, postgres already locks the partition as the target table before execution, which means we cannot postpone locking the partition until we have found the parent tables. Best regards, houzj
On Mon, Jun 21, 2021 at 4:40 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > To be honest, I didn't find a cheap way to invalidate a partitioned table's > parallel safety automatically. > I also don't see the feasibility of doing parallelism checks for partitioned tables, both because it is expensive (traversing/locking all the partitions) and because the invalidations are difficult to handle due to deadlock hazards, as discussed above. Let me try to summarize the discussion so far and see if we can have any better ideas than what we have discussed so far or whether we want to go with one of the ideas discussed till now.

I think we have broadly discussed two approaches: (a) automatically decide whether parallelism can be enabled for inserts, or (b) provide an option to the user to specify whether inserts can be parallelized on a relation.

For the first approach (a), we have evaluated both the partitioned and non-partitioned relation cases. For non-partitioned relations, we can compute the parallel-safety of a relation during planning and save it in the relation cache entry. This is normally safe because we have a lock on the relation and any change to the relation should raise an invalidation which will lead to re-computation of parallel-safety information for the relation. Now, there are cases where the parallel-safety of some trigger function or a function used in an index expression can be changed by the user which won't register an invalidation for a relation. To handle such cases, we can register a new kind of invalidation only when a function's parallel-safety information is changed. And every backend in the same database then needs to re-evaluate the parallel-safety of every relation for which it has cached a value.

For partitioned relations, a similar idea won't work for multiple reasons: (a) We need to traverse and lock all the partitions to compute the parallel-safety of the root relation, which could be very expensive; (b) Whenever we invalidate a particular partition, we need to invalidate its parent hierarchy as well. We can't traverse the parent hierarchy without taking locks on the parent tables, which can lead to deadlock. The alternative could be that for partitioned relations we rely on user-specified information about parallel-safety (like approach (b) in the previous paragraph). We can additionally check the parallel safety of partitions when we are trying to insert into a particular partition and error out if we detect any parallel-unsafe clause while we are in parallel-mode. So, in this case, we won't be completely relying on the users. Users can either change the parallel safe option of the table or remove/change the parallel-unsafe clause after an error.

For the second approach (b), we can provide an option to the user to specify whether inserts (or other DMLs) can be parallelized for a relation. One of the ideas is to provide some options like below to the user:

CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };

This property is recorded in pg_class's relparallel column as 'u', 'r', or 's', just like pg_proc's proparallel. The default is UNSAFE. Additionally, provide a function pg_get_parallel_safety(oid) using which users can determine whether it is safe to enable parallelism.
Surely, after the user has checked with that function, one can still add some unsafe constraints to the table by altering it, but the function will still be an aid to enabling parallelism on a relation. The first approach (a) has an appeal because it would allow us to automatically parallelize inserts in many cases, but it might have some overhead in some cases due to processing of relcache entries after the parallel-safety of a relation is changed. The second approach (b) has an appeal because of its consistent behavior for partitioned and non-partitioned relations. Among the above options, I would personally prefer (b), mainly because of the consistent handling of the partitioned and non-partitioned table cases, but I am fine with approach (a) as well if that is what other people feel is better. Thoughts? -- With Regards, Amit Kapila.
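For reference, the per-function marking that the proposed relparallel column would mirror can already be inspected today ('my_func' is a placeholder):

    SELECT proname, proparallel   -- 's' = safe, 'r' = restricted, 'u' = unsafe
      FROM pg_proc
     WHERE proname = 'my_func';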
On Mon, Jun 28, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Among the above options, I would personally prefer (b), mainly because > of the consistent handling of the partitioned and non-partitioned table cases, > but I am fine with approach (a) as well if that is what other people > feel is better. > > Thoughts? > I personally think "(b) provide an option to the user to specify whether inserts can be parallelized on a relation" is the preferable option. There seem to be too many issues with the alternative of trying to determine the parallel-safety of a partitioned table automatically. I think (b) is the simplest and most consistent approach, working the same way for all table types, and without the overhead of (a). Also, I don't think (b) is difficult for the user. At worst, the user can use the provided utility functions at development time to verify the intended declared table parallel-safety. I can't really see some mixture of (a) and (b) being acceptable. Regards, Greg Nancarrow Fujitsu Australia
On Wed, Jun 30, 2021 at 11:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > I personally think "(b) provide an option to the user to specify > whether inserts can be parallelized on a relation" is the preferable > option. > There seems to be too many issues with the alternative of trying to > determine the parallel-safety of a partitioned table automatically. > I think (b) is the simplest and most consistent approach, working the > same way for all table types, and without the overhead of (a). > Also, I don't think (b) is difficult for the user. At worst, the user > can use the provided utility-functions at development-time to verify > the intended declared table parallel-safety. > I can't really see some mixture of (a) and (b) being acceptable. Yeah, I'd like to have it be automatic, but I don't have a clear idea how to make that work nicely. It's possible somebody (Tom?) can suggest something that I'm overlooking, though. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Jul 2, 2021 at 8:16 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jun 30, 2021 at 11:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > I personally think "(b) provide an option to the user to specify > > whether inserts can be parallelized on a relation" is the preferable > > option. > > There seems to be too many issues with the alternative of trying to > > determine the parallel-safety of a partitioned table automatically. > > I think (b) is the simplest and most consistent approach, working the > > same way for all table types, and without the overhead of (a). > > Also, I don't think (b) is difficult for the user. At worst, the user > > can use the provided utility-functions at development-time to verify > > the intended declared table parallel-safety. > > I can't really see some mixture of (a) and (b) being acceptable. > > Yeah, I'd like to have it be automatic, but I don't have a clear idea > how to make that work nicely. It's possible somebody (Tom?) can > suggest something that I'm overlooking, though. In general, for the non-partitioned table, where we don't have much overhead of checking the parallel safety and invalidation is also not a big problem so I am tempted to provide an automatic parallel safety check. This would enable parallelism for more cases wherever it is suitable without user intervention. OTOH, I understand that providing automatic checking might be very costly if the number of partitions is more. Can't we provide some mid-way where the parallelism is enabled by default for the normal table but for the partitioned table it is disabled by default and the user has to set it safe for enabling parallelism? I agree that such behavior might sound a bit hackish. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On Sunday, July 4, 2021 1:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Fri, Jul 2, 2021 at 8:16 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Wed, Jun 30, 2021 at 11:46 PM Greg Nancarrow <gregn4422@gmail.com> > wrote: > > > I personally think "(b) provide an option to the user to specify > > > whether inserts can be parallelized on a relation" is the preferable > > > option. > > > There seem to be too many issues with the alternative of trying to > > > determine the parallel-safety of a partitioned table automatically. > > > I think (b) is the simplest and most consistent approach, working > > > the same way for all table types, and without the overhead of (a). > > > Also, I don't think (b) is difficult for the user. At worst, the > > > user can use the provided utility functions at development time to > > > verify the intended declared table parallel-safety. > > > I can't really see some mixture of (a) and (b) being acceptable. > > > > Yeah, I'd like to have it be automatic, but I don't have a clear idea > > how to make that work nicely. It's possible somebody (Tom?) can > > suggest something that I'm overlooking, though. > > In general, for the non-partitioned table, where we don't have much overhead > of checking the parallel safety and invalidation is also not a big problem so I am > tempted to provide an automatic parallel safety check. This would enable > parallelism for more cases wherever it is suitable without user intervention. > OTOH, I understand that providing automatic checking might be very costly if > the number of partitions is more. Can't we provide some mid-way where the > parallelism is enabled by default for the normal table but for the partitioned > table it is disabled by default and the user has to set it safe for enabling > parallelism? I agree that such behavior might sound a bit hackish. About the invalidation for non-partitioned tables, I think it still has a problem: when a function's parallel safety changes, it's expensive to judge whether the function is related to an index, a trigger, or some other table-related object by using pg_depend, because we can only do that judgement in each backend when accepting an invalidation message. If we don't do that, it means whenever a function's parallel safety changes, we invalidate every relation's cached safety, which looks not very nice to me. So, I personally think "(b) provide an option to the user to specify whether inserts can be parallelized on a relation" is the preferable option. Best regards, houzj
On Sun, Jul 4, 2021 at 1:44 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > In general, for the non-partitioned table, where we don't have much > overhead of checking the parallel safety and invalidation is also not > a big problem so I am tempted to provide an automatic parallel safety > check. This would enable parallelism for more cases wherever it is > suitable without user intervention. OTOH, I understand that providing > automatic checking might be very costly if the number of partitions is > more. Can't we provide some mid-way where the parallelism is enabled > by default for the normal table but for the partitioned table it is > disabled by default and the user has to set it safe for enabling > parallelism? I agree that such behavior might sound a bit hackish. I think that's basically the proposal that Amit and I have been discussing. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jul 21, 2021 at 12:30 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Sun, Jul 4, 2021 at 1:44 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > In general, for the non-partitioned table, where we don't have much > > overhead of checking the parallel safety and invalidation is also not > > a big problem so I am tempted to provide an automatic parallel safety > > check. This would enable parallelism for more cases wherever it is > > suitable without user intervention. OTOH, I understand that providing > > automatic checking might be very costly if the number of partitions is > > more. Can't we provide some mid-way where the parallelism is enabled > > by default for the normal table but for the partitioned table it is > > disabled by default and the user has to set it safe for enabling > > parallelism? I agree that such behavior might sound a bit hackish. > > I think that's basically the proposal that Amit and I have been discussing. > I see here we have a mix of opinions from various people. Dilip seems to be favoring the approach where we provide some option to the user for partitioned tables and automatic behavior for non-partitioned tables but he also seems to have mild concerns about this behavior. OTOH, Greg and Hou-San seem to favor an approach where we can provide an option to the user for both partitioned and non-partitioned tables. I am also in favor of providing an option to the user for the sake of consistency in behavior and not trying to introduce a special kind of invalidation as it doesn't serve the purpose for partitioned tables. Robert seems to be in favor of automatic behavior but it is not very clear to me if he is fine with dealing differently for partitioned and non-partitioned relations. Robert, can you please provide your opinion on what do you think is the best way to move forward here? -- With Regards, Amit Kapila.
On Wed, Jul 21, 2021 at 11:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > I see here we have a mix of opinions from various people. Dilip seems > to be favoring the approach where we provide some option to the user > for partitioned tables and automatic behavior for non-partitioned > tables but he also seems to have mild concerns about this behavior. > OTOH, Greg and Hou-San seem to favor an approach where we can provide > an option to the user for both partitioned and non-partitioned tables. > I am also in favor of providing an option to the user for the sake of > consistency in behavior and not trying to introduce a special kind of > invalidation as it doesn't serve the purpose for partitioned tables. > Robert seems to be in favor of automatic behavior but it is not very > clear to me if he is fine with dealing differently for partitioned and > non-partitioned relations. Robert, can you please provide your opinion > on what do you think is the best way to move forward here? I thought we had agreed on handling partitioned and unpartitioned tables differently, but maybe I misunderstood the discussion. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Jul 23, 2021 at 6:55 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Jul 21, 2021 at 11:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I see here we have a mix of opinions from various people. Dilip seems > > to be favoring the approach where we provide some option to the user > > for partitioned tables and automatic behavior for non-partitioned > > tables but he also seems to have mild concerns about this behavior. > > OTOH, Greg and Hou-San seem to favor an approach where we can provide > > an option to the user for both partitioned and non-partitioned tables. > > I am also in favor of providing an option to the user for the sake of > > consistency in behavior and not trying to introduce a special kind of > > invalidation as it doesn't serve the purpose for partitioned tables. > > Robert seems to be in favor of automatic behavior but it is not very > > clear to me if he is fine with dealing differently for partitioned and > > non-partitioned relations. Robert, can you please provide your opinion > > on what do you think is the best way to move forward here? > > I thought we had agreed on handling partitioned and unpartitioned > tables differently, but maybe I misunderstood the discussion. > I think, for the consistency argument, how about allowing users to specify a parallel-safety option for both partitioned and non-partitioned relations, but for non-partitioned relations, if users didn't specify, it would be computed automatically? If the user has specified the parallel-safety option for a non-partitioned relation then we would consider that instead of computing the value ourselves. Another reason for hesitating to do this automatically for non-partitioned relations was the new invalidation, which will invalidate the cached parallel-safety for all relations in the relcache for a particular database. As mentioned by Hou-San [1], it seems we need to do this whenever any function's parallel-safety is changed. OTOH, changing parallel-safety for a function is probably rare enough not to matter in practice, which is why I think you seem to be fine with this idea. So, I think, on that premise, it is okay to go ahead with different handling for partitioned and non-partitioned relations here. [1] - https://www.postgresql.org/message-id/OS0PR01MB5716EC1D07ACCA24373C2557941B9%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
On Sat, Jul 24, 2021 at 5:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > I think for the consistency argument how about allowing users to > specify a parallel-safety option for both partitioned and > non-partitioned relations but for non-partitioned relations if users > didn't specify, it would be computed automatically? If the user has > specified parallel-safety option for non-partitioned relation then we > would consider that instead of computing the value by ourselves. Having the option for both partitioned and non-partitioned tables doesn't seem like the worst idea ever, but I am also not entirely sure that I understand the point. > Another reason for hesitation to do automatically for non-partitioned > relations was the new invalidation which will invalidate the cached > parallel-safety for all relations in relcache for a particular > database. As mentioned by Hou-San [1], it seems we need to do this > whenever any function's parallel-safety is changed. OTOH, changing > parallel-safety for a function is probably not that often to matter in > practice which is why I think you seem to be fine with this idea. Right. I think it should be quite rare, and invalidation events are also not crazy expensive. We can test what the worst case is, but if you have to sit there and run ALTER FUNCTION in a tight loop to see a measurable performance impact, it's not a real problem. There may be a code complexity argument against trying to figure it out automatically, perhaps, but I don't think there's a big performance issue. What bothers me is that if this is something people have to set manually then many people won't and will not get the benefit of the feature. And some of them will also set it incorrectly and have problems. So I am in favor of trying to determine it automatically where possible, to make it easy for people. However, other people may feel differently, and I'm not trying to say they're necessarily wrong. I'm just telling you what I think. -- Robert Haas EDB: http://www.enterprisedb.com
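To make the worst case above concrete, a stress test along the following lines could be used to measure the invalidation overhead. This is only a sketch; the function name and loop count are arbitrary and not from any posted patch:

    -- Toggle a function's parallel-safety marking in a tight loop. Under the
    -- proposed scheme, each ALTER FUNCTION would invalidate the cached
    -- relation-level parallel-safety, so timing this loop bounds the overhead.
    CREATE FUNCTION toggle_me(int) RETURNS int LANGUAGE sql AS 'SELECT $1';

    DO $$
    BEGIN
        FOR i IN 1..10000 LOOP
            ALTER FUNCTION toggle_me(int) PARALLEL UNSAFE;
            ALTER FUNCTION toggle_me(int) PARALLEL SAFE;
        END LOOP;
    END $$;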
On Mon, Jul 26, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Jul 24, 2021 at 5:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I think for the consistency argument how about allowing users to > > specify a parallel-safety option for both partitioned and > > non-partitioned relations but for non-partitioned relations if users > > didn't specify, it would be computed automatically? If the user has > > specified parallel-safety option for non-partitioned relation then we > > would consider that instead of computing the value by ourselves. > > Having the option for both partitioned and non-partitioned tables > doesn't seem like the worst idea ever, but I am also not entirely sure > that I understand the point. > Consider below ways to allow the user to specify the parallel-safety option: (a) CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. OR (b) CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) The point was what should we do if the user specifies the option for a non-partitioned table. Do we just ignore it or give an error that this is not a valid syntax/option when used with non-partitioned tables? I find it slightly odd that this option works for partitioned tables but gives an error for non-partitioned tables but maybe we can document it. With the above syntax, even if the user doesn't specify the parallelism option for non-partitioned relations, we will determine it automatically. Now, in some situations, users might want to force parallelism even when we wouldn't have chosen it automatically. It is possible that they might face an error due to some parallel-unsafe function, but OTOH, they might have ensured that it is safe to choose parallelism in their particular case. > > Another reason for hesitation to do automatically for non-partitioned > > relations was the new invalidation which will invalidate the cached > > parallel-safety for all relations in relcache for a particular > > database. As mentioned by Hou-San [1], it seems we need to do this > > whenever any function's parallel-safety is changed. OTOH, changing > > parallel-safety for a function is probably not that often to matter in > > practice which is why I think you seem to be fine with this idea. > > Right. I think it should be quite rare, and invalidation events are > also not crazy expensive. We can test what the worst case is, but if > you have to sit there and run ALTER FUNCTION in a tight loop to see a > measurable performance impact, it's not a real problem. There may be a > code complexity argument against trying to figure it out > automatically, perhaps, but I don't think there's a big performance > issue. > True, there could be some code complexity, but I think we can assess that once the patch is ready for review. > What bothers me is that if this is something people have to set > manually then many people won't and will not get the benefit of the > feature. And some of them will also set it incorrectly and have > problems. So I am in favor of trying to determine it automatically > where possible, to make it easy for people. However, other people may > feel differently, and I'm not trying to say they're necessarily wrong. > I'm just telling you what I think. > Thanks for all your suggestions and feedback. -- With Regards, Amit Kapila.
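For concreteness, the two candidate syntaxes might look as follows in use. Neither exists in PostgreSQL; the table names are illustrative, and (b) is shown with ALTER TABLE ... SET, the form that existing storage parameters use:

    -- Option (a): a dedicated PARALLEL DML clause (proposed syntax only).
    CREATE TABLE t1 (a int, b text) PARALLEL DML SAFE;
    ALTER TABLE t1 PARALLEL DML RESTRICTED;

    -- Option (b): storage-parameter style (proposed option name only).
    CREATE TABLE t2 (a int, b text) WITH (parallel_dml_enabled = true);
    ALTER TABLE t2 SET (parallel_dml_enabled = false);

One visible limitation of (b) is that a boolean cannot express RESTRICTED, a point raised later in the thread.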
On Tue, Jul 27, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jul 26, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> wrote: > Consider below ways to allow the user to specify the parallel-safety option: > > (a) > CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... > ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. > > OR > > (b) > CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) > ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) > > The point was what should we do if the user specifies the option for a > non-partitioned table. Do we just ignore it or give an error that this > is not a valid syntax/option when used with non-partitioned tables? I > find it slightly odd that this option works for partitioned tables but > gives an error for non-partitioned tables but maybe we can document > it. IMHO, for a non-partitioned table, we should by default allow the parallel-safety checking so that users don't have to set it for individual tables. OTOH, I don't think that there is any point in blocking the syntax for the non-partitioned table. So I think for the non-partitioned table, if the user hasn't set it, we should do automatic safety checking, and if the user has defined the safety explicitly then we should respect that. And for the partitioned table, we will never do the automatic safety checking and we should always respect what the user has set. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Tue, Jul 27, 2021 at 11:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Jul 27, 2021 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jul 26, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> wrote: > > Consider below ways to allow the user to specify the parallel-safety option: > > > > (a) > > CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... > > ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. > > > > OR > > > > (b) > > CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) > > ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) > > > > The point was what should we do if the user specifies the option for a > > non-partitioned table. Do we just ignore it or give an error that this > > is not a valid syntax/option when used with non-partitioned tables? I > > find it slightly odd that this option works for partitioned tables but > > gives an error for non-partitioned tables but maybe we can document > > it. > > IMHO, for a non-partitioned table, we should be default allow the > parallel safely checking so that users don't have to set it for > individual tables, OTOH, I don't think that there is any point in > blocking the syntax for the non-partitioned table, So I think for the > non-partitioned table if the user hasn't set it we should do automatic > safety checking and if the user has defined the safety externally then > we should respect that. And for the partitioned table, we will never > do the automatic safety checking and we should always respect what the > user has set. > This is exactly what I am saying. BTW, do you have any preference for the syntax between (a) and (b)? -- With Regards, Amit Kapila.
On Tue, Jul 27, 2021 at 3:58 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > IMHO, for a non-partitioned table, we should be default allow the > parallel safely checking so that users don't have to set it for > individual tables, OTOH, I don't think that there is any point in > blocking the syntax for the non-partitioned table, So I think for the > non-partitioned table if the user hasn't set it we should do automatic > safety checking and if the user has defined the safety externally then > we should respect that. And for the partitioned table, we will never > do the automatic safety checking and we should always respect what the > user has set. > Provided it is possible to distinguish between the default parallel-safety (unsafe) and that default being explicitly specified by the user, it should be OK. In the case where the automatic parallel-safety checking is performed and the table uses something that is parallel-unsafe, there will be a performance degradation compared to the current code (hopefully only a small one). That can be avoided by the user explicitly specifying that it's parallel-unsafe. Regards, Greg Nancarrow Fujitsu Australia
On Tue, Jul 27, 2021 at 4:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Tue, Jul 27, 2021 at 3:58 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > IMHO, for a non-partitioned table, we should be default allow the > > parallel safely checking so that users don't have to set it for > > individual tables, OTOH, I don't think that there is any point in > > blocking the syntax for the non-partitioned table, So I think for the > > non-partitioned table if the user hasn't set it we should do automatic > > safety checking and if the user has defined the safety externally then > > we should respect that. And for the partitioned table, we will never > > do the automatic safety checking and we should always respect what the > > user has set. > > > > Provided it is possible to distinguish between the default > parallel-safety (unsafe) and that default being explicitly specified > by the user, it should be OK. > Offhand, I don't see any problem with this. Do you have something specific in mind? > In the case of performing the automatic parallel-safety checking and > the table using something that is parallel-unsafe, there will be a > performance degradation compared to the current code (hopefully only > small). That can be avoided by the user explicitly specifying that > it's parallel-unsafe. > True, but I guess this should be largely addressed by caching the value of parallel safety at the relation level. Sure, there will be some cost the first time we compute it, but on subsequent accesses, it should be quite cheap. -- With Regards, Amit Kapila.
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
From
"houzj.fnst@fujitsu.com"
Date:
On July 27, 2021 1:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Mon, Jul 26, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> > wrote: > > > > On Sat, Jul 24, 2021 at 5:52 AM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > I think for the consistency argument how about allowing users to > > > specify a parallel-safety option for both partitioned and > > > non-partitioned relations but for non-partitioned relations if users > > > didn't specify, it would be computed automatically? If the user has > > > specified parallel-safety option for non-partitioned relation then we > > > would consider that instead of computing the value by ourselves. > > > > Having the option for both partitioned and non-partitioned tables > > doesn't seem like the worst idea ever, but I am also not entirely sure > > that I understand the point. > > > > Consider below ways to allow the user to specify the parallel-safety option: > > (a) > CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... > ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. > > OR > > (b) > CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) > ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) Personally, I think approach (a) might be better, since it's similar to ALTER FUNCTION PARALLEL XXX, which users might be more familiar with. Besides, I think we need a new default value for parallel DML safety, maybe 'auto' or 'null' (different from safe/restricted/unsafe). Because users are likely to alter the safety back to the default value to get the automatic safety check, an independent default value can make it clearer. Best regards, Houzj
On Wed, Jul 28, 2021 at 12:52 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > Consider below ways to allow the user to specify the parallel-safety option: > > > > (a) > > CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... > > ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. > > > > OR > > > > (b) > > CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) > > ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) > > Personally, I think the approach (a) might be better. Since it's similar to > ALTER FUNCTION PARALLEL XXX which user might be more familiar with. > I think so too. > Besides, I think we need a new default value about parallel dml safety. Maybe > 'auto' or 'null'(different from safe/restricted/unsafe). Because, user is > likely to alter the safety to the default value to get the automatic safety > check, a independent default value can make it more clear. > Yes, I was thinking something similar when I said "Provided it is possible to distinguish between the default parallel-safety (unsafe) and that default being explicitly specified by the user". If we don't have a new default value, then we need to distinguish these cases, but I'm not sure Postgres does something similar elsewhere (for example, for function parallel-safety, it's not currently recorded whether parallel-safety=unsafe is because of the default or because the user specifically set it to the default value). Opinions? Regards, Greg Nancarrow Fujitsu Australia
Note: Changing the subject as I felt the topic has diverged from the originally reported case and also it might help others to pay attention. On Wed, Jul 28, 2021 at 8:22 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > > > Consider below ways to allow the user to specify the parallel-safety option: > > > > (a) > > CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE } ... > > ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE } .. > > > > OR > > > > (b) > > CREATE TABLE table_name (...) WITH (parallel_dml_enabled = true) > > ALTER TABLE table_name (...) WITH (parallel_dml_enabled = true) > > Personally, I think the approach (a) might be better. Since it's similar to > ALTER FUNCTION PARALLEL XXX which user might be more familiar with. > Okay, and I think for (b) true/false won't be sufficient because one might want to specify restricted. > Besides, I think we need a new default value about parallel dml safety. Maybe > 'auto' or 'null'(different from safe/restricted/unsafe). Because, user is > likely to alter the safety to the default value to get the automatic safety > check, a independent default value can make it more clear. > Hmm, but auto won't work for partitioned tables, right? If so, that might appear like an inconsistency to the user and we need to document the same. Let me summarize the discussion so far in this thread so that it is helpful to others. We would like to parallelize INSERT SELECT (first step INSERT + parallel SELECT and then Parallel (INSERT + SELECT)) and for that, we have explored a couple of ways. The first approach is to automatically detect if it is safe to parallelize insert and then do it without user intervention. To detect automatically, we need to determine the parallel-safety of various expressions (like default column expressions, check constraints, index expressions, etc.) at planning time, which can be costly, but we can avoid most of the cost if we cache the parallel safety for the relation. So, the cost needs to be paid just once. Now, we can't cache this for partitioned relations because it can be very costly (as we need to lock all the partitions) and has deadlock risks (while processing invalidations); this has been explained in email [1]. Now, as we can't think of a nice way to determine parallel safety automatically for partitioned relations, we thought of providing an option to the user. The next thing to decide is: if we are providing an option to the user in one of the ways mentioned above, what should we do if the user uses that option for non-partitioned relations? Shall we just ignore it, or give an error that this is not a valid syntax/option? The one idea which Dilip and I are advocating is to respect the user's input for non-partitioned relations, and if it is not given, then compute the parallel safety and cache it. To help users provide a parallel-safety option, we are thinking of providing a utility function "pg_get_table_parallel_dml_safety(regclass)" that returns records of (objid, classid, parallel_safety) for all parallel unsafe/restricted table-related objects from which the table's parallel DML safety is determined. This will allow users to identify unsafe objects; if required, the user can change the parallel safety of those functions and then use the parallel-safety option for the table. Thoughts? 
Note - This topic has been discussed in another thread as well [2], but as many of the key technical points have been discussed here, I thought it better to continue here. [1] - https://www.postgresql.org/message-id/CAA4eK1Jwz8xGss4b0-33eyX0i5W_1CnqT16DjB9snVC--DoOsQ%40mail.gmail.com [2] - https://www.postgresql.org/message-id/TYAPR01MB29905A9AB82CC8BA50AB0F80FE709%40TYAPR01MB2990.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
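As a sketch of how the proposed pieces might fit together (the utility function and the PARALLEL DML clause are proposals from this thread, not existing features; names and output columns are illustrative):

    -- A table whose DEFAULT expression calls a parallel-unsafe function;
    -- this part is ordinary, working SQL.
    CREATE FUNCTION next_code() RETURNS int LANGUAGE sql PARALLEL UNSAFE
        AS 'SELECT 42';
    CREATE TABLE items (id int DEFAULT next_code(), note text);

    -- Hypothetical use of the proposed utility function: it would report
    -- next_code() as the object making DML on "items" parallel-unsafe.
    SELECT classid::regclass AS catalog, objid, parallel_safety
    FROM pg_get_table_parallel_dml_safety('items'::regclass);

    -- If the user knows the function is in fact safe, they could fix its
    -- marking (real syntax) and then mark the table (proposed syntax):
    ALTER FUNCTION next_code() PARALLEL SAFE;
    -- ALTER TABLE items PARALLEL DML SAFE;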
On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Besides, I think we need a new default value about parallel dml safety. Maybe > > 'auto' or 'null'(different from safe/restricted/unsafe). Because, user is > > likely to alter the safety to the default value to get the automatic safety > > check, a independent default value can make it more clear. > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > might appear like an inconsistency to the user and we need to document > the same. Let me summarize the discussion so far in this thread so > that it is helpful to others. > To avoid that inconsistency, UNSAFE could be the default for partitioned tables (and we would disallow setting AUTO for these). So then AUTO is the default for non-partitioned tables only. Regards, Greg Nancarrow Fujitsu Australia
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On Friday, July 30, 2021 2:52 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > Besides, I think we need a new default value about parallel dml > > > safety. Maybe 'auto' or 'null'(different from > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > to the default value to get the automatic safety check, a independent default > > > value can make it more clear. > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > might appear like an inconsistency to the user and we need to document > > the same. Let me summarize the discussion so far in this thread so > > that it is helpful to others. > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned tables > (and we would disallow setting AUTO for these). > So then AUTO is the default for non-partitioned tables only. I think this approach is reasonable, +1. Best regards, houzj
On Fri, Jul 30, 2021 at 6:53 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Besides, I think we need a new default value about parallel dml > > > > safety. Maybe 'auto' or 'null'(different from > > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > > to the default value to get the automatic safety check, a independent default > > > > value can make it more clear. > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > > might appear like an inconsistency to the user and we need to document > > > the same. Let me summarize the discussion so far in this thread so > > > that it is helpful to others. > > > > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned tables > > (and we would disallow setting AUTO for these). > > So then AUTO is the default for non-partitioned tables only. > > I think this approach is reasonable, +1. > I see the need to change back to the default via Alter Table, but I am not sure if Auto is the most appropriate way to handle that. How about using DEFAULT itself, as we do in the case of REPLICA IDENTITY? So, if users have to alter the parallel-safety value back to the default, they just need to say Parallel DML DEFAULT. The default would mean automatic behavior for non-partitioned relations and ignoring parallelism for partitioned tables. -- With Regards, Amit Kapila.
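Side by side, the existing REPLICA IDENTITY precedent and the proposed analogue would look like this (the first statement is real syntax today; the second is proposed only, and the table name is illustrative):

    -- Existing precedent: reset replica identity to its default behavior.
    ALTER TABLE t1 REPLICA IDENTITY DEFAULT;

    -- Proposed analogue: reset parallel DML safety to the default, meaning
    -- automatic checking for plain tables, no parallelism for partitioned ones.
    ALTER TABLE t1 PARALLEL DML DEFAULT;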
On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jul 30, 2021 at 6:53 PM houzj.fnst@fujitsu.com > <houzj.fnst@fujitsu.com> wrote: > > > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Besides, I think we need a new default value about parallel dml > > > > > safety. Maybe 'auto' or 'null'(different from > > > > > safe/restricted/unsafe). Because, user is likely to alter the safety > > > > > to the default value to get the automatic safety check, a independent default > > > > > value can make it more clear. > > > > > > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, that > > > > might appear like an inconsistency to the user and we need to document > > > > the same. Let me summarize the discussion so far in this thread so > > > > that it is helpful to others. > > > > > > > > > > To avoid that inconsistency, UNSAFE could be the default for partitioned tables > > > (and we would disallow setting AUTO for these). > > > So then AUTO is the default for non-partitioned tables only. > > > > I think this approach is reasonable, +1. > > > > I see the need to change to default via Alter Table but I am not sure > if Auto is the most appropriate way to handle that. How about using > DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users > have to alter parallel safety value to default, they need to just say > Parallel DML DEFAULT. The default would mean automatic behavior for > non-partitioned relations and ignore parallelism for partitioned > tables. > Hmm, I'm not so sure I'm sold on that. I personally think "DEFAULT" here is vague, and users then need to know what DEFAULT equates to, based on the type of table (partitioned or non-partitioned table). Also, then there are two ways to set the actual "default" DML parallel-safety for partitioned tables: DEFAULT or UNSAFE. At least "AUTO" is a meaningful default option name for non-partitioned tables - "automatic" parallel-safety checking, and the fact that it isn't the default (and can't be set) for partitioned tables highlights the difference in the way being proposed to treat them (i.e. use automatic checking only for non-partitioned tables). I'd be interested to hear what others think. I think a viable alternative would be to record whether an explicit DML parallel-safety has been specified, and if not, apply default behavior (i.e. by default use automatic checking for non-partitioned tables and treat partitioned tables as UNSAFE). I'm just not sure whether this kind of distinction (explicit vs implicit default) has been used before in Postgres options. Regards, Greg Nancarrow Fujitsu Australia
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On August 2, 2021 2:04 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > On Mon, Aug 2, 2021 at 2:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Jul 30, 2021 at 6:53 PM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > > > > On Friday, July 30, 2021 2:52 PM Greg Nancarrow <gregn4422@gmail.com> > wrote: > > > > On Fri, Jul 30, 2021 at 4:02 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > > > > > > > > Besides, I think we need a new default value about parallel > > > > > > dml safety. Maybe 'auto' or 'null'(different from > > > > > > safe/restricted/unsafe). Because, user is likely to alter the > > > > > > safety to the default value to get the automatic safety check, > > > > > > a independent default value can make it more clear. > > > > > > > > > > > > > > > > Hmm, but auto won't work for partitioned tables, right? If so, > > > > > that might appear like an inconsistency to the user and we need > > > > > to document the same. Let me summarize the discussion so far in > > > > > this thread so that it is helpful to others. > > > > > > > > > > > > > To avoid that inconsistency, UNSAFE could be the default for > > > > partitioned tables (and we would disallow setting AUTO for these). > > > > So then AUTO is the default for non-partitioned tables only. > > > > > > I think this approach is reasonable, +1. > > > > > > > I see the need to change to default via Alter Table but I am not sure > > if Auto is the most appropriate way to handle that. How about using > > DEFAULT itself as we do in the case of REPLICA IDENTITY? So, if users > > have to alter parallel safety value to default, they need to just say > > Parallel DML DEFAULT. The default would mean automatic behavior for > > non-partitioned relations and ignore parallelism for partitioned > > tables. > > > > Hmm, I'm not so sure I'm sold on that. > I personally think "DEFAULT" here is vague, and users then need to know what > DEFAULT equates to, based on the type of table (partitioned or non-partitioned > table). > Also, then there's two ways to set the actual "default" DML parallel-safety for > partitioned tables: DEFAULT or UNSAFE. > At least "AUTO" is a meaningful default option name for non-partitioned tables > - "automatic" parallel-safety checking, and the fact that it isn't the default (and > can't be set) for partitioned tables highlights the difference in the way being > proposed to treat them (i.e. use automatic checking only for non-partitioned > tables). > I'd be interested to hear what others think. > I think a viable alternative would be to record whether an explicit DML > parallel-safety has been specified, and if not, apply default behavior (i.e. by > default use automatic checking for non-partitioned tables and treat partitioned > tables as UNSAFE). I'm just not sure whether this kind of distinction (explicit vs > implicit default) has been used before in Postgres options. I think both approaches are fine, but using "DEFAULT" might have a disadvantage: if we somehow support automatic safety checking for partitioned tables in the future, then the meaning of "DEFAULT" for partitioned tables will change from UNSAFE to automatic checking. It could also put some burden on users to modify their SQL scripts. Best regards, houzj
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
Based on the discussion here, I implemented the auto-safety-check feature. Since most of the technical discussion happened here, I attached the patches in this thread. The patches allow users to specify a parallel-safety option for both partitioned and non-partitioned relations, and for non-partitioned relations, if users didn't specify, it would be computed automatically. If the user has specified the parallel-safety option then we would consider that instead of computing the value by ourselves. But for a partitioned table, if users didn't specify the parallel DML safety, it will be treated as unsafe. For non-partitioned relations, after computing the parallel-safety of the relation during planning, we save it in the relation cache entry and invalidate the cached parallel-safety for all relations in the relcache for a particular database whenever any function's parallel-safety is changed. To make it possible for users to alter the safety back to an unspecified value to get the automatic safety check, we add a new default option (temporarily named 'DEFAULT', in addition to safe/unsafe/restricted) for parallel DML safety. To facilitate users in providing a parallel-safety option, we provide a utility function "pg_get_table_parallel_dml_safety(regclass)" that returns records of (objid, classid, parallel_safety) for all parallel unsafe/restricted table-related objects from which the table's parallel DML safety is determined. This will allow users to identify unsafe objects; if required, the user can change the parallel safety of those functions and then use the parallel-safety option for the table. Best regards, houzj
Attachment
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On Tues, August 3, 2021 3:40 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > Based on the discussion here, I implemented the auto-safety-check feature. > Since most of the technical discussion happened here,I attatched the patches in > this thread. > > The patches allow users to specify a parallel-safety option for both partitioned > and non-partitioned relations, and for non-partitioned relations if users didn't > specify, it would be computed automatically. If the user has specified > parallel-safety option then we would consider that instead of computing the > value by ourselves. But for partitioned table, if users didn't specify the parallel > dml safety, it will treat is as unsafe. > > For non-partitioned relations, after computing the parallel-safety of relation > during the planning, we save it in the relation cache entry and invalidate the > cached parallel-safety for all relations in relcache for a particular database > whenever any function's parallel-safety is changed. > > To make it possible for user to alter the safety to a not specified value to get the > automatic safety check, add a new default option(temporarily named 'DEFAULT' > in addition to safe/unsafe/restricted) about parallel dml safety. > > To facilitate users for providing a parallel-safety option, provide a utility > functionr "pg_get_table_parallel_dml_safety(regclass)" that returns records of > (objid, classid, parallel_safety) for all parallel unsafe/restricted table-related > objects from which the table's parallel DML safety is determined. > This will allow user to identify unsafe objects and if the required user can change > the parallel safety of required functions and then use the parallel safety option > for the table. Update the commit message in patches to make it easier for others to review. Best regards, Houzj
Attachment
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On Fri, Aug 6, 2021 4:23 PM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > Update the commit message in patches to make it easier for others to review. CFbot reported a compile error due to the recent commit 3aafc03. Attached are rebased patches which fix the error. Best regards, Hou zj
Attachment
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On Thursday, August 19, 2021 4:16 PM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > On Fri, Aug 6, 2021 4:23 PM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > > > Update the commit message in patches to make it easier for others to review. > > CFbot reported a compile error due to recent commit 3aafc03. > Attach rebased patches which fix the error. The patch can't be applied to the HEAD branch due to a recent commit. Attached are rebased patches. Best regards, Hou zj
Attachment
RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, September 1, 2021 5:24 PM Hou Zhijie <houzj.fnst@fujitsu.com> wrote: > Thursday, August 19, 2021 4:16 PM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > On Fri, Aug 6, 2021 4:23 PM Hou zhijie <houzj.fnst@fujitsu.com> wrote: > > > > > > Update the commit message in patches to make it easier for others to > review. > > > > CFbot reported a compile error due to recent commit 3aafc03. > > Attach rebased patches which fix the error. > > The patch can't apply to the HEAD branch due a recent commit. > Attach rebased patches. In the past, the rewriter could generate a rewritten query in which a modifying CTE did not have the hasModifyingCTE flag set, and this bug caused a regression test (force_parallel_mode=regress) failure when parallel SELECT for INSERT was enabled, so we had a workaround (0006.patch) for it. But now the bug has been fixed in commit 362e2d and we don't need the workaround patch anymore. Attached is a new version of the patch set, which removes the workaround patch. Best regards, Hou zj
Attachment
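For context, the query shape at issue is a data-modifying CTE. A query like the sketch below must never run in parallel mode, which is what a correctly set hasModifyingCTE flag guarantees (the table names are illustrative):

    -- A data-modifying CTE: the planner must treat the whole statement as
    -- parallel-unsafe, which relies on the rewriter setting hasModifyingCTE.
    WITH moved AS (
        DELETE FROM pending_items RETURNING *
    )
    INSERT INTO archived_items SELECT * FROM moved;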
Hi, On Thu, Sep 09, 2021 at 02:12:08AM +0000, houzj.fnst@fujitsu.com wrote: > > Attach new version patch set which remove the workaround patch. This version of the patchset doesn't apply anymore: http://cfbot.cputube.org/patch_36_3143.log === Applying patches on top of PostgreSQL commit ID a18b6d2dc288dfa6e7905ede1d4462edd6a8af47 === === applying patch ./v19-0001-CREATE-ALTER-TABLE-PARALLEL-DML.patch [...] patching file src/backend/commands/tablecmds.c Hunk #1 FAILED at 40. Hunk #2 succeeded at 624 (offset 21 lines). Hunk #3 succeeded at 670 (offset 21 lines). Hunk #4 succeeded at 947 (offset 19 lines). Hunk #5 succeeded at 991 (offset 19 lines). Hunk #6 succeeded at 4256 (offset 40 lines). Hunk #7 succeeded at 4807 (offset 40 lines). Hunk #8 succeeded at 5217 (offset 40 lines). Hunk #9 succeeded at 6193 (offset 42 lines). Hunk #10 succeeded at 19278 (offset 465 lines). 1 out of 10 hunks FAILED -- saving rejects to file src/backend/commands/tablecmds.c.rej [...] patching file src/bin/pg_dump/pg_dump.c Hunk #1 FAILED at 6253. Hunk #2 FAILED at 6358. Hunk #3 FAILED at 6450. Hunk #4 FAILED at 6503. Hunk #5 FAILED at 6556. Hunk #6 FAILED at 6609. Hunk #7 FAILED at 6660. Hunk #8 FAILED at 6708. Hunk #9 FAILED at 6756. Hunk #10 FAILED at 6803. Hunk #11 FAILED at 6872. Hunk #12 FAILED at 6927. Hunk #13 succeeded at 15524 (offset -1031 lines). 12 out of 13 hunks FAILED -- saving rejects to file src/bin/pg_dump/pg_dump.c.rej [...] patching file src/bin/psql/describe.c Hunk #1 succeeded at 1479 (offset -177 lines). Hunk #2 succeeded at 1493 (offset -177 lines). Hunk #3 succeeded at 1631 (offset -241 lines). Hunk #4 succeeded at 3374 (offset -277 lines). Hunk #5 succeeded at 3731 (offset -310 lines). Hunk #6 FAILED at 4109. 1 out of 6 hunks FAILED -- saving rejects to file src/bin/psql/describe.c.rej Could you send a rebased version? In the meantime I will switch the entry to Waiting on Author.
On Thu, Jul 28, 2022 at 8:43 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > Could you send a rebased version? In the meantime I will switch the entry to > Waiting on Author. By request in [1] I'm marking this Returned with Feedback for now. Whenever you're ready, you can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3143/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob [1] https://www.postgresql.org/message-id/OS0PR01MB571696D623F35A09AB51903A94969%40OS0PR01MB5716.jpnprd01.prod.outlook.com