Re: Parallel INSERT SELECT take 2 - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Parallel INSERT SELECT take 2
Date
Msg-id CAJcOf-d2J0=fpCOHm_hDZ2g1L6qDyajUBtTRrokB0vwDOECpXg@mail.gmail.com
In response to RE: Parallel INSERT SELECT take 2  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Parallel INSERT SELECT take 2
List pgsql-hackers
On Fri, May 14, 2021 at 6:24 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Thanks for the comments, I have posted new version patches with this change.
>
> > How about reorganisation of the patches like the following?
> > 0001: CREATE ALTER TABLE PARALLEL DML
> > 0002: parallel-SELECT-for-INSERT (planner changes,
> > max_parallel_hazard() update, XID changes)
> > 0003: pg_get_parallel_safety()
> > 0004: regression test updates
>
> Thanks, it looks good and I reorganized the latest patchset in this way.
>
> Attaching new version patches with the following change.
>
> 0003
> Change functions arg type to regclass.
>
> 0004
> remove updates for "serial_schedule".
>

I've got some comments for the V4 set of patches:

(0001)

(i) Patch comment needs a little updating (suggested change is below):

Enable users to declare a table's parallel data-modification safety
(SAFE/RESTRICTED/UNSAFE).

Add a table property that represents parallel safety of a table for
DML statement execution.
It may be specified as follows:

CREATE TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };

This property is recorded in pg_class's relparallel column as 'u',
'r', or 's', just like pg_proc's proparallel.
The default is UNSAFE.

The planner assumes that all of the table, its descendant partitions,
and their ancillary objects have, at worst, the specified parallel
safety. The user is responsible for its correctness.

---

NOTE: The following sentence was removed from the original V4 0001
patch comment (since this version of the patch is not doing runtime
parallel-safety checks on functions):

If the parallel processes
find an object that is less safer than the assumed parallel safety during
statement execution, it throws an ERROR and abort the statement execution.


(ii) Update message to say "a foreign ...":

BEFORE:
+ errmsg("cannot support parallel data modification on foreign or temporary table")));

AFTER:
+ errmsg("cannot support parallel data modification on a foreign or temporary table")));


(iii) strVal() macro already casts to "Value *", so the cast can be
removed from the following:

+ char    *parallel = strVal((Value *) def);
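
To illustrate why the cast is redundant, here is a minimal standalone sketch. It assumes the macro has the shape it has in value.h before PostgreSQL 15; the Value struct below is a simplified stand-in rather than the real header, and parallel_option_string() is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for PostgreSQL's Value node (assumption, not the real header). */
typedef struct Value
{
    int         type;       /* stand-in for NodeTag */
    union
    {
        int         ival;
        char       *str;
    }           val;
} Value;

/* Same shape as the macro in value.h: the cast happens inside the macro. */
#define strVal(v)   (((Value *) (v))->val.str)

/*
 * Hypothetical helper: receives a generic pointer, much as an option
 * argument would arrive, and extracts the string without a cast at the
 * call site.
 */
static char *
parallel_option_string(void *def)
{
    assert(def != NULL);
    return strVal(def);     /* no explicit (Value *) needed here */
}
```

With this definition, strVal(def) and strVal((Value *) def) expand to the same expression, so the explicit cast only adds noise.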


(0003)

(i) Suggested updates to the patch comment:

Provide a utility function "pg_get_parallel_safety(regclass)" that
returns records of (objid, classid, parallel_safety) for all parallel
unsafe/restricted table-related objects from which the table's parallel
DML safety is determined. The user can use this information during
development in order to accurately declare a table's parallel DML
safety, or to identify any problematic objects if a parallel DML fails
or behaves unexpectedly.

When the use of an index-related parallel unsafe/restricted function
is detected, both the function oid and the index oid are returned.

Provide a utility function "pg_get_max_parallel_hazard(regclass)" that
returns the worst parallel DML safety hazard that can be found in the
given relation. Users can use this function to do a quick check without
caring about specific parallel-related objects.
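
As a rough illustration of what "worst hazard" means here, the sketch below folds a set of per-object safety markers into a single value, ordering them safe < restricted < unsafe and using the same 's'/'r'/'u' characters as pg_proc's proparallel. It is only an assumption-laden sketch of the idea, not the patch's implementation; hazard_rank() and worst_parallel_hazard() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Rank the markers so that a higher rank means a worse hazard. */
static int
hazard_rank(char hazard)
{
    switch (hazard)
    {
        case 's':               /* parallel safe */
            return 0;
        case 'r':               /* parallel restricted */
            return 1;
        case 'u':               /* parallel unsafe */
            return 2;
        default:
            return -1;          /* not a valid marker */
    }
}

/*
 * Hypothetical helper: return the worst marker among n per-object
 * markers. An empty set is treated as safe.
 */
static char
worst_parallel_hazard(const char *hazards, size_t n)
{
    char        worst = 's';

    for (size_t i = 0; i < n; i++)
    {
        assert(hazard_rank(hazards[i]) >= 0);
        if (hazard_rank(hazards[i]) > hazard_rank(worst))
            worst = hazards[i];
    }
    return worst;
}
```

A single restricted object among otherwise safe ones makes the result 'r', and any unsafe object makes it 'u', which is the quick answer a user would want before looking at the per-object records.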


Regards,
Greg Nancarrow
Fujitsu Australia


