Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CALNJ-vR-c==GnesvChcM6K4bMAkjD6mSyMmLdW57mJH-sgKTyQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Zhihong Yu <zyu@yugabyte.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)
List pgsql-hackers
Hi,
For v12-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch:

+       bool        isParallelModifyLeader = IsA(planstate, GatherState) && IsA(outerPlanState(planstate), ModifyTableState);

Please wrap long line.

+   uint64     *processed_count_space;

If I read the code correctly, it seems it can be dropped (use pei->processed_count directly).

Cheers



On Wed, Jan 20, 2021 at 6:29 PM Zhihong Yu <zyu@yugabyte.com> wrote:
Hi,
For v12-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch :

is found from the additional parallel-safety checks, or from the existing
parallel-safety checks for SELECT that it currently performs.

existing and 'it currently performs' are redundant. You can omit 'that it currently performs'.

Minor. For index_expr_max_parallel_hazard_for_modify(), 

+               if (keycol == 0)
+               {
+                   /* Found an index expression */

You can check if keycol != 0, continue with the loop. This would save some indent.

+                   if (index_expr_item == NULL)    /* shouldn't happen */
+                   {
+                       elog(WARNING, "too few entries in indexprs list");

I think the warning should be an error since there is assertion ahead of the if statement.

+           Assert(!isnull);
+           if (isnull)
+           {
+               /*
+                * This shouldn't ever happen, but if it does, log a WARNING
+                * and return UNSAFE, rather than erroring out.
+                */
+               elog(WARNING, "null conbin for constraint %u", con->oid);

The above should be error as well.

Cheers

On Wed, Jan 20, 2021 at 5:06 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
Thanks for the feedback.
Posting an updated set of patches. Changes are based on feedback, as
detailed below:

There's a couple of potential issues currently being looked at:
- locking issues in additional parallel-safety checks?
- apparent uneven work distribution across the parallel workers, for
large insert data


[Antonin]
- Fixed bad Assert in PrepareParallelMode()
- Added missing comment to explain use of GetCurrentCommandId() in
PrepareParallelMode()
- Some variable name shortening in a few places
- Created common function for creation of non-parallel and parallel
ModifyTable paths
- Path count variable changed to bool
- Added FIXME comment to dubious code for creating Gather target-list
from ModifyTable subplan
- Fixed check on returningLists to use NIL instead of NULL

[Amit]
- Moved additional parallel-safety checks (for modify case) into
max_parallel_hazard()
- Removed redundant calls to max_parallel_hazard_test()
- Added Asserts to "should never happen" null-attribute cases (and
added WARNING log missing from one case)
- Added comment for use of NoLock in max_parallel_hazard_for_modify()

[Vignesh]
- Fixed a couple of typos
- Added a couple of test cases for testing that the same transaction
is used by all parallel workers


Regards,
Greg Nancarrow
Fujitsu Australia

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [PATCH 1/1] Initial mach based shared memory support.
Next
From: Bharath Rupireddy
Date:
Subject: Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit