Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: Parallel Inserts in CREATE TABLE AS
Date
Msg-id CALNJ-vSep-B7A2WP-6XDU1GLb6Ao8_M3Gosydv+b9YgQ+3X1oA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel Inserts in CREATE TABLE AS
List pgsql-hackers
Hi,
For v20-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch :

workers to Gather node to 0. With this change, there are chances
that the planner may choose the parallel plan.

It would be nice if the scenarios where parallel plan is not chosen are listed.

+   if ((root->parse->parallelInsCmdTupleCostOpt &
+        PARALLEL_INSERT_SELECT_QUERY) &&
+       (root->parse->parallelInsCmdTupleCostOpt &
+        PARALLEL_INSERT_CAN_IGN_TUP_COST))
+   {
+       /* We are ignoring the parallel tuple cost, so mark it. */
+       root->parse->parallelInsCmdTupleCostOpt |=
+                                           PARALLEL_INSERT_TUP_COST_IGNORED;

If I read the code correctly, when both PARALLEL_INSERT_SELECT_QUERY and PARALLEL_INSERT_CAN_IGN_TUP_COST are set, PARALLEL_INSERT_TUP_COST_IGNORED is implied.

Maybe we don't need the PARALLEL_INSERT_TUP_COST_IGNORED enum - the setting (1) of the first two bits should suffice.

Cheers

On Mon, Jan 4, 2021 at 7:59 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Jan 4, 2021 at 7:02 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > +                                       if (IS_PARALLEL_CTAS_DEST(gstate->dest) &&
> > +                                               ((DR_intorel *) gstate->dest)->into->rel &&
> > +                                               ((DR_intorel *) gstate->dest)->into->rel->relname)
> > why would rel and relname not be there? if no rows have been inserted?
> > because it seems from the intorel_startup function that that would be
> > set as soon as startup was done, which i assume (wrongly?) is always done?
>
> Actually, that into clause rel variable is always being set in the gram.y for CTAS, Create Materialized View and SELECT INTO (because qualified_name non-terminal is not optional). My bad. I just added it as a sanity check. Actually, it's not required.
>
> create_as_target:
>             qualified_name opt_column_list table_access_method_clause
>             OptWith OnCommitOption OptTableSpace
>                 {
>                     $$ = makeNode(IntoClause);
>                     $$->rel = $1;
> create_mv_target:
>             qualified_name opt_column_list table_access_method_clause opt_reloptions OptTableSpace
>             {
>                 $$ = makeNode(IntoClause);
>                 $$->rel = $1;
> into_clause:
>             INTO OptTempTableName
>             {
>                 $$ = makeNode(IntoClause);
>                $$->rel = $2;
>
> I will change the below code:
> +                    if (GetParallelInsertCmdType(gstate->dest) ==
> +                        PARALLEL_INSERT_CMD_CREATE_TABLE_AS &&
> +                        ((DR_intorel *) gstate->dest)->into &&
> +                        ((DR_intorel *) gstate->dest)->into->rel &&
> +                        ((DR_intorel *) gstate->dest)->into->rel->relname)
> +                    {
>
> to:
> +                    if (GetParallelInsertCmdType(gstate->dest) ==
> +                        PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +                    {
>
> I will update this in the next version of the patch set.

Attaching v20 patch set that has above change in 0001 patch, note that
0002 to 0004 patches have no changes from v19. Please consider the v20
patch set for further review.


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: list of extended statistics on psql
Next
From: Tom Lane
Date:
Subject: Re: Terminate the idle sessions