Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Parallel Inserts in CREATE TABLE AS
Date
Msg-id CALj2ACX+i-vtmdrS3kzLVJx=3TQLXryoF3zWWgZ-8Qvo3RJyFw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Inserts in CREATE TABLE AS  (Zhihong Yu <zyu@yugabyte.com>)
List pgsql-hackers
On Mon, Dec 28, 2020 at 1:16 AM Zhihong Yu <zyu@yugabyte.com> wrote:
> For v16-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch:
>
> +       if (ignore &&
> +           (root->parse->CTASParallelInsInfo &
> +            CTAS_PARALLEL_INS_TUP_COST_CAN_IGN))
>
> I wonder why CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is checked again in the above if since when
ignore_parallel_tuple_costreturns true, CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is set already.
 

Sometimes, we may set the flag CTAS_PARALLEL_INS_TUP_COST_CAN_IGN
before generate_useful_gather_paths, but the
generate_useful_gather_paths can return without reaching cost_gather
where we reset. The generate_useful_gather_paths can return without
reaching cost_gather, in following case

if (rel->partial_pathlist == NIL)
    return;

So, for such cases, I'm resetting it here.

> + * In this function we only care Append and Gather nodes.
>
> 'care' -> 'care about'

Done.

> +       for (int i = 0; i < aps->as_nplans; i++)
> +       {
> +           parallel |= PushDownCTASParallelInsertState(dest,
> +                                                       aps->appendplans[i],
> +                                                       gather_exists);
>
> It seems the loop termination condition can include parallel since we can come out of the loop once parallel is
true.

No, we can not come out of the for loop if parallel is true, because
our intention there is to look for all the child/sub plans under
Append, and push the inserts to the Gather nodes wherever possible.

> +   if (!allow && tuple_cost_flags && gather_exists)
>
> As the above code shows, gather_exists is only checked when allow is false.

Yes, if at least one gather node exists under the Append for which the
planner would have ignored the tuple cost, and now if we don't allow
parallel inserts, we should assert that the parallelism is not picked
because of wrong parallel tuple cost enforcement.

> +            * We set the flag for two cases when there is no parent path will
> +            * be created(such as : limit,sort,distinct...):
>
> Please correct the grammar : there are two verbs following 'when'

Done.

> For set_append_rel_size:
>
> +           {
> +               root->parse->CTASParallelInsInfo |=
> +                                       CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND;
> +           }
> +       }
> +
> +       if (root->parse->CTASParallelInsInfo &
> +           CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND)
> +       {
> +           root->parse->CTASParallelInsInfo &=
> +                                       ~CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND;
>
> In the if block for childrel->rtekind == RTE_SUBQUERY, CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND maybe set. Why is it
clearedimmediately after ?
 

Thanks for pointing that out. It's a miss, intention is to reset it
after set_rel_size(). Corrected in the v17 patch.

> +   /* Set to this in case tuple cost needs to be ignored for Append cases. */
> +   CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND = 1 << 3
>
> Since each CTAS_PARALLEL_INS_ flag is a bit, maybe it's better to use 'turn on' or similar term in the comment.
Because'set to' normally means assignment.
 

Done.

All the above comments are addressed in the v17 patch set posted
upthread. Please have a look.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Parallel Inserts in CREATE TABLE AS
Next
From: Bharath Rupireddy
Date:
Subject: Re: Parallel Inserts in CREATE TABLE AS