Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Parallel Inserts in CREATE TABLE AS |
Date | |
Msg-id | CALj2ACX+i-vtmdrS3kzLVJx=3TQLXryoF3zWWgZ-8Qvo3RJyFw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Inserts in CREATE TABLE AS (Zhihong Yu <zyu@yugabyte.com>) |
List | pgsql-hackers |
On Mon, Dec 28, 2020 at 1:16 AM Zhihong Yu <zyu@yugabyte.com> wrote: > For v16-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch: > > + if (ignore && > + (root->parse->CTASParallelInsInfo & > + CTAS_PARALLEL_INS_TUP_COST_CAN_IGN)) > > I wonder why CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is checked again in the above if since when ignore_parallel_tuple_costreturns true, CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is set already. Sometimes, we may set the flag CTAS_PARALLEL_INS_TUP_COST_CAN_IGN before generate_useful_gather_paths, but the generate_useful_gather_paths can return without reaching cost_gather where we reset. The generate_useful_gather_paths can return without reaching cost_gather, in following case if (rel->partial_pathlist == NIL) return; So, for such cases, I'm resetting it here. > + * In this function we only care Append and Gather nodes. > > 'care' -> 'care about' Done. > + for (int i = 0; i < aps->as_nplans; i++) > + { > + parallel |= PushDownCTASParallelInsertState(dest, > + aps->appendplans[i], > + gather_exists); > > It seems the loop termination condition can include parallel since we can come out of the loop once parallel is true. No, we can not come out of the for loop if parallel is true, because our intention there is to look for all the child/sub plans under Append, and push the inserts to the Gather nodes wherever possible. > + if (!allow && tuple_cost_flags && gather_exists) > > As the above code shows, gather_exists is only checked when allow is false. Yes, if at least one gather node exists under the Append for which the planner would have ignored the tuple cost, and now if we don't allow parallel inserts, we should assert that the parallelism is not picked because of wrong parallel tuple cost enforcement. > + * We set the flag for two cases when there is no parent path will > + * be created(such as : limit,sort,distinct...): > > Please correct the grammar : there are two verbs following 'when' Done. > For set_append_rel_size: > > + { > + root->parse->CTASParallelInsInfo |= > + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND; > + } > + } > + > + if (root->parse->CTASParallelInsInfo & > + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND) > + { > + root->parse->CTASParallelInsInfo &= > + ~CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND; > > In the if block for childrel->rtekind == RTE_SUBQUERY, CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND maybe set. Why is it clearedimmediately after ? Thanks for pointing that out. It's a miss, intention is to reset it after set_rel_size(). Corrected in the v17 patch. > + /* Set to this in case tuple cost needs to be ignored for Append cases. */ > + CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND = 1 << 3 > > Since each CTAS_PARALLEL_INS_ flag is a bit, maybe it's better to use 'turn on' or similar term in the comment. Because'set to' normally means assignment. Done. All the above comments are addressed in the v17 patch set posted upthread. Please have a look. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: