Home > mailing lists

Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From	Luc Vlaming
Subject	Re: Parallel Inserts in CREATE TABLE AS
Date	January 12, 2021 11:30:26
Msg-id	af416701-d8b2-7a2e-22aa-a7b7f4179e68@swarm64.com Whole thread Raw
In response to	Re: Parallel Inserts in CREATE TABLE AS (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List	pgsql-hackers

Tree view

On 06-01-2021 09:32, Bharath Rupireddy wrote:
> On Tue, Jan 5, 2021 at 1:25 PM Luc Vlaming <luc@swarm64.com> wrote:
>>>>>> wrt v18-0002....patch:
>>>>>>
>>>>>> It looks like this introduces a state machine that goes like:
>>>>>> - starts at CTAS_PARALLEL_INS_UNDEF
>>>>>> - possibly moves to CTAS_PARALLEL_INS_SELECT
>>>>>> - CTAS_PARALLEL_INS_TUP_COST_CAN_IGN can be added
>>>>>> - if both were added at some stage, we can go to
>>>>>> CTAS_PARALLEL_INS_TUP_COST_IGNORED and ignore the costs
>>>>>>
>>>>>> what i'm wondering is why you opted to put logic around
>>>>>> generate_useful_gather_paths and in cost_gather when to me it seems more
>>>>>> logical to put it in create_gather_path? i'm probably missing something
>>>>>> there?
>>>>>
>>>>> IMO, The reason is we want to make sure we only ignore the cost when Gather is the top node.
>>>>> And it seems the generate_useful_gather_paths called in apply_scanjoin_target_to_paths is the right place which
canonly create top node Gather.
 
>>>>> So we change the flag in apply_scanjoin_target_to_paths around generate_useful_gather_paths to identify the top
node.
>>>
>>> Right. We wanted to ignore parallel tuple cost for only the upper Gather path.
>>>
>>>> I was wondering actually if we need the state machine. Reason is that as
>>>> AFAICS the code could be placed in create_gather_path, where you can
>>>> also check if it is a top gather node, whether the dest receiver is the
>>>> right type, etc? To me that seems like a nicer solution as its makes
>>>> that all logic that decides whether or not a parallel CTAS is valid is
>>>> in a single place instead of distributed over various places.
>>>
>>> IMO, we can't determine the fact that we are going to generate the top
>>> Gather path in create_gather_path. To decide on whether or not the top
>>> Gather path generation, I think it's not only required to check the
>>> root->query_level == 1 but we also need to rely on from where
>>> generate_useful_gather_paths gets called. For instance, for
>>> query_level 1, generate_useful_gather_paths gets called from 2 places
>>> in apply_scanjoin_target_to_paths. Likewise, create_gather_path also
>>> gets called from many places. IMO, the current way i.e. setting flag
>>> it in apply_scanjoin_target_to_paths and ignoring based on that in
>>> cost_gather seems safe.
>>>
>>> I may be wrong. Thoughts?
>>
>> So the way I understand it the requirements are:
>> - it needs to be the top-most gather
>> - it should not do anything with the rows after the gather node as this
>> would make the parallel inserts conceptually invalid.
> 
> Right.
> 
>> Right now we're trying to judge what might be added on-top that could
>> change the rows by inspecting all parts of the root object that would
>> cause anything to be added, and add a little statemachine to track the
>> state of that knowledge. To me this has the downside that the list in
>> HAS_PARENT_PATH_GENERATING_CLAUSE has to be exhaustive, and we need to
>> make sure it stays up-to-date, which could result in regressions if not
>> tracked carefully.
> 
> Right. Any new clause that will be added which generates an upper path
> in grouping_planner after apply_scanjoin_target_to_paths also needs to
> be added to HAS_PARENT_PATH_GENERATING_CLAUSE. Otherwise, we might
> ignore the parallel tuple cost because of which the parallel plan may
> be chosen and we go for parallel inserts only when the top node is
> Gather. I don't think any new clause that will be added generates a
> new upper Gather node in grouping_planner after
> apply_scanjoin_target_to_paths.
> 
>> Personally I would therefore go for a design which is safe in the sense
>> that regressions are not as easily introduced. IMHO that could be done
>> by inspecting the planned query afterwards, and then judging whether or
>> not the parallel inserts are actually the right thing to do.
> 
> The 0001 patch does that. It doesn't have any influence on the planner
> for parallel tuple cost calculation, it just looks at the generated
> plan and decides on parallel inserts. Having said that, we might miss
> parallel plans even though we know that there will not be tuples
> transferred from workers to Gather. So, 0002 patch adds the code for
> influencing the planner for parallel tuple cost.
> 

Ok. Thanks for the explanation and sorry for the confusion.

>> Another way to create more safety against regressions would be to add an
>> assert upon execution of the query that if we do parallel inserts that
>> only a subset of allowed nodes exists above the gather node.
> 
> Yes, we already do this. Please have a look at
> SetParallelInsertState() in the 0002 patch. The idea is that in any
> case, if the planner ignored the tuple cost, but we later not allow
> parallel inserts either due to the upper node is not Gather or Gather
> with projections. The assertion fails. So, in case any new parent path
> generating clause is added (apart from the ones that are there in
> HAS_PARENT_PATH_GENERATING_CLAUSE) and we ignore the tuple cost, then
> this Assert will catch it. Currently, I couldn't find any assertion
> failures in my debug build with make check and make check world.
> 

Ok. Seems I missed that assert when reviewing.

> +    else
> +    {
> +        /*
> +         * Upper Gather node has projections, so parallel insertions are not
> +         * allowed.
> +         */
> +        if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +            ((DR_intorel *) dest)->is_parallel = false;
> +
> +        gstate->dest = NULL;
> +
> +        /*
> +         * Before returning, ensure that we have not done wrong parallel tuple
> +         * cost enforcement in the planner. Main reason for this assertion is
> +         * to check if we enforced the planner to ignore the parallel tuple
> +         * cost (with the intention of choosing parallel inserts) due to which
> +         * the parallel plan may have been chosen, but we do not allow the
> +         * parallel inserts now.
> +         *
> +         * If we have correctly ignored parallel tuple cost in the planner
> +         * while creating Gather path, then this assertion failure should not
> +         * occur. In case it occurs, that means the planner may have chosen
> +         * this parallel plan because of our wrong enforcement. So let's try to
> +         * catch that here.
> +         */
> +        Assert(tuple_cost_opts && !(*tuple_cost_opts &
> +               PARALLEL_INSERT_TUP_COST_IGNORED));
> +    }
> 
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
> 

Kind regards,
Luc

pgsql-hackers by date:

From: Amit Kapila
Date: 12 January 2021, 11:13:32
Subject: Re: [Bug Fix] Logical replication on partition table is very slow and CPU is 99%

From: Bharath Rupireddy
Date: 12 January 2021, 12:33:44
Subject: Re: logical replication worker accesses catalogs in error context callback

Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

Previous

Next