
From Greg Nancarrow
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAJcOf-f2N3+hf7=L0qxnM9hhJK=MQ7ZoZ24NoZfgOU+SBZXuOw@mail.gmail.com
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Oct 9, 2020 at 6:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> A couple more observations:
>
> +       pathnode->path.parallel_aware = parallel_workers > 0 ? true : false;
>
> Hmm, I think this may be bogus window dressing only affecting EXPLAIN.
> If you change it to assign false always, it works just the same,
> except EXPLAIN says:
>
>  Gather  (cost=15428.00..16101.14 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Insert on s  (cost=15428.00..16101.14 rows=208334 width=4)
>          ->  Parallel Hash Join  (cost=15428.00..32202.28 rows=416667 width=4)
>
> ... instead of:
>
>  Gather  (cost=15428.00..16101.14 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Parallel Insert on s  (cost=15428.00..16101.14 rows=208334 width=4)
>          ->  Parallel Hash Join  (cost=15428.00..32202.28 rows=416667 width=4)
>
> AFAICS it's not parallel-aware, it just happens to be running in
> parallel with a partial input and partial output (and, in this case,
> effects in terms of writes).  Parallel-aware is our term for nodes that
> actually know they are running in parallel and do some special
> coordination with their twins in other processes.
>

Ah, thanks, I see the distinction now. I'll fix that and restore
parallel_aware=false for the ModifyTable node.
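
To make sure we're talking about the same change, this is roughly
what I have in mind (a sketch only; the surrounding code in the
path-creation function is elided):

    /*
     * ModifyTable is parallel-oblivious: it may run in parallel with
     * a partial input and output, but it does no coordination with
     * its twins in other processes, so it must not be marked
     * parallel-aware.
     */
    pathnode->path.parallel_aware = false;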

> The estimated row count also looks wrong; at a guess, the parallel
> divisor is applied twice.  Let me try that with
> parallel_leader_participation=off (which disables some funky maths in
> the row estimation and makes it straight division by number of
> processes):
>
>  Gather  (cost=17629.00..18645.50 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Insert on s  (cost=17629.00..18645.50 rows=250000 width=4)
>          ->  Parallel Hash Join  (cost=17629.00..37291.00 rows=500000 width=4)
> [more nodes omitted]
>
> Yeah, that was a join that spat out a million rows, and we correctly
> estimated 500k per process, and then Insert (still with my hack to
> turn off the bogus "Parallel" display in this case, but it doesn't
> affect the estimation) estimated 250k per process, which is wrong.

Thanks, I did suspect the current costing was wrong for ModifyTable
(the workers>0 case), as I'd thrown it in (moving the existing
costing code into costsize.c) without much checking or thought; it
was on my TODO list of things to verify. At least I created a
placeholder for it. Looks like I've applied the parallel divisor a
second time, not allowing for the one already applied to the
underlying query, as you said.
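
In other words, the bug is presumably something like this (an
illustrative sketch, not the actual patch code; get_parallel_divisor()
is the helper costsize.c already uses):

    /* wrong: subpath->rows is already a partial (per-worker)
     * estimate, so dividing again gives 500000 / 2 = 250000 in
     * your example */
    pathnode->path.rows =
        subpath->rows / get_parallel_divisor(&pathnode->path);

    /* right: just propagate the partial row count from the subpath */
    pathnode->path.rows = subpath->rows;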

Speaking of costing, I'm not sure I really agree with the current
costing of a Gather node. Considering a simple Parallel SeqScan case,
the "run_cost += parallel_tuple_cost * path->path.rows;" part of the
Gather cost completely drowns out the other path costs whenever a
large number of rows is involved (at least with the default
parallel-related GUC values), so a Parallel SeqScan never ends up as
the cheapest path. This linear relationship between the cost and the
row count doesn't make sense to me. Surely, beyond a certain number
of rows, the overhead of launching workers is outweighed by the
benefit of their parallel work, so the more rows there are, the more
likely a Parallel SeqScan is to win. That suggests something like a
logarithmic formula (or similar) would match reality better than what
we have now. Am I wrong about this? Every time I use the default GUC
values, the planner refuses to generate a parallel plan. Lowering the
parallel-related GUCs such as parallel_tuple_cost (which I normally
do for testing) influences it, of course, but the linear relationship
still seems wrong.
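
To put numbers on that, with the default parallel_tuple_cost of 0.1
(the figures below are illustrative only):

    /* the per-tuple term in cost_gather() (costsize.c): */
    run_cost += parallel_tuple_cost * path->path.rows;

    /*
     * For a Parallel SeqScan returning 1,000,000 rows through the
     * Gather, that's 0.1 * 1000000 = 100000 cost units for tuple
     * transfer alone -- a charge that grows linearly with the row
     * count and quickly swamps whatever the workers saved on the
     * scan itself.
     */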

Regards,
Greg Nancarrow
Fujitsu Australia


