Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From: Greg Nancarrow
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date:
Msg-id: CAJcOf-f2N3+hf7=L0qxnM9hhJK=MQ7ZoZ24NoZfgOU+SBZXuOw@mail.gmail.com
In response to: Re: Parallel INSERT (INTO ... SELECT ...) (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers

On Fri, Oct 9, 2020 at 6:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> A couple more observations:
>
> + pathnode->path.parallel_aware = parallel_workers > 0 ? true : false;
>
> Hmm, I think this may be bogus window dressing only affecting EXPLAIN.
> If you change it to assign false always, it works just the same,
> except EXPLAIN says:
>
>  Gather  (cost=15428.00..16101.14 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Insert on s  (cost=15428.00..16101.14 rows=208334 width=4)
>          ->  Parallel Hash Join  (cost=15428.00..32202.28 rows=416667 width=4)
>
> ... instead of:
>
>  Gather  (cost=15428.00..16101.14 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Parallel Insert on s  (cost=15428.00..16101.14 rows=208334 width=4)
>          ->  Parallel Hash Join  (cost=15428.00..32202.28 rows=416667 width=4)
>
> AFAICS it's not parallel-aware, it just happens to be running in
> parallel with a partial input and partial output (and in this case,
> effects in terms of writes). Parallel-aware is our term for nodes that
> actually know they are running in parallel and do some special
> coordination with their twins in other processes.

Ah, thanks, I see the distinction now. I'll fix that, restoring
parallel_aware=false for the ModifyTable node (a one-line sketch is at
the end of this mail).

> The estimated row count also looks wrong; at a guess, the parallel
> divisor is applied twice. Let me try that with
> parallel_leader_participation=off (which disables some funky maths in
> the row estimation and makes it straight division by number of
> processes):
>
>  Gather  (cost=17629.00..18645.50 rows=1000000 width=4)
>    Workers Planned: 2
>    ->  Insert on s  (cost=17629.00..18645.50 rows=250000 width=4)
>          ->  Parallel Hash Join  (cost=17629.00..37291.00 rows=500000 width=4)
>                [more nodes omitted]
>
> Yeah, that was a join that spat out a million rows, and we correctly
> estimated 500k per process, and then Insert (still with my hack to
> turn off the bogus "Parallel" display in this case, but it doesn't
> affect the estimation) estimated 250k per process, which is wrong.

Thanks. I did suspect the current costing was wrong for ModifyTable in
the workers>0 case: I'd thrown it in (moving the existing costing code
into costsize.c) without much checking or thought, and it was on my
TODO list of things to verify. At least I created a placeholder for it.
It looks like I've applied the parallel divisor a second time, without
accounting for the one already applied to the underlying query, just as
you said (worked through in a sketch at the end of this mail).

Speaking of costing, I'm not sure I really agree with the current
costing of a Gather node. Considering just a simple Parallel SeqScan
case, the "run_cost += parallel_tuple_cost * path->path.rows;" part of
the Gather cost completely drowns out the other path costs whenever a
large number of rows is involved (at least with the default
parallel-related GUC values), so the Parallel SeqScan path is never the
cheapest. This linear relationship between the cost, the row count and
parallel_tuple_cost doesn't make sense to me. Surely, beyond a certain
number of rows, the overhead of launching workers is outweighed by the
benefit of their parallel work, so the more rows there are, the more
likely a Parallel SeqScan is to win. That suggests something like a
logarithmic formula (or similar) would match reality better than what
we have now. Am I wrong about this? Every time I use the default GUC
values, the planner refuses to generate a parallel plan; lowering
parallel-related GUCs such as parallel_tuple_cost (which I normally do
for testing) influences it, of course, but the linear relationship
still seems wrong.
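On the first point, the fix should just be a one-liner; a minimal
sketch (the surrounding context is whichever pathnode.c routine the
patch builds the ModifyTable path in, so treat it as illustrative
only):

    /*
     * ModifyTable isn't parallel-aware: it does no coordination with
     * its twins in other processes, it merely runs in parallel on a
     * partial input.  So leave parallel_aware false regardless of the
     * worker count.
     */
    pathnode->path.parallel_aware = false;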
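On the row-count point, to make the double division concrete with the
numbers from your plan (a throwaway standalone sketch, not planner
code):

    #include <stdio.h>

    int main(void)
    {
        double total_rows = 1000000.0; /* total rows emitted by the join */
        double divisor = 2.0;          /* 2 workers, parallel_leader_participation=off */

        double join_rows = total_rows / divisor;  /* 500000: correct per-worker estimate */
        double insert_rows = join_rows / divisor; /* 250000: divisor applied a second time */

        printf("join, per worker:             %.0f\n", join_rows);
        printf("insert, per worker (buggy):   %.0f\n", insert_rows);
        printf("insert, per worker (correct): %.0f\n", join_rows);
        return 0;
    }

The ModifyTable costing should take the input path's row count as-is,
since that is already a partial (per-worker) estimate.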
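And to illustrate the Gather point, here's a quick comparison of the
current linear term against one possible sub-linear shape (the
log-based variant is purely illustrative, not a concrete proposal):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        const double parallel_tuple_cost = 0.1; /* default GUC value */
        double rows;

        for (rows = 1e4; rows <= 1e8; rows *= 100)
        {
            double linear = parallel_tuple_cost * rows;  /* as cost_gather() does today */
            double sublinear = parallel_tuple_cost * rows / log2(rows); /* one possible shape */

            printf("rows=%9.0f  linear=%10.0f  sublinear=%8.0f\n",
                   rows, linear, sublinear);
        }
        return 0;
    }

With the default parallel_tuple_cost of 0.1, the linear term alone adds
100000 to the cost of transferring a million rows through Gather, which
easily swamps whatever the workers saved on the scan itself.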
Regards,
Greg Nancarrow
Fujitsu Australia