Re: TPC-H Q20 from 1 hour to 19 hours! - Mailing list pgsql-hackers

From Robert Haas
Subject Re: TPC-H Q20 from 1 hour to 19 hours!
Date
Msg-id CA+Tgmobqu8zz+ae0JSrT=bWsLdDkZqGFo=qbrvtM7t7rVTU9yA@mail.gmail.com
In response to Re: TPC-H Q20 from 1 hour to 19 hours!  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: TPC-H Q20 from 1 hour to 19 hours!  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Mar 29, 2017 at 8:00 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> What is however strange is that changing max_parallel_workers_per_gather
> affects row estimates *above* the Gather node. That seems a bit, um,
> suspicious, no? See the parallel-estimates.log.

Thanks for looking at this!  Comparing the parallel plan vs. the
non-parallel plan:

part: parallel rows (after Gather) 20202, non-parallel rows 20202
partsupp: parallel rows 18, non-parallel rows 18
part-partsupp join: parallel rows 88988, non-parallel rows 355951
lineitem: parallel rows 59986112, non-parallel rows 59986112
lineitem after aggregation: parallel rows 5998611, non-parallel rows 5998611
final join: parallel rows 131, non-parallel rows 524

I agree with you that that looks mighty suspicious.  Both the
part-partsupp join and the final join have 4x as many estimated rows
in the non-parallel plan as in the parallel plan (exactly 4x for the
final join, and 4x to within rounding for the part-partsupp join), and
it just so happens that the parallel divisor here will be 4.
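
As a quick sanity check on that 4x observation, here is a minimal
sketch that recomputes the non-parallel/parallel ratio for each line
of the comparison above.  The numbers are copied from the list; the
dictionary and the ratios() helper are just illustrative names.

```python
# Row estimates copied from the comparison above: (parallel, non-parallel).
estimates = {
    "part": (20202, 20202),
    "partsupp": (18, 18),
    "part-partsupp join": (88988, 355951),
    "lineitem": (59986112, 59986112),
    "lineitem after aggregation": (5998611, 5998611),
    "final join": (131, 524),
}


def ratios(table):
    """Return non-parallel / parallel row estimate for each plan node."""
    return {name: nonpar / par for name, (par, nonpar) in table.items()}


if __name__ == "__main__":
    for name, ratio in ratios(estimates).items():
        print(f"{name}: {ratio:.3f}")
```

Only the two joins come out at (about) 4; the scans and the
aggregation come out at 1, which is what you'd expect if only the
nodes above the Gather are affected.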

Hmm... it looks like the parallel_workers value from the Gather node
is being erroneously propagated up to the higher levels of the plan
tree.  Wow.  Somehow, Gather Merge managed to get the logic correct
here, but Gather is totally wrong.  Argh.  Attached is a draft patch,
which I haven't really tested beyond checking that it passes 'make
check'.
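
To make the failure mode concrete, here is a toy model in Python.  It
is not the planner code and it does not describe the attached patch:
the Path class, gather() and rows_above() are hypothetical names, the
worker count of 4 is an assumption, and parallel_divisor() encodes the
divisor heuristic as I remember it (each worker counts as 1, and the
leader contributes 1 - 0.3 * workers when that is positive), so treat
that as an assumption too.

```python
from dataclasses import dataclass


def parallel_divisor(workers: int) -> float:
    """Assumed heuristic: workers, plus a shrinking leader contribution."""
    if workers <= 0:
        return 1.0  # not a parallel path, nothing to divide by
    leader = 1.0 - 0.3 * workers
    return workers + (leader if leader > 0 else 0.0)


@dataclass
class Path:
    total_rows: float
    parallel_workers: int


def gather(child: Path, buggy: bool) -> Path:
    """Model a Gather node on top of a partial path.

    In the buggy variant the child's worker count leaks into the
    Gather path; in the other variant it is reset to 0.
    """
    return Path(child.total_rows, child.parallel_workers if buggy else 0)


def rows_above(path: Path, true_estimate: float) -> int:
    """Row estimate for a node above `path`, scaled by its divisor."""
    return round(true_estimate / parallel_divisor(path.parallel_workers))


if __name__ == "__main__":
    partial = Path(total_rows=20202, parallel_workers=4)  # assumed 4 workers
    for buggy in (True, False):
        g = gather(partial, buggy)
        print(buggy, rows_above(g, 355951), rows_above(g, 524))
```

With the worker count leaking through, the model gives 88988 and 131
for the two joins; with it reset, it gives 355951 and 524, matching
the parallel and non-parallel columns above.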

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment
