Re: -HEAD planner issue wrt hash_joins on dbt3 ? - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: -HEAD planner issue wrt hash_joins on dbt3 ?
Date
Msg-id 4518C375.1090003@kaltenbrunner.cc
Whole thread Raw
In response to Re: -HEAD planner issue wrt hash_joins on dbt3 ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: -HEAD planner issue wrt hash_joins on dbt3 ?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> http://www.kaltenbrunner.cc/files/analyze_q9_beta1.txt
> 
> The next problem seems to be the drastic misestimation of this join
> size:
> 
>     ->  Nested Loop  (cost=0.00..6872092.36 rows=135 width=28) (actual time=94.762..14429291.129 rows=3554044
loops=1)
>           ->  Merge Join  (cost=0.00..519542.74 rows=449804 width=16) (actual time=48.197..49636.006 rows=474008
loops=1)
>                 Merge Cond: (part.p_partkey = partsupp.ps_partkey)
>                 ->  Index Scan using pk_part on part  (cost=0.00..105830.22 rows=112447 width=4) (actual
time=34.646..14381.644rows=118502 loops=1)
 
>                       Filter: ((p_name)::text ~~ '%ghost%'::text)
>                 ->  Index Scan using i_ps_partkey on partsupp  (cost=0.00..388943.05 rows=8000278 width=12) (actual
time=13.511..22659.364rows=7999685 loops=1)
 
>           ->  Index Scan using i_l_suppkey_partkey on lineitem  (cost=0.00..14.11 rows=1 width=24) (actual
time=4.415..30.310rows=7 loops=474008)
 
>                 Index Cond: ((partsupp.ps_partkey = lineitem.l_partkey) AND (partsupp.ps_suppkey =
lineitem.l_suppkey))
> 
> With a factor-of-25000 error in that rowcount estimate, it's amazing the
> plans aren't worse than they are.
> 
> It evidently thinks that most of the rows in the join of part and
> partsupp won't have any matching rows in lineitem, whereas on average
> there are about 7 matching rows apiece.  So that's totally wacko, and
> it's not immediately obvious why.  Could we see the pg_stats entries for
> part.p_partkey, partsupp.ps_partkey, partsupp.ps_suppkey,
> lineitem.l_partkey, lineitem.l_suppkey?


http://www.kaltenbrunner.cc/files/dbt3_beta1_statistics.txt


Stefan


pgsql-hackers by date:

Previous
From: ITAGAKI Takahiro
Date:
Subject: Re: Questions about guc units
Next
From: "Dave Page"
Date:
Subject: Re: Buildfarm alarms