Home > mailing lists

Re: -HEAD planner issue wrt hash_joins on dbt3 ? - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: -HEAD planner issue wrt hash_joins on dbt3 ?
Date	September 26, 2006 18:53:48
Msg-id	570.1159307608@sss.pgh.pa.us Whole thread Raw
In response to	Re: -HEAD planner issue wrt hash_joins on dbt3 ? (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Responses	Re: -HEAD planner issue wrt hash_joins on dbt3 ?
List	pgsql-hackers

Tree view

Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Tom Lane wrote:
>> It evidently thinks that most of the rows in the join of part and
>> partsupp won't have any matching rows in lineitem, whereas on average
>> there are about 7 matching rows apiece.  So that's totally wacko, and
>> it's not immediately obvious why.  Could we see the pg_stats entries for
>> part.p_partkey, partsupp.ps_partkey, partsupp.ps_suppkey,
>> lineitem.l_partkey, lineitem.l_suppkey?

> http://www.kaltenbrunner.cc/files/dbt3_beta1_statistics.txt

OK, so we have 2 million parts and 100000 suppliers, and ANALYZE doesn't
seem to have been too far off at estimating either of those numbers.
I think the problem is that there are not very many suppliers for any
particular part, and thus the condition "part match AND supplier match"
is really not much more selective than "part match" alone.  The planner
is supposing that their selectivities are independent, which they
aren't.

Offhand I don't see any good way to fix this without multi-column
statistics, which is something that's certainly not happening for 8.2 :-(
        regards, tom lane

pgsql-hackers by date:

From: Alvaro Herrera
Date: 26 September 2006, 18:39:26
Subject: Re: Constant changes (Re-Build)

From: Tom Lane
Date: 26 September 2006, 19:27:19
Subject: Isn't strdup.h useless code?

Re: -HEAD planner issue wrt hash_joins on dbt3 ? - Mailing list pgsql-hackers

Previous

Next