Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> Tom Lane wrote:
>>> It evidently thinks that most of the rows in the join of part and
>>> partsupp won't have any matching rows in lineitem, whereas on average
>>> there are about 7 matching rows apiece. So that's totally wacko, and
>>> it's not immediately obvious why. Could we see the pg_stats entries for
>>> part.p_partkey, partsupp.ps_partkey, partsupp.ps_suppkey,
>>> lineitem.l_partkey, lineitem.l_suppkey?
>
>> http://www.kaltenbrunner.cc/files/dbt3_beta1_statistics.txt
>
> OK, so we have 2 million parts and 100000 suppliers, and ANALYZE doesn't
> seem to have been too far off at estimating either of those numbers.
> I think the problem is that there are not very many suppliers for any
> particular part, and thus the condition "part match AND supplier match"
> is really not much more selective than "part match" alone. The planner
> is supposing that their selectivities are independent, which they
> aren't.
looks like there are exactly 4 suppliers for any given part so that
seems indeed like the problem :-(
>
> Offhand I don't see any good way to fix this without multi-column
> statistics, which is something that's certainly not happening for 8.2 :-(
too bad - however any idea on one of the other troubling querys (q21) I
mentioned in the mail I resent to the list (after the original one got
lost)?
http://archives.postgresql.org/pgsql-hackers/2006-09/msg02011.php
Stefan