On Wed, Jul 08, 2009 at 10:28:02AM -0500, Robert Haas wrote:
> On Jul 8, 2009, at 8:43 AM, Noah Misch <noah@leadboat.com> wrote:
>> The expontential factor seems smaller for real queries. I have a query of
>> sixteen joins that takes 71s to plan deterministically; it looks like this:
>>
>> SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
>> JOIN t t0 ON fact.key = t.key AND t.x = MCV0
>> LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
>> JOIN t t2 ON fact.key = t.key AND t.x = MCV2
>> LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
>> LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
>> LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
>> LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
>> LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
>
> Very interesting! I am guessing that the problem here is that all the
> inner joins commute, as do all the left joins, so there are a lot of
> possibilities to explore. I also suspect that the actual join order
> doesn't matter much, so it's a good candidate for GEQO. Even if you had
> some restriction clauses against your dimension/t tables, that would
> probably just mean that you want to do those joins first, and the rest
> afterwards, which still seems like it ought to be OK for GEQO.
>
> But, in many queries this size, the join order is more constrained.
> Some of the later joins depend on the tables added by the earlier ones,
> rather than all referring back to the base table. Is there some way we
> could estimate the number of join offerings we'll have to consider
> relatively cheaply? That might be a better metric than # of tables.
Observing the pathological case taking plan time proportional to 4^N, apply
geqo_threshold as "use GEQO when comparing more than geqo_threshold * 4^N join
order possibilities"? I'm not sure whether that figure is available (early
enough) to factor into the decision. Such a change would probably imply a lower
default geqo_threshold, around 9 to 11. The number of typical queries subject
to GEQO would nonetheless decrease.
nm