Re: GEQO vs join order restrictions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: GEQO vs join order restrictions
Date
Msg-id 603c8f070907182128r5e98aca9g44ba02a8103bb9f3@mail.gmail.com
Whole thread Raw
In response to Re: GEQO vs join order restrictions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Jul 18, 2009 at 12:49 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> We could refrain from collapsing the sub-problem during joinlist
> formation.  But the trouble with that is it creates a "hard" join order
> restriction.  Most of the restrictions are "soft" to some extent, ie,
> you can do some rearrangements but not others.  It might be worth
> looking at though; in the cases where there is no chance of a
> rearrangement, it would save cycles for either regular or GEQO planning.

This seems very promising, although the proof of the pudding is in the
eating.  As a slight refinement of this idea, it's not exactly that
the join order is totally fixed, but rather that in a large join nest
there may be groups of tables that can be planned as independent
subproblems.  The obvious example of this is something like:

(...lots of stuff...) FULL JOIN (...lots of things...)

...where there isn't any point in considering any joins between stuff
and things, regardless of whether you are using GEQO or the standard
planner (and maybe we handle this case already?).   My brain is a
little foggy at the moment, but I think this would also apply to
Andres' example query, because I believe with anything of the form...

A LEFT JOIN (B1 INNER JOIN B2 INNER JOIN B3 ... INNER JOIN Bn) LEFT JOIN C

...the set of all Bi can be planned as an independent subproblem and
then joined to A either before or after C.  But (I think):

A LEFT JOIN (B1 INNER JOIN B2 LEFT JOIN B3) LEFT JOIN C
= A LEFT JOIN (B1 INNER JOIN B2) LEFT JOIN C LEFT JOIN B3

Here {B1, B2} is an independent subproblem, but {B1, B2, B3} is not.
Still, given that the planning time for the standard planner at least
can grow exponentially in the number of relations, even a small
reduction in the problem size is potentially a big win.

...Robert


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Using results from INSERT ... RETURNING
Next
From: Robert Haas
Date:
Subject: Re: Sampling profiler updated