Re: Removing INNER JOINs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Removing INNER JOINs
Date
Msg-id 18591.1417626481@sss.pgh.pa.us
In response to Re: Removing INNER JOINs  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Removing INNER JOINs  (Robert Haas <robertmhaas@gmail.com>)
Re: Removing INNER JOINs  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Dec 3, 2014 at 11:23 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Well, the planstate tree is what determines the execution, right? I
>> don't see what would stop us from doing something like replacing:
>> PlanState *
>> ExecInitNode(Plan *node, EState *estate, int eflags)
>> {
>> ...
>>         case T_NestLoop:
>>             result = (PlanState *) ExecInitNestLoop((NestLoop *) node,
>>                                                     estate, eflags);
>> by
>>         case T_NestLoop:
>>             if (JoinCanBeSkipped(node))
>>                 result = NonSkippedJoinNode(node);
>>             else
>>                 result = (PlanState *) ExecInitNestLoop((NestLoop *) node,
>>                                                         estate, eflags);
>> 
>> Where JoinCanBeSkipped() and NonSkippedJoinNode() contain the logic
>> from David's early patch where he put the logic entirely into the actual
>> execution phase.

> Yeah, maybe.  I think there's sort of a coding principle that the plan
> and planstate trees should match up one-to-one, but it's possible that
> nothing breaks if they don't, or that I've misunderstood the coding
> rule in the first instance.

Far better would be what I mentioned upthread: an explicit switch node
in the plan tree, analogous to the existing AlternativeSubPlan structure.

    ChooseJoinSubPlan
      -> plan tree requiring all tables to be joined
      -> plan tree not requiring all tables to be joined

This allows sensible display by EXPLAIN and avoids the need for the core
executor code to be dirtied with implementation of the precise switch
rule: all that logic goes into the ChooseJoinSubPlan plan node code.
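
To make the shape of such a switch node concrete, here is a rough sketch in plain C. Nothing in it is PostgreSQL code: SketchPlan, the ChooseJoinSubPlan struct layout, choose_join_subplan() and the removal_is_safe flag are all names invented for illustration, and the flag stands in for whatever run-time test would decide that the removed joins really are unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a plan subtree; not PostgreSQL's Plan struct. */
typedef struct SketchPlan
{
    const char *label;          /* what EXPLAIN might show for this subtree */
} SketchPlan;

/* Hypothetical switch node that carries both alternatives. */
typedef struct ChooseJoinSubPlan
{
    SketchPlan *full_plan;      /* joins every table; always valid */
    SketchPlan *reduced_plan;   /* omits removable joins; may be NULL */
} ChooseJoinSubPlan;

/*
 * Pick the subtree to run.  The switch rule is confined to this one
 * function, so generic executor dispatch code never needs to know it.
 * removal_is_safe is an assumed input: the result of some run-time check
 * that the sketch does not try to model.
 */
SketchPlan *
choose_join_subplan(const ChooseJoinSubPlan *node, bool removal_is_safe)
{
    assert(node != NULL && node->full_plan != NULL);

    if (removal_is_safe && node->reduced_plan != NULL)
        return node->reduced_plan;

    /* Fall back to the plan that joins everything. */
    return node->full_plan;
}
```

Because both subtrees hang off one node, EXPLAIN could print them side by side, and the full plan remains available whenever the check fails.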

I would envision the planner starting out generating the first subplan
(without the optimization), but as it goes along, noting whether there
are any opportunities for join removal.  At the end, if it found that
there were such opportunities, re-plan assuming that removal is possible.
Then stick a switch node on top.

This would give optimal plans for both cases, and it would avoid the need
for lots of extra planner cycles when the optimization can't be applied
... except for one small detail, which is that the planner has a bad habit
of scribbling on its own input.  I'm not sure how much cleanup work would
be needed before that "re-plan" operation could happen as easily as is
suggested above.  But in principle this could be made to work.
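
The two-pass flow described above might be sketched roughly as follows. This is a toy model, not planner code: SketchQuery, SketchResult, plan_once() and plan_with_switch() are hypothetical names, and the "scribbled" flag only imitates the planner modifying its input. Giving each pass its own copy of the query is an assumption made for the sketch; it stands in for whatever cleanup would actually be needed before a re-plan is possible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a query tree. */
typedef struct SketchQuery
{
    int  nrels;         /* relations named in the query */
    int  nremovable;    /* joins that could be dropped if removal is valid */
    bool scribbled;     /* set once a planning pass has modified the struct */
} SketchQuery;

/* Hypothetical summary of what the planner produced. */
typedef struct SketchResult
{
    bool has_switch;    /* true if a switch node sits on top */
    int  full_rels;     /* relations joined by the unoptimized subplan */
    int  reduced_rels;  /* relations joined by the optimized subplan, else 0 */
} SketchResult;

/*
 * One planning pass.  It modifies its input, so the same struct must not
 * be planned twice.  Returns the number of relations the plan joins and
 * optionally reports whether any join removal opportunity was noticed.
 */
int
plan_once(SketchQuery *q, bool assume_removal, bool *saw_opportunity)
{
    assert(!q->scribbled);
    q->scribbled = true;

    if (saw_opportunity != NULL)
        *saw_opportunity = (q->nremovable > 0);

    return assume_removal ? q->nrels - q->nremovable : q->nrels;
}

/* Plan without the optimization first; re-plan only if it could apply. */
SketchResult
plan_with_switch(const SketchQuery *input)
{
    SketchResult result = {false, 0, 0};
    SketchQuery  first = *input;    /* private copy for the first pass */
    bool         saw_opportunity = false;

    result.full_rels = plan_once(&first, false, &saw_opportunity);

    if (saw_opportunity)
    {
        SketchQuery second = *input;    /* fresh copy for the re-plan */

        result.reduced_rels = plan_once(&second, true, NULL);
        result.has_switch = true;
    }

    return result;
}
```

When no opportunity is noticed, the second pass never runs, which is the "no extra planner cycles" property; the cost of the copies is the part the sketch glosses over.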
        regards, tom lane


