Re: Removing INNER JOINs - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Removing INNER JOINs
Msg-id 20141203162317.GA27550@alap3.anarazel.de
In response to Re: Removing INNER JOINs  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Removing INNER JOINs  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2014-12-03 11:11:49 -0500, Robert Haas wrote:
> On Wed, Dec 3, 2014 at 10:56 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-12-03 10:51:19 -0500, Robert Haas wrote:
> >> On Wed, Dec 3, 2014 at 4:29 AM, David Rowley <dgrowleyml@gmail.com> wrote:
> >> > *** Method 1: Removing Inner Joins at planning time:
> >> >
> >> > *** Method 2: Marking scans as possibly skippable during planning, and
> >> > skipping joins at execution (Andres' method)
> >> >
> >> > *** Method 3: Marking scans as possibly skippable during planning and
> >> > removing redundant join nodes at executor startup (Simon's method)
> >> [....]
> >> > a. can we invoke the planner during executor init?
> >>
> >> I'm pretty sure that we can't safely invoke the planner during
> >> executor startup, and that doing surgery on the plan tree (option #3)
> >> is unsafe also.  I'm pretty clear why the latter is unsafe: it might
> >> be a copy of a data structure that's going to be reused.
> >
> > We already have a transformation between the plan and execution
> > tree.
> 
> We do?
> 
> I think what we have is a plan tree, which is potentially stored in a
> plan cache someplace and thus must be read-only, and a planstate tree,
> which contains the stuff that is for this specific execution.  There's
> probably some freedom to do exciting things in the planstate nodes,
> but I don't think you can tinker with the plan itself.

Well, the planstate tree is what determines the execution, right? I
don't see what would stop us from doing something like replacing:
PlanState *
ExecInitNode(Plan *node, EState *estate, int eflags)
{
...
        case T_NestLoop:
            result = (PlanState *) ExecInitNestLoop((NestLoop *) node,
                                                    estate, eflags);

by

        case T_NestLoop:
            if (JoinCanBeSkipped(node))
                result = NonSkippedJoinNode(node);
            else
                result = (PlanState *) ExecInitNestLoop((NestLoop *) node,
                                                        estate, eflags);

Where JoinCanBeSkipped() and NonSkippedJoinNode() contain the logic
from David's early patch where he put the logic entirely into the actual
execution phase.

We'd probably want to move the join nodes into a separate ExecInitJoin()
function and do the JoinCanBeSkipped() and NonSkippedJoinNode() handling
in the generic code.
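
To make the idea concrete, here is a toy sketch of what such a generic
ExecInitJoin() might look like. It is not PostgreSQL code: the structs are
stripped-down stand-ins, the inner_skippable and skip_conditions_hold fields
are hypothetical, JoinCanBeSkipped() takes an extra estate argument purely
for illustration, and the "skipped" case is modelled by initializing only
the outer child (one possible reading of NonSkippedJoinNode()). The point it
tries to show is that the Plan tree stays read-only while the decision is
made when the PlanState tree is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the real node types (assumptions, not the real structs). */
typedef enum { T_SeqScan, T_NestLoop } NodeTag;

typedef struct Plan {
    NodeTag            type;
    const struct Plan *lefttree;        /* outer side */
    const struct Plan *righttree;       /* inner side */
    bool               inner_skippable; /* hypothetical flag set at planning time */
} Plan;

typedef struct PlanState {
    const Plan       *plan;             /* read-only pointer back to the plan node */
    struct PlanState *lefttree;
    struct PlanState *righttree;
} PlanState;

typedef struct EState {
    bool skip_conditions_hold;          /* hypothetical per-execution check result */
} EState;

PlanState *ExecInitNode(const Plan *node, EState *estate);

/* Allocate a zeroed per-execution state node for the given plan node. */
PlanState *make_state(const Plan *plan)
{
    PlanState *ps = calloc(1, sizeof(PlanState));
    if (ps == NULL)
        abort();
    ps->plan = plan;
    return ps;
}

/* Hypothetical: the planner marked the join, and the runtime conditions hold. */
bool JoinCanBeSkipped(const Plan *node, const EState *estate)
{
    return node->inner_skippable && estate->skip_conditions_hold;
}

/* Generic join initialization: decide here whether a join node is built at all. */
PlanState *ExecInitJoin(const Plan *node, EState *estate)
{
    if (JoinCanBeSkipped(node, estate))
        return ExecInitNode(node->lefttree, estate);   /* no join state node created */

    PlanState *ps = make_state(node);
    ps->lefttree = ExecInitNode(node->lefttree, estate);
    ps->righttree = ExecInitNode(node->righttree, estate);
    return ps;
}

/* Dispatch on node type; the plan tree is only read, never modified. */
PlanState *ExecInitNode(const Plan *node, EState *estate)
{
    assert(node != NULL);
    switch (node->type)
    {
        case T_NestLoop:
            return ExecInitJoin(node, estate);
        case T_SeqScan:
            return make_state(node);
    }
    return NULL;
}
```

Because the plan nodes are taken as const pointers, the same cached plan
could be initialized once with the join skipped and once with it kept,
depending only on the per-execution state.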

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


