Re: WIP: Join push-down for foreign tables - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WIP: Join push-down for foreign tables
Date
Msg-id 4ED8CC12.6000700@enterprisedb.com
In response to Re: WIP: Join push-down for foreign tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP: Join push-down for foreign tables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 17.11.2011 17:24, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> When the FDW recognizes it's being asked to join a ForeignJoinPath and a
>> ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it
>> constructed to do the two-way join, and builds a new one to join all
>> three tables.
>
> It should certainly not "throw away" the old SQL --- that path could
> still be chosen.

Right, that was loose phrasing on my part.

>> That seems tedious, when there are a lot of tables
>> involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't
>> need to consider pairs of joins. It could just as well build the SQL for
>> the three-way join directly. I think the API needs to reflect that.

Tom, what do you think of this part? I think the API would be much more
natural if the planner could directly ask the FDW to construct a plan
for a three-way (or wider) join, instead of asking it to join a join
relation to another relation.
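To illustrate the point for an FDW that generates SQL, here is a minimal standalone sketch of building the statement for an N-way join in one pass, rather than pairwise. It is not pgsql_fdw code and uses no planner structures: RemoteRel and build_join_sql are hypothetical names made up for this example, and the join conditions are assumed to be already deparsed to text.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one remote table taking part in the join. */
typedef struct RemoteRel
{
    const char *name;       /* remote table name */
    const char *join_cond;  /* condition tying it to earlier tables;
                             * NULL for the first table */
} RemoteRel;

/*
 * Build a single SELECT that joins all nrels tables at once.
 * Returns 0 on success, -1 on invalid input or if buf is too small.
 */
int
build_join_sql(const RemoteRel *rels, int nrels, char *buf, size_t buflen)
{
    size_t used;
    int    i;
    int    n;

    if (nrels < 1 || buflen == 0)
        return -1;

    n = snprintf(buf, buflen, "SELECT * FROM %s", rels[0].name);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Append one JOIN clause per remaining table. */
    for (i = 1; i < nrels; i++)
    {
        if (rels[i].join_cond == NULL)
            return -1;
        n = snprintf(buf + used, buflen - used, " JOIN %s ON (%s)",
                     rels[i].name, rels[i].join_cond);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }
    return 0;
}
```

Given the tables a, b and c with conditions "a.id = b.a_id" and "b.id = c.b_id", this produces one statement covering all three tables. No intermediate two-way query is built and then discarded.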

The proposed API is this:

+ FdwPlan *
+ PlanForeignJoin (Oid serverid,
+                  PlannerInfo *root,
+                  RelOptInfo *joinrel,
+                  JoinType jointype,
+                  SpecialJoinInfo *sjinfo,
+                  Path *outer_path,
+                  Path *inner_path,
+                  List *restrict_clauses,
+                  List *pathkeys);

The problem I have with this is that the FDW shouldn't need outer_path
and inner_path. All the information it needs is in 'joinrel'. Except for
outer joins, I guess; is there a convenient way to get the join types
involved in a join rel? They are in SpecialJoinInfo, but if the FDW is
only passed the RelOptInfo representing the three-way join, that
information isn't available.
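One possible way to recover the join types, sketched here with toy types rather than the real planner structs, is to scan a list of all special-join records and keep those whose two sides fall entirely inside the join relation. The field names min_lefthand, min_righthand and jointype are chosen to resemble SpecialJoinInfo, but everything below (RelSet, ToySpecialJoin, collect_special_joins) is a hypothetical stand-in, and whether the real planner data supports this cleanly is an assumption, not something established in this thread.

```c
#include <stddef.h>

/* Toy relation set: bit i is set when base relation i is a member. */
typedef unsigned int RelSet;

typedef enum
{
    TOY_JOIN_LEFT,
    TOY_JOIN_FULL,
    TOY_JOIN_SEMI,
    TOY_JOIN_ANTI
} ToyJoinType;

/* Toy stand-in for one special (non-inner) join record. */
typedef struct ToySpecialJoin
{
    RelSet      min_lefthand;   /* relations required on the left side */
    RelSet      min_righthand;  /* relations required on the right side */
    ToyJoinType jointype;
} ToySpecialJoin;

/* True when every member of a is also a member of b. */
static int
is_subset(RelSet a, RelSet b)
{
    return (a & ~b) == 0;
}

/*
 * Collect the special joins whose both sides are contained in joinrel.
 * Stores up to outlen pointers in out and returns the total number of
 * matches, which may exceed outlen.
 */
size_t
collect_special_joins(RelSet joinrel,
                      const ToySpecialJoin *all, size_t nall,
                      const ToySpecialJoin **out, size_t outlen)
{
    size_t found = 0;
    size_t i;

    for (i = 0; i < nall; i++)
    {
        if (is_subset(all[i].min_lefthand, joinrel) &&
            is_subset(all[i].min_righthand, joinrel))
        {
            if (found < outlen)
                out[found] = &all[i];
            found++;
        }
    }
    return found;
}
```

With relations A, B, C and D as bits 1, 2, 4 and 8, and records for "A LEFT JOIN B" and "(A, B) FULL JOIN D", a join relation of {A, B, C} yields only the left join, while {A, B, C, D} yields both.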

If the planner passes pathkeys to that function, does it expect the
result from the foreign server to be sorted accordingly?

>> I wonder if we should have a heuristic to not even consider doing a join
>> locally, if it can be done remotely.
>
> I think this is a bad idea.  It will require major restructuring of the
> planner, and sometimes it will fail to find the best plan, in return for
> not much.  The nature of join planning is that we investigate a lot of
> dead ends.

Ok.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com