Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Join push-down support for foreign tables
Msg-id 9A28C8860F777E439AA12E8AEA7694F80109199D@BPXM15GP.gisp.nec.co.jp
In response to Re: Join push-down support for foreign tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Join push-down support for foreign tables  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
Hanada-san,

Thanks for proposing this great functionality that people have been waiting for.

> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada <shigeru.hanada@gmail.com>
> wrote:
> > I'm working on $SUBJECT and would like to get comments about the
> > design.  Attached patch is for the design below.
> 
> I'm glad you are working on this.
> 
> > 1. Join source relations
> > As described above, postgres_fdw (and most of SQL-based FDWs) needs to
> > check that 1) all foreign tables in the join belong to a server, and
> > 2) all foreign tables have same checkAsUser.
> > In addition to that, I add extra limitation that both inner/outer
> > should be plain foreign tables, not a result of foreign join.  This
> > limitation makes SQL generator simple.  Fundamentally it's possible to
> > join even join relations, so N-way join is listed as enhancement item
> > below.
> 
> It seems pretty important to me that we have a way to push the entire join
> nest down.  Being able to push down a 2-way join but not more seems like
> quite a severe limitation.
> 
As long as we don't care about the simplicity or elegance of the remote
query, what we need to do is not complicated. All of the optimization work
is the responsibility of the remote system.

Let me explain my thought:
We have three cases to consider: (1) a join between two foreign tables,
which is the case already supported, (2) a join in which one of the
relations is a foreign join, and (3) a join in which both relations are
foreign joins.

In case of (1), the remote query has the following form:

    SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>

In case of (2) or (3), because the inner/outer join has already been
constructed, it is not difficult to replace FT1 or FT2 above with a
sub-query, like:

    SELECT <tlist> FROM FT3 JOIN
      (SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>) as FJ1
      ON <joincond> WHERE <qual>
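
The sub-query replacement described above can be sketched as a small recursive deparser. This is only an illustration, not code from the patch: RemoteRel, append_str, deparse_from_item and deparse_join are hypothetical names, the target list and join condition are passed as ready-made strings, and the WHERE clause is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a join input; not a PostgreSQL type. */
typedef struct RemoteRel
{
    const char *name;               /* foreign table name, or alias of a join */
    const struct RemoteRel *outer;  /* NULL for a plain foreign table */
    const struct RemoteRel *inner;  /* NULL for a plain foreign table */
    const char *tlist;              /* target list text when this is a join */
    const char *joincond;           /* ON clause text when this is a join */
} RemoteRel;

/* Append s at offset pos; returns the new offset, or len on overflow. */
static size_t
append_str(char *buf, size_t len, size_t pos, const char *s)
{
    size_t n = strlen(s);

    if (pos >= len || n >= len - pos)
        return len;             /* overflow: caller sees result == len */
    memcpy(buf + pos, s, n + 1);
    return pos + n;
}

static size_t deparse_join(const RemoteRel *rel, char *buf, size_t len,
                           size_t pos);

/* A plain table is emitted by name; a join becomes an aliased sub-query. */
static size_t
deparse_from_item(const RemoteRel *rel, char *buf, size_t len, size_t pos)
{
    assert(rel != NULL);
    if (rel->outer == NULL)
        return append_str(buf, len, pos, rel->name);
    pos = append_str(buf, len, pos, "(");
    pos = deparse_join(rel, buf, len, pos);
    pos = append_str(buf, len, pos, ") AS ");
    return append_str(buf, len, pos, rel->name);
}

/* Emit SELECT <tlist> FROM <outer> JOIN <inner> ON <joincond>. */
static size_t
deparse_join(const RemoteRel *rel, char *buf, size_t len, size_t pos)
{
    assert(rel != NULL && rel->outer != NULL && rel->inner != NULL);
    pos = append_str(buf, len, pos, "SELECT ");
    pos = append_str(buf, len, pos, rel->tlist);
    pos = append_str(buf, len, pos, " FROM ");
    pos = deparse_from_item(rel->outer, buf, len, pos);
    pos = append_str(buf, len, pos, " JOIN ");
    pos = deparse_from_item(rel->inner, buf, len, pos);
    pos = append_str(buf, len, pos, " ON ");
    return append_str(buf, len, pos, rel->joincond);
}
```

Because a join input is handled by the same routine whether it is a table or another join, cases (2) and (3) need no special code; the price is a nested remote query that the remote planner has to flatten.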
 

What do you think?


Let me comment on some other points at this moment:

* Enhancement in src/include/foreign/fdwapi.h

It seems to me that GetForeignJoinPath_function and
GetForeignJoinPlan_function are not used anywhere. Are these definitions
left over from your previous work by oversight?
ForeignJoinPath is now added via set_join_pathlist_hook, not through a
callback in FdwRoutine.


* Is ForeignJoinPath really needed?

I guess the reason you added ForeignJoinPath is to have the
inner_path/outer_path fields. If we want to keep the paths of the
underlying relations, the straightforward way to match the concept (a join
relation is replaced by a foreign-/custom-scan on the result set of a
remote join) is to extend the fields of ForeignPath.
How about adding "List *fdw_subpaths" to hold the list of underlying Path
nodes? It would also allow more than two relations to be joined.
(Probably, it should be a feature of the interface portion. I may have to
enhance my portion.)
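
To make the fdw_subpaths idea a bit more concrete, here is a rough standalone sketch. It does not use the real PostgreSQL structures: PathSketch and ForeignPathSketch are hypothetical stand-ins, a plain array replaces the proposed "List *fdw_subpaths", and foreign_path_num_rels is a helper invented only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for Path. */
typedef struct PathSketch
{
    const char *relname;        /* relation this path scans */
} PathSketch;

/* Hypothetical stand-in for ForeignPath extended with sub-paths. */
typedef struct ForeignPathSketch
{
    PathSketch  path;
    const PathSketch **fdw_subpaths;    /* array stand-in for List * */
    size_t      nsubpaths;              /* 0 for a plain foreign scan */
} ForeignPathSketch;

/* Number of relations the remote query covers: 1 for a plain scan. */
size_t
foreign_path_num_rels(const ForeignPathSketch *fp)
{
    assert(fp != NULL);
    return fp->nsubpaths == 0 ? 1 : fp->nsubpaths;
}
```

With a list rather than a fixed inner/outer pair, a three-way remote join is simply one node with three entries, so no separate join path type is required to represent it.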

* Why is NestPath redefined?

-typedef JoinPath NestPath;
+typedef struct NestPath
+{
+    JoinPath    jpath;
+} NestPath;

It looks to me like this change only makes the patch larger...

Best regards,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

