Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Join push-down support for foreign tables
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F801091BEE@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: Join push-down support for foreign tables  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
Hanada-san,

One other question from my side:
How postgres_fdw tries to solve the varno/varattno mapping when it
replaces relations join by foreign-scan?

Let me explain the issue using an example. If SELECT has a target-
list that references 2nd-column of the inner relation and 2nd-column
of the outer relation, how varno/varattno of ForeignScan shall be
assigned on?
Unless FDW driver does not set fdw_ps_tlist, setrefs.c deals with
this ForeignScan as usual relation scan, then varno of Var will
have non-special varno (even though it may be shifted by rtoffset
in setrefs.c).
Then, ExecEvalScalarVar() is invoked on executor to evaluate the
value of fetched tuple. At that time, which slot and attribute will
be referenced? The varattno of Var-node is neutral on setrefs.c, so
both of the var-nodes that references 2nd-column of the inner relation
and 2nd-column of the outer relation will reference the 2nd-column
of the econtext->ecxt_scantuple, however, it is uncertain which
column of foreign-table is mapped to 2nd-column of the ecxt_scantuple.
So, it needs to inform the planner which underlying column is
mapped to the pseudo scan tlist.

Another expression of what I'm saying is:
 SELECT   ft_1.second_col X,   --> varno=1 / varattno=2   ft_2.second_col Y    --> varno=2 / varattno=2 FROM   ft_1
NATURALJOIN ft_2;
 

When FROM-clause is replaced by ForeignScan, the relevant varattno
also needs to be updated, according to the underlying remote query.
If postgres_fdw runs the following remote query, X should have varattno=1
and Y should have varattno=2 on the pseudo scan tlist. remote: SELECT t_1.second_col, t_2.second_col           FROM t_1
NATURALJOIN t_2;
 

You can inform the planner this mapping using fdw_ps_tlist field of
ForeignScan, if FDW driver put a list of TargetEntry.
In above example, fdw_ps_tlist will have two elements and both of then
has Var-node of the underlying foreign tables.

The patch to replace join by foreign-/custom-scan adds a functionality
to fix-up varno/varattno in these cases.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Tuesday, December 16, 2014 9:01 AM
> To: Robert Haas; Shigeru Hanada
> Cc: PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> Hanada-san,
> 
> Thanks for proposing this great functionality people waited for.
> 
> > On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada
> > <shigeru.hanada@gmail.com>
> > wrote:
> > > I'm working on $SUBJECT and would like to get comments about the
> > > design.  Attached patch is for the design below.
> >
> > I'm glad you are working on this.
> >
> > > 1. Join source relations
> > > As described above, postgres_fdw (and most of SQL-based FDWs) needs
> > > to check that 1) all foreign tables in the join belong to a server,
> > > and
> > > 2) all foreign tables have same checkAsUser.
> > > In addition to that, I add extra limitation that both inner/outer
> > > should be plain foreign tables, not a result of foreign join.  This
> > > limiation makes SQL generator simple.  Fundamentally it's possible
> > > to join even join relations, so N-way join is listed as enhancement
> > > item below.
> >
> > It seems pretty important to me that we have a way to push the entire
> > join nest down.  Being able to push down a 2-way join but not more
> > seems like quite a severe limitation.
> >
> As long as we don't care about simpleness/gracefulness of the remote query,
> what we need to do is not complicated. All the optimization jobs are
> responsibility of remote system.
> 
> Let me explain my thought:
> We have three cases to be considered; (1) a join between foreign tables
> that is the supported case, (2) a join either of relations is foreign join,
> and (3) a join both of relations are foreign joins.
> 
> In case of (1), remote query shall have the following form:
>   SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>
> 
> In case of (2) or (3), because we already construct inner/outer join, it
> is not difficult to replace the FT1 or FT2 above by sub-query, like:
>   SELECT <tlist> FROM FT3 JOIN
>     (SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>) as FJ1
>     ON <joincond> WHERE <qual>
> 
> How about your thought?
> 
> 
> Let me comment on some other points at this moment:
> 
> * Enhancement in src/include/foreign/fdwapi.h
> 
> It seems to me GetForeignJoinPath_function and
> GetForeignJoinPlan_function are not used anywhere. Is it an oversight to
> remove definitions from your previous work, isn't it?
> Now ForeignJoinPath is added on set_join_pathlist_hook, but not callback
> of FdwRoutine.
> 
> 
> * Is ForeignJoinPath really needed?
> 
> I guess the reason why you added ForeignJoinPath is, to have the fields
> of inner_path/outer_path. If we want to have paths of underlying relations,
> a straightforward way for the concept (join relations is replaced by
> foreign-/custom-scan on a result set of remote join) is enhancement of the
> fields in ForeignPath.
> How about an idea that adds "List *fdw_subpaths" to save the list of
> underlying Path nodes. It also allows to have more than two relations to
> be joined.
> (Probably, it should be a feature of interface portion. I may have to enhance
> my portion.)
> 
> * Why NestPath is re-defined?
> 
> -typedef JoinPath NestPath;
> +typedef struct NestPath
> +{
> +    JoinPath    jpath;
> +} NestPath;
> 
> It looks to me this change makes patch scale larger...
> 
> Best regards,
> --
> NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
> <kaigai@ak.jp.nec.com>
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: [Bug] Inconsistent result for inheritance and FOR UPDATE.
Next
From: Andres Freund
Date:
Subject: Re: Commitfest problems