Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Foreign join pushdown vs EvalPlanQual
Msg-id 9A28C8860F777E439AA12E8AEA7694F8011617C6@BPXM15GP.gisp.nec.co.jp
In response to Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> On Thu, Oct 29, 2015 at 6:05 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > In this case, the EPQ slot to store the joined tuple is still
> > a challenge to be solved.
> >
> > Is it possible to use one or any of the EPQ slots that are set up for
> > base relations but represented by ForeignScan/CustomScan?
> 
> Yes, I proposed that exact thing upthread.
> 
> > For example, when ForeignScan runs a remote join that involves three
> > base foreign tables (relid=2, 3, 5), no other code touches these
> > slots. So, it is safe even if we put a joined tuple on the EPQ
> > slots of the underlying base relations.
> >
> > In this case, EPQ slots are initialized as below:
> >
> >   es_epqTuple[0] ... EPQ tuple of base relation (relid=1)
> >   es_epqTuple[1] ... EPQ of the joined tuple (for relid=2, 3, 5)
> >   es_epqTuple[2] ... EPQ of the joined tuple (for relid=2, 3, 5), copy of above
> >   es_epqTuple[3] ... EPQ tuple of base relation (relid=4)
> >   es_epqTuple[4] ... EPQ of the joined tuple (for relid=2, 3, 5), copy of above
> >   es_epqTuple[5] ... EPQ tuple of base relation (relid=6)
> 
> You don't really need to initialize them all.  You can just initialize
> es_epqTuple[1] and leave 2 and 4 unused.
> 
> > Then, if FDW/CSP is designed to use the pre-joined tuples
> > rather than a local join, it can just return the tuple kept
> > in one of the EPQ slots for the underlying base relations.
> > If FDW/CSP prefers a local join, it can do what a local join
> > does: check the join condition and construct a joined tuple by itself
> > or by an alternative plan.
> 
> Right.
>
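
To make the slot layout above concrete, here is a minimal sketch in plain C.
It is a toy model, not PostgreSQL code: ToyTuple, ToyEPQState and the toy_*
functions are hypothetical names invented for illustration, and the rule
"use the slot of the lowest relid in the join" is only one possible
convention for picking the single slot that holds the joined tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELS 16

/* Toy stand-in for a tuple image; NOT PostgreSQL's HeapTuple. */
typedef struct ToyTuple
{
    int payload;
} ToyTuple;

/* Toy stand-in for the EPQ slots; index = relid - 1, like es_epqTuple[]. */
typedef struct ToyEPQState
{
    int       nrels;
    ToyTuple *epq_tuple[TOY_MAX_RELS];
    bool      epq_set[TOY_MAX_RELS];
} ToyEPQState;

/* Return the lowest (1-based) relid among the relations in the join. */
int
toy_lowest_relid(const int *relids, int nrelids)
{
    int lowest;
    int i;

    assert(nrelids > 0);
    lowest = relids[0];
    for (i = 1; i < nrelids; i++)
    {
        if (relids[i] < lowest)
            lowest = relids[i];
    }
    return lowest;
}

/*
 * Put the joined tuple on exactly one slot (the lowest relid's) and
 * leave the slots of the other joined relations unused.
 */
void
toy_store_joined_tuple(ToyEPQState *state, const int *relids, int nrelids,
                       ToyTuple *joined)
{
    int relid = toy_lowest_relid(relids, nrelids);

    assert(relid >= 1 && relid <= state->nrels);
    state->epq_tuple[relid - 1] = joined;
    state->epq_set[relid - 1] = true;
}

/* Fetch the joined tuple back; NULL if none was stored for this join. */
ToyTuple *
toy_fetch_joined_tuple(const ToyEPQState *state, const int *relids,
                       int nrelids)
{
    int relid = toy_lowest_relid(relids, nrelids);

    assert(relid >= 1 && relid <= state->nrels);
    return state->epq_set[relid - 1] ? state->epq_tuple[relid - 1] : NULL;
}
```

With relids {2, 3, 5}, only the slot at index 1 is filled; the slots at
index 2 and 4 stay untouched, which matches the suggestion to initialize
es_epqTuple[1] only.
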
A challenge is that the junk whole-row references added on behalf of
ROW_MARK_COPY are injected by preprocess_targetlist(). That happens before
query_planner() considers the main paths, so at that point we cannot predict
how the remote query will be executed.
With ROW_MARK_COPY, the base tuple image is fetched through this junk attribute.
So, there are two options if we allow a joined tuple to be put on any of
es_epqTuple[].

Option-1) We ignore the record type definition. The FDW returns a joined
tuple for the whole-row reference of one of the base relations in this
join. The junk attribute is eventually filtered out and only the FDW
driver sees it, so it is (probably) harmless.
This option needs no big changes; however, it takes a little courage to adopt.

Option-2) We allow FDW/CSP to adjust the target-list of the relevant nodes
after these paths are chosen by the planner. This lets it remove the
whole-row references of the base relations and add an alternative whole-row
reference instead, if the FDW/CSP can support it.
This feature is also relevant to target-list push-down to the remote side,
not only to EPQ rechecks, because adjusting the target-list means we allow
FDW/CSP to determine which expressions are executed locally and which are
not.
I think this option is more straightforward; however, it needs somewhat
deeper consideration, because we have to design the best hook point and
make sure how path-ification will work.

Therefore, I think we need two steps towards the entire solution.
Step-1) FDW/CSP rechecks the base EPQ tuples and supports local
reconstruction on the fly. It does not need any special enhancement
of the planner, so we can fix it by the v9.5 release.
Step-2) FDW/CSP supports adjustment of the target-list to add a whole-row
reference of the joined tuple instead of those of multiple base relations;
then FDW/CSP will be able to put a joined tuple on one of the EPQ slots if
it wants. This is a new feature enhancement, so v9.6 is a suitable timeline.
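
As a rough illustration of what Step-1 asks from the FDW/CSP, the sketch
below rechecks two base EPQ tuples against a join clause and rebuilds the
joined tuple locally. It is a toy model in plain C, not PostgreSQL code:
ToyBaseTuple, ToyJoinedTuple and toy_recheck_join() are hypothetical names,
and the join clause (an inner equi-join on "key") is an assumption made
only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the EPQ tuple of one base relation. */
typedef struct ToyBaseTuple
{
    int key;
    int value;
} ToyBaseTuple;

/* Toy stand-in for the joined tuple reconstructed on the fly. */
typedef struct ToyJoinedTuple
{
    int key;
    int left_value;
    int right_value;
} ToyJoinedTuple;

/*
 * Recheck the assumed join clause (left.key = right.key) against the
 * base EPQ tuples. On success, fill *out with the reconstructed joined
 * tuple and return true. Return false if either base tuple is missing
 * or the clause no longer holds (inner-join semantics assumed).
 */
bool
toy_recheck_join(const ToyBaseTuple *left, const ToyBaseTuple *right,
                 ToyJoinedTuple *out)
{
    if (left == NULL || right == NULL)
        return false;           /* no EPQ tuple for one side */

    if (left->key != right->key)
        return false;           /* join condition fails on recheck */

    out->key = left->key;
    out->left_value = left->value;
    out->right_value = right->value;
    return true;
}
```

An outer join would need NULL-extension of the missing side instead of
simply returning false; that case is left out of this sketch.
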

What is your opinion on this direction?
I don't want to drop the extra optimization opportunity; however, we are now
in November. I am not brave enough to add a non-obvious new feature here.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

