Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8011599C0@BPXM15GP.gisp.nec.co.jp
In response to Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> On Fri, Oct 16, 2015 at 5:00 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
> > As for #2, I updated the patch, which uses a local join execution plan for
> > an EvalPlanQual recheck, according to the comment from Robert [1]. Attached
> > is an updated version of the patch.  This is a WIP patch, but it would be
> > appreciated if I could get feedback earlier.
> 
> I don't see how this can be right.  You're basically just pretending
> EPQ doesn't exist in the remote join case, which isn't going to work
> at all.  Those bits of code that look at es_epqTuple, es_epqTupleSet,
> and es_epqScanDone are not optional.  You can't just skip over those
> as if they don't matter.
>
I think it is the right approach to pretend EPQ doesn't exist if scanrelid==0,
because what these ForeignScan/CustomScan nodes replace is a local join node
such as NestLoop. A join node doesn't have its own EPQ slot; it constructs the
joined tuple from the underlying scan tuples originally stored in the EPQ slots.
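
To illustrate, here is a toy sketch in plain C. It is not executor code: every
Toy* type and toy_* function is a made-up stand-in, and the join qual (equal
keys) is an assumption for the example. It only shows that the join recheck
needs no slot of its own, because it rebuilds the joined tuple from the
per-RTI EPQ tuples of its two inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_RTI 8

/* Hypothetical stand-in for a base-relation tuple (not HeapTuple). */
typedef struct ToyTuple
{
    int         key;
    int         payload;
} ToyTuple;

/* Hypothetical stand-in for the EPQ state: one test tuple per RTI. */
typedef struct ToyEPQState
{
    const ToyTuple *tuple[TOY_MAX_RTI + 1];     /* indexed by RTI, 1-based */
    bool        tuple_set[TOY_MAX_RTI + 1];
} ToyEPQState;

/* Output of the join; note there is no EPQ entry for it. */
typedef struct ToyJoinedTuple
{
    int         key;
    int         outer_payload;
    int         inner_payload;
} ToyJoinedTuple;

void
toy_epq_init(ToyEPQState *epq)
{
    memset(epq, 0, sizeof(*epq));
}

void
toy_epq_set_tuple(ToyEPQState *epq, int rti, const ToyTuple *tup)
{
    assert(rti >= 1 && rti <= TOY_MAX_RTI);
    epq->tuple[rti] = tup;
    epq->tuple_set[rti] = true;
}

/*
 * Rebuild the joined tuple from the EPQ tuples of the two base relations.
 * Returns false if either test tuple is missing or the join qual fails.
 */
bool
toy_recheck_join(const ToyEPQState *epq, int outer_rti, int inner_rti,
                 ToyJoinedTuple *result)
{
    const ToyTuple *outer;
    const ToyTuple *inner;

    assert(outer_rti >= 1 && outer_rti <= TOY_MAX_RTI);
    assert(inner_rti >= 1 && inner_rti <= TOY_MAX_RTI);

    if (!epq->tuple_set[outer_rti] || !epq->tuple_set[inner_rti])
        return false;

    outer = epq->tuple[outer_rti];
    inner = epq->tuple[inner_rti];
    if (outer == NULL || inner == NULL)
        return false;

    /* Assumed join qual for this sketch: outer.key = inner.key */
    if (outer->key != inner->key)
        return false;

    result->key = outer->key;
    result->outer_payload = outer->payload;
    result->inner_payload = inner->payload;
    return true;
}
```

A caller would fill the state with toy_epq_set_tuple() for RTIs 1 and 2 and
then call toy_recheck_join(&epq, 1, 2, &joined); nothing is ever looked up
under RTI 0.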

> Again, the root of the problem here is that the EPQ machinery provides
> 1 slot per RTI, and it uses scanrelid to figure out which EPQ slot is
> applicable for a given scan node.  Here, we have scanrelid == 0, so it
> gets confused.  But it's not the case that a pushed-down join has NO
> scanrelid.  It actually has, in effect, *multiple* scanrelids.  So we
> should pick any one of those, say the lowest-numbered one, and use
> that to decide which EPQ slot to use.  The attached patch shows
> roughly what I have in mind, although this is just crap code to
> demonstrate the basic idea and doesn't pretend to adjust everything
> that needs fixing.
>
One tricky point of this idea is the ExecStoreTuple() call in ExecScanFetch().
The EPQ slot picked up by get_proxy_scanrelid() contains a tuple of a base
relation, and the call then tries to put this tuple into the TupleTableSlot
that was initialized to hold the joined tuple.
Of course, recheckMtd is called right afterwards, so the callback will be able
to handle the request correctly. However, it is a bit unnatural to store a
tuple in an incompatible TupleTableSlot.
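
A toy sketch of that mismatch, again in plain C and not taken from the patch:
toy_proxy_scanrelid() is only my guess at the "pick the lowest-numbered
relid" rule, relids are modeled as a simple bitmask, and ToyRowShape is a
made-up stand-in for a tuple descriptor that records nothing but the column
count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit for a given RTI in the toy relids bitmask (bit N set = RTI N). */
uint32_t
toy_relid_bit(int rti)
{
    assert(rti >= 1 && rti < 32);
    return UINT32_C(1) << rti;
}

/* Lowest-numbered RTI in the set, or 0 if the set is empty. */
int
toy_proxy_scanrelid(uint32_t relids)
{
    for (int rti = 1; rti < 32; rti++)
    {
        if (relids & (UINT32_C(1) << rti))
            return rti;
    }
    return 0;
}

/* Hypothetical row shape: only the number of attributes. */
typedef struct ToyRowShape
{
    int         natts;
} ToyRowShape;

/* A tuple "fits" a slot in this sketch only if the column counts agree. */
bool
toy_tuple_fits_slot(ToyRowShape tuple_shape, ToyRowShape slot_shape)
{
    return tuple_shape.natts == slot_shape.natts;
}
```

With a join over RTIs 3 and 5, toy_proxy_scanrelid() returns 3, so the EPQ
tuple found there has the shape of base relation 3, while the scan slot has
the shape of the joined row. toy_tuple_fits_slot() reports false for that
pair, which is the incompatibility I mean above.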

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

