Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80110F7CF@BPXM15GP.gisp.nec.co.jp
In response to Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Foreign join pushdown vs EvalPlanQual
List pgsql-hackers
Hi Fujita-san,

Sorry for my late reply.

> On 2015/06/27 21:09, Kouhei Kaigai wrote:
> >>> BTW, if you try a newer version of the postgres_fdw foreign join patch,
> >>> please provide it to me so that I can reproduce the problem.
> 
> >> OK
> 
> > Did you forget to attach the patch, or v17 is in use?
> 
> Sorry, I made a mistake.  The problem was produced using v16 [1].
> 
> >>> I'd like to suggest a solution that re-construct remote tuple according
> >>> to the fdw_scan_tlist on ExecScanFetch, if given scanrelid == 0.
> >>> It enables to run local qualifier associated with the ForeignScan node,
> >>> and it will also work for the case when tuple in es_epqTupleSet[] was
> >>> local heap.
> 
> >> Maybe I'm missing something, but I don't think your proposal works
> >> properly because we don't have any component ForeignScan state node or
> >> subsidiary join state node once we've replaced the entire join with the
> >> ForeignScan performing the join remotely, IIUC.  So, my image was to
> >> have another subplan for EvalPlanQual as well as the ForeignScan, to do
> >> the entire join locally for the component test tuples if we are inside
> >> an EvalPlanQual recheck.
> 
> > Hmm... Probably, we have two standpoints to tackle the problem.
> >
> > The first standpoint tries to handle the base foreign table as
> > a prime relation for locking. Thus, we have to provide a way to
> > fetch a remote tuple identified with the supplied ctid.
> > The advantage of this approach is the way to fetch tuples from
> > base relation is quite similar to the existing form, however,
> > its disadvantage is another side of the same coin, because the
> > ForeignScan node with scanrelid==0 (that represents remote join
> > query) may have local qualifiers which shall run on the tuple
> > according to fdw_scan_tlist.
> 
> IIUC, I think this approach would also need to evaluate join conditions
> and remote qualifiers in addition to local qualifiers in the local, for
> component tuples that were re-fetched from the remote (and remaining
> component tuples that were copied from whole-row vars, if any), in cases
> where the re-fetched tuples were updated versions of those tuples rather
> than the same versions previously obtained.
> 
> > One other standpoint tries to handle a bunch of base foreign
> > tables as a unit. That means, if any of base foreign table is
> > the target of locking, it prompts FDW driver to fetch the latest
> > "joined" tuple identified by "ctid", even if this join contains
> > multiple base relations to be locked.
> > The advantage of this approach is that we can use qualifiers of
> > the ForeignScan node with scanrelid==0 and no need to pay attention
> > of remote qualifier and/or join condition individually.
> > Its disadvantage is, we may extend EState structure to keep the
> > "joined" tuples, in addition to es_epqTupleSet[].
> 
> That is an idea.  However, ISTM there is another disadvantage; that is
> not efficient because that would need to perform another remote join
> query having such additional conditions during an EvalPlanQual check, as
> you proposed.
> 
> > I'm inclined to think the latter standpoint works well, because
> > it does not need to reproduce an alternative execution path in
> > local side again, even if a ForeignScan node represents much
> > complicated remote query.
> > If we fetched tuples of individual base relations, we would need
> > to reconstruct a join path identical to the one executed on the
> > remote side, wouldn't we?
> 
> Yeah, that was my image for fixing this issue.
> 
> > IIUC, the purpose of EvalPlanQual() is to ensure the tuples to
> > be locked is still visible, so it is not an essential condition
> > to fetch base tuples individually.
> 
> I think so too, but taking the similarity and/or efficiency of
> processing into consideration, I would vote for the idea of having an
> alternative execution path in the local.  That would also allow FDW
> authors to write the foreign join pushdown functionality in their FDWs
> by smaller efforts.
>
Although I'd like to hear a committer's opinion, I could not come up
with a better idea than what you proposed: a foreign-/custom-scan
has an alternative plan if scanrelid==0.

Let me introduce a few cases we should pay attention to.

Foreign/CustomScan nodes may stack; that is, a Foreign/CustomScan node
may have a child node that includes another Foreign/CustomScan node with
scanrelid==0.
(At this moment, ForeignScan cannot have a child node; however, more
aggressive push-down [1] will need the same feature to fetch tuples from
a local relation and construct a VALUES() clause.)
In this case, the highest Foreign/CustomScan node (that is, the one nearest
to LockRows or ModifyTable) runs the alternative sub-plan that includes
the scan/join plans covered by fdw_relids or custom_relids.

For example:

  LockRows
   -> HashJoin
     -> CustomScan (AliceJoin)
       -> SeqScan on t1
       -> CustomScan (CarolJoin)
         -> SeqScan on t2
         -> SeqScan on t3
     -> Hash
       -> CustomScan (BobJoin)
         -> SeqScan on t4
         -> ForeignScan (remote join involves ft5, ft6)
 

In this case, AliceJoin will have an alternative sub-plan to join t1, t2
and t3, which shall be used on EvalPlanQual(). Likewise, BobJoin will
have an alternative sub-plan to join t4, ft5 and ft6. CarolJoin and the
ForeignScan will also have alternative sub-plans; however, these are
not used in this case, because the sub-plans of their parents already
cover them.
Probably, it works fine.
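
To illustrate the rule "only the highest node runs its alternative
sub-plan", here is a minimal sketch in C. It is a toy model under my own
assumptions, not executor code: ToyNode, mark_alt_plan_users() and
build_example_tree() are hypothetical names, and the tree is the example
from this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are NOT PostgreSQL's Plan or PlanState structs. */
typedef struct ToyNode
{
    const char     *name;
    bool            is_pushed_join; /* Foreign/CustomScan with scanrelid == 0 */
    bool            runs_alt_plan;  /* result: runs its alternative sub-plan */
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

enum { LOCKROWS, HASHJOIN, ALICE, T1, CAROL, T2, T3, HASH, BOB, T4, FSCAN, NNODES };

/*
 * Walk the tree top-down.  A pushed-down join runs its alternative
 * sub-plan only if no ancestor is itself a pushed-down join ("covered").
 */
static void
mark_alt_plan_users(ToyNode *node, bool covered)
{
    if (node == NULL)
        return;
    node->runs_alt_plan = node->is_pushed_join && !covered;
    covered = covered || node->is_pushed_join;
    mark_alt_plan_users(node->left, covered);
    mark_alt_plan_users(node->right, covered);
}

/* Build the example tree from the mail into caller-provided storage. */
static ToyNode *
build_example_tree(ToyNode n[NNODES])
{
    static const char *const names[NNODES] = {
        "LockRows", "HashJoin", "CustomScan (AliceJoin)", "SeqScan on t1",
        "CustomScan (CarolJoin)", "SeqScan on t2", "SeqScan on t3", "Hash",
        "CustomScan (BobJoin)", "SeqScan on t4", "ForeignScan (ft5, ft6)"
    };

    assert(n != NULL);
    for (int i = 0; i < NNODES; i++)
        n[i] = (ToyNode) { .name = names[i] };

    n[ALICE].is_pushed_join = true;
    n[CAROL].is_pushed_join = true;
    n[BOB].is_pushed_join = true;
    n[FSCAN].is_pushed_join = true;

    n[LOCKROWS].left = &n[HASHJOIN];
    n[HASHJOIN].left = &n[ALICE];
    n[HASHJOIN].right = &n[HASH];
    n[ALICE].left = &n[T1];
    n[ALICE].right = &n[CAROL];
    n[CAROL].left = &n[T2];
    n[CAROL].right = &n[T3];
    n[HASH].left = &n[BOB];
    n[BOB].left = &n[T4];
    n[BOB].right = &n[FSCAN];
    return &n[LOCKROWS];
}
```

Calling mark_alt_plan_users(build_example_tree(n), false) on an array
ToyNode n[NNODES] should set runs_alt_plan only for AliceJoin and BobJoin,
which matches the description above.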


Is there a potential scenario in which a foreign-/custom-join is located
above the LockRows node? (Subquery expansion may produce such a case.)
Anyway, that would not cause a problem, would it?


As the next step, how do we implement this design?
I guess the planner needs to keep a path that contains neither
a foreign-join nor a custom-join with scanrelid==0.
Probably, RelOptInfo needs a "cheapest_builtin_path" that
never contains such remote/custom join logic, as a seed of
the alternative sub-plan.
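
A rough sketch of that bookkeeping, in C, might look like the following.
It is only a toy under my assumptions: ToyPath, ToyRel and
toy_consider_path() are hypothetical, "cheapest_builtin_path" does not
exist in RelOptInfo today, and the real add_path() compares much more
than total cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Path; NOT the planner's real structure. */
typedef struct ToyPath
{
    const char *label;
    double      total_cost;
    bool        uses_pushed_join;   /* contains a foreign-/custom-join
                                     * with scanrelid == 0 somewhere */
} ToyPath;

/* Toy stand-in for RelOptInfo with the proposed extra field. */
typedef struct ToyRel
{
    const ToyPath *cheapest_total_path;
    const ToyPath *cheapest_builtin_path;   /* hypothetical field */
} ToyRel;

/*
 * Remember the overall cheapest path as usual, and separately the
 * cheapest path that is free of remote/custom join logic, so that it
 * can later seed the alternative sub-plan.
 */
static void
toy_consider_path(ToyRel *rel, const ToyPath *path)
{
    assert(rel != NULL && path != NULL);

    if (rel->cheapest_total_path == NULL ||
        path->total_cost < rel->cheapest_total_path->total_cost)
        rel->cheapest_total_path = path;

    if (!path->uses_pushed_join &&
        (rel->cheapest_builtin_path == NULL ||
         path->total_cost < rel->cheapest_builtin_path->total_cost))
        rel->cheapest_builtin_path = path;
}
```

With this, a cheap remote join can still win as cheapest_total_path, while
the best built-in join stays available for the EvalPlanQual case.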

create_foreignscan_plan() or create_customscan_plan() will be
able to construct these alternative plans, regardless of the
extension. So, individual FDWs/CSPs don't need to care about
this alternative sub-plan, do they?

After that, once ExecScanFetch() is called under EvalPlanQual(),
a Foreign/CustomScan with scanrelid==0 runs its alternative
sub-plan to validate the latest tuple.
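
The dispatch I have in mind can be sketched as below. This is a toy in C,
not the real ExecScanFetch(): ToyScanState, its in_epq_recheck flag and
the toy_* callbacks are hypothetical stand-ins, and each fetch only reports
where the tuple would come from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a fetched tuple would come from in this toy model. */
typedef enum ToySource
{
    FROM_ACCESS_METHOD,     /* normal scan, e.g. the remote join query */
    FROM_EPQ_TEST_TUPLE,    /* test tuple saved for a base relation */
    FROM_ALT_SUBPLAN        /* local alternative sub-plan */
} ToySource;

typedef struct ToyScanState ToyScanState;
typedef ToySource (*ToyFetchFn) (ToyScanState *node);

/* Toy stand-in for a scan state node; NOT PostgreSQL's ScanState. */
struct ToyScanState
{
    unsigned    scanrelid;      /* 0 means a pushed-down join */
    bool        in_epq_recheck; /* assumed flag: inside EvalPlanQual */
    ToyFetchFn  access_method;  /* normal fetch routine */
    ToyFetchFn  alt_subplan;    /* alternative sub-plan, if any */
};

/* Example callbacks used only for illustration. */
static ToySource
toy_remote_join(ToyScanState *node)
{
    (void) node;
    return FROM_ACCESS_METHOD;
}

static ToySource
toy_local_join(ToyScanState *node)
{
    (void) node;
    return FROM_ALT_SUBPLAN;
}

static ToySource
toy_scan_fetch(ToyScanState *node)
{
    if (node->in_epq_recheck)
    {
        /* A base relation keeps using its saved test tuple. */
        if (node->scanrelid > 0)
            return FROM_EPQ_TEST_TUPLE;

        /* A pushed-down join re-evaluates the join locally. */
        assert(node->alt_subplan != NULL);
        return node->alt_subplan(node);
    }
    return node->access_method(node);
}
```

The point of the sketch is that only the scanrelid==0 branch is new;
base-relation scans keep the behavior they have now.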

Hmm... It looks like a workable approach to me.

Fujita-san, are you available to make a patch with this approach?
If so, I'd like to volunteer to review it.

[1] http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F8010F20AD@BPXM15GP.gisp.nec.co.jp

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
