Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Foreign join pushdown vs EvalPlanQual |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F80110F7CF@BPXM15GP.gisp.nec.co.jp |
In response to | Re: Foreign join pushdown vs EvalPlanQual (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>) |
Responses | Re: Foreign join pushdown vs EvalPlanQual |
List | pgsql-hackers |
Hi Fujita-san,

Sorry for my late reply.

> On 2015/06/27 21:09, Kouhei Kaigai wrote:
> >>> BTW, if you try newer version of postgres_fdw foreign join patch,
> >>> please provide me to reproduce the problem.
>
> >> OK
>
> > Did you forget to attach the patch, or v17 is in use?
>
> Sorry, I made a mistake.  The problem was produced using v16 [1].
>
> >>> I'd like to suggest a solution that re-construct remote tuple according
> >>> to the fdw_scan_tlist on ExecScanFetch, if given scanrelid == 0.
> >>> It enables to run local qualifier associated with the ForeignScan node,
> >>> and it will also work for the case when tuple in es_epqTupleSet[] was
> >>> local heap.
>
> >> Maybe I'm missing something, but I don't think your proposal works
> >> properly because we don't have any component ForeignScan state node or
> >> subsidiary join state node once we've replaced the entire join with the
> >> ForeignScan performing the join remotely, IIUC.  So, my image was to
> >> have another subplan for EvalPlanQual as well as the ForeignScan, to do
> >> the entire join locally for the component test tuples if we are inside
> >> an EvalPlanQual recheck.
>
> > Hmm... Probably, we have two standpoints to tackle the problem.
> >
> > The first standpoint tries to handle the base foreign table as
> > a prime relation for locking. Thus, we have to provide a way to
> > fetch a remote tuple identified with the supplied ctid.
> > The advantage of this approach is the way to fetch tuples from
> > base relation is quite similar to the existing form, however,
> > its disadvantage is another side of the same coin, because the
> > ForeignScan node with scanrelid==0 (that represents remote join
> > query) may have local qualifiers which shall run on the tuple
> > according to fdw_scan_tlist.
>
> IIUC, I think this approach would also need to evaluate join conditions
> and remote qualifiers in addition to local qualifiers in the local, for
> component tuples that were re-fetched from the remote (and remaining
> component tuples that were copied from whole-row vars, if any), in cases
> where the re-fetched tuples were updated versions of those tuples rather
> than the same versions previously obtained.
>
> > One other standpoint tries to handle a bunch of base foreign
> > tables as a unit. That means, if any of base foreign table is
> > the target of locking, it prompts FDW driver to fetch the latest
> > "joined" tuple identified by "ctid", even if this join contains
> > multiple base relations to be locked.
> > The advantage of this approach is that we can use qualifiers of
> > the ForeignScan node with scanrelid==0 and no need to pay attention
> > of remote qualifier and/or join condition individually.
> > Its disadvantage is, we may extend EState structure to keep the
> > "joined" tuples, in addition to es_epqTupleSet[].
>
> That is an idea.  However, ISTM there is another disadvantage; that is
> not efficient because that would need to perform another remote join
> query having such additional conditions during an EvalPlanQual check, as
> you proposed.
>
> > I'm inclined to think the latter standpoint works well, because
> > it does not need to reproduce an alternative execution path in
> > local side again, even if a ForeignScan node represents much
> > complicated remote query.
> > If we would fetch tuples of individual base relations, we need
> > to reconstruct identical join path to be executed on
> > remote-side, don't it?
>
> Yeah, that was my image for fixing this issue.
>
> > IIUC, the purpose of EvalPlanQual() is to ensure the tuples to
> > be locked is still visible, so it is not an essential condition
> > to fetch base tuples individually.
>
> I think so too, but taking the similarity and/or efficiency of
> processing into consideration, I would vote for the idea of having an
> alternative execution path in the local.  That would also allow FDW
> authors to write the foreign join pushdown functionality in their FDWs
> by smaller efforts.
>
Although I'd like to hear a committer's opinion, I could not come up with
a better idea than the one you proposed: a Foreign/CustomScan node has an
alternative plan if scanrelid==0.

Let me introduce a few cases we should pay attention to.
Foreign/CustomScan nodes may stack; that is, a Foreign/CustomScan node
may have a child node that includes another Foreign/CustomScan node with
scanrelid==0.
(At this moment, ForeignScan cannot have a child node; however, more
aggressive push-down [1] will need the same feature to fetch tuples from
a local relation and construct a VALUES() clause.)
In this case, the highest Foreign/CustomScan node (that is, the one
nearest to LockRows or ModifyTable) runs the alternative sub-plan that
includes the scan/join plans dominated by fdw_relids or custom_relids.

For example:
  LockRows
   -> HashJoin
     -> CustomScan (AliceJoin)
       -> SeqScan on t1
       -> CustomScan (CarolJoin)
         -> SeqScan on t2
         -> SeqScan on t3
     -> Hash
       -> CustomScan (BobJoin)
         -> SeqScan on t4
         -> ForeignScan (remote join involves ft5, ft6)

In this case, AliceJoin will have an alternative sub-plan to join t1, t2
and t3, which shall be used on EvalPlanQual().
Also, BobJoin will have an alternative sub-plan to join t4, ft5 and ft6.
CarolJoin and the ForeignScan will also have alternative sub-plans;
however, these are not used in this case.
Probably, it works fine.

Is there a potential scenario in which a foreign-/custom-join is located
above the LockRows node? (Subquery expansion may give such a case?)
Anyway, that would not cause a problem, would it?

As the next step, how do we implement this design?
I guess that the planner needs to keep a path that contains neither
foreign-join nor custom-join with scanrelid==0.
Probably, RelOptInfo needs a "cheapest_builtin_path" that never contains
such remote/custom join logic, as the seed of the alternative sub-plan.
create_foreignscan_plan() or create_customscan_plan() will be able to
construct these alternative plans regardless of the extension.
So, individual FDWs/CSPs don't need to care about this alternative
sub-plan, do they?

After that, once ExecScanFetch() is called under EvalPlanQual(), a
Foreign/CustomScan node with scanrelid==0 runs the alternative sub-plan
to validate the latest tuple.

Hmm... It looks like a workable approach to me.

Fujita-san, are you available to make a patch with this approach?
If so, I'd like to volunteer to review it.

[1] http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F8010F20AD@BPXM15GP.gisp.nec.co.jp

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
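
For reference, the node-selection rule in the stacked-scan example of
this message can be modelled with a small sketch. It is only a toy model
in Python, not PostgreSQL code: PlanNode, pushed_down_join (standing in
for "Foreign/CustomScan with scanrelid==0"), epq_alternative_runners()
and build_example_plan() are all hypothetical names introduced here for
illustration. The sketch assumes that the highest such node on each
branch runs its alternative sub-plan, so the search does not descend
below it.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-in for a plan node; not PostgreSQL's real structs.
@dataclass
class PlanNode:
    label: str
    # Models a Foreign/CustomScan with scanrelid==0 (an assumption of this sketch).
    pushed_down_join: bool = False
    children: List["PlanNode"] = field(default_factory=list)


def epq_alternative_runners(node: PlanNode) -> List[str]:
    """Return labels of the topmost pushed-down join nodes under `node`."""
    if node.pushed_down_join:
        # The highest such node runs its alternative sub-plan, which covers
        # everything beneath it, so nested nodes are not visited.
        return [node.label]
    runners: List[str] = []
    for child in node.children:
        runners.extend(epq_alternative_runners(child))
    return runners


def build_example_plan() -> PlanNode:
    """Build the LockRows example tree from the message."""
    carol = PlanNode("CarolJoin", True,
                     [PlanNode("SeqScan t2"), PlanNode("SeqScan t3")])
    alice = PlanNode("AliceJoin", True, [PlanNode("SeqScan t1"), carol])
    remote = PlanNode("ForeignScan ft5,ft6", True)
    bob = PlanNode("BobJoin", True, [PlanNode("SeqScan t4"), remote])
    hash_join = PlanNode("HashJoin",
                         children=[alice, PlanNode("Hash", children=[bob])])
    return PlanNode("LockRows", children=[hash_join])


if __name__ == "__main__":
    print(epq_alternative_runners(build_example_plan()))
```

Under these assumptions the sketch selects AliceJoin and BobJoin, while
CarolJoin and the ForeignScan are skipped because a higher node already
covers them, which matches the description above.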