Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Foreign join pushdown vs EvalPlanQual |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F80113C052@BPXM15GP.gisp.nec.co.jp Whole thread Raw |
In response to | Re: Foreign join pushdown vs EvalPlanQual (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
> , On Wed, Aug 26, 2015 at 4:05 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > >> On 2015/08/26 16:07, Kouhei Kaigai wrote: > >> I wrote: > >> >> Maybe I'm missing something, but why do we need such a flexiblity for > >> >> the columnar-stores? > >> > >> > Even if we enforce them a new interface specification comfortable to RDBMS, > >> > we cannot guarantee it is also comfortable to other type of FDW drivers. > >> > >> Specifically, what kind of points about the patch are specific to RDBMS? > >> > > > > *** 88,93 **** ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot) > > --- 99,122 ---- > > TupleTableSlot * > > ExecForeignScan(ForeignScanState *node) > > { > > + EState *estate = node->ss.ps.state; > > + > > + if (estate->es_epqTuple != NULL) > > + { > > + /* > > + * We are inside an EvalPlanQual recheck. If foreign join, > get next > > + * tuple from subplan. > > + */ > > + Index scanrelid = ((Scan *) > node->ss.ps.plan)->scanrelid; > > + > > + if (scanrelid == 0) > > + { > > + PlanState *outerPlan = outerPlanState(node); > > + > > + return ExecProcNode(outerPlan); > > + } > > + } > > + > > return ExecScan((ScanState *) node, > > (ExecScanAccessMtd) ForeignNext, > > (ExecScanRecheckMtd) > ForeignRecheck); > > > > It might not be specific to RDBMS, however, we cannot guarantee all the FDW > are > > comfortable to run the alternative plan node on EPQ recheck. > > This design does not allow FDW drivers to implement own EPQ recheck, possibly > > more efficient than built-in logic. > > I'm not convinced that this problem is more than hypothetical. EPQ > rechecks should be quite rare, so it shouldn't really matter if we > jump through a few extra hoops when they happen. And really, are > those hoops all that expensive? It's not as if ExecInitNode should be > doing any sort of expensive operation, or ExecEndScan either. And > they will be able to tell if they're being called for an EPQ-recheck > by fishing out the estate, so if there's some processing that they > want to short-circuit for that case, they can. So I'm not seeing the > problem. Do you have any evidence that either the performance cost or > the code complexity cost is significant for PG-Strom or any other > extension? > Even though PG-Strom does not implement EPQ recheck mechanism yet (and not implemented on top of FDW), I plan to re-use CPU fallback mechanism (*1) rather than having alternative plan approach. I also don't care about performance penalty, however, don't want to have alternative plan because of code complexity. I don't deny individual extensions have alternative path by their decision, but should not be enforced. (*1) GPU often cannot execute expression because of exceptional data like very long numeric or external toast etc..., but to be executable. In this case, PG-Strom evaluates this expression in the CPU side (of course, it is worse than normal execution path but better than error). This logic is almost same as what we need on EPQ recheck. > I'm wondering if there's another approach. If I understand correctly, > there are two reasons why the current situation is untenable. The > first is that ForeignRecheck always returns true, but we could instead > call an FDW-supplied callback routine there. The callback could be > optional, so that we just return true if there is none, which is nice > for already-existing FDWs that then don't need to do anything. The > second is that ExecScanFetch needs scanrelid > 0 so that > estate->es_epqTupleSet[scanrelid - 1] isn't indexing off the beginning > of the array, and similarly estate->es_epqScanDone[scanrelid - 1] and > estate->es_epqTuple[scanrelid - 1]. But, waving my hands wildly, that > also seems like a solvable problem. I mean, we're joining a non-empty > set of relations, so the entries in the EPQ-related arrays for those > RTIs are not getting used for anything, so we can use any of them for > the joinrel. We need some way for this code to decide what RTI to > use, but that shouldn't be too hard to finagle. > ForeignScan->fs_relids and CustomScan->custom_relids know which RTIs shall be involved in this joinrel. However, only extension know how these relations (including the case of N-way join) shall be joined. FDW drivers may keep joinrestrictinfo in their comfortable way, like a compiled GPU native binary, so I don't think core side can do something relevant reasonably. Even though Fujita-san proposed a new special fields in ForeignScan to attach expression node that was pushed down, however, it looks to me interface contract makes more complicated. Rather than various special purpose fields, it is more straightforward to call back extension when scanrelid==0. We can provide equivalent feature as a utility function that has capability Fujita-san wants. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
pgsql-hackers by date: