Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 56570248.7030306@lab.ntt.co.jp
Whole thread Raw
In response to Re: Foreign join pushdown vs EvalPlanQual  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2015/11/26 14:04, Kouhei Kaigai wrote:
>> On 2015/11/24 2:41, Robert Haas wrote:
>>> On Fri, Nov 20, 2015 at 12:11 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>>>> One subplan means FDW driver run an entire join sub-tree with local
>>>> alternative sub-plan; that is my expectation for the majority case.

>>> What I'm imagining is that we'd add handling that allows the
>>> ForeignScan to have inner and outer children.  If the FDW wants to
>>> delegate the EvalPlanQual handling to a local plan, it can use the
>>> outer child for that.  Or the inner one, if it likes.  The other one
>>> is available for some other purposes which we can't imagine yet.  If
>>> this is too weird, we can only add handling for an outer subplan and
>>> forget about having an inner subplan for now.  I just thought to make
>>> it symmetric, since outer and inner subplans are pretty deeply baked
>>> into the structure of the system.

>> I'd vote for only allowing an outer subplan.

> The attached patch adds: Path *fdw_outerpath field to ForeignPath node.
> FDW driver can set arbitrary but one path-node here.
> After that, this path-node shall be transformed to plan-node by
> createplan.c, then passed to FDW driver using GetForeignPlan callback.

I understand this, as I also did the same thing in my patches, but 
actually, that seems a bit complicated to me.  Instead, could we keep 
the fdw_outerpath in the fdw_private of a ForeignPath node when creating 
the path node during GetForeignPaths, and then create an outerplan 
accordingly from the fdw_outerpath stored into the fdw_private during 
GetForeignPlan, by using create_plan_recurse there?  I think that that 
would make the core involvment much simpler.

> We expect FDW driver set this plan-node on lefttree (a.k.a outerPlan).
> The Plan->outerPlan is a common field, so patch size become relatively
> small. FDW driver can initialize this plan at BeginForeignScan, then
> execute this sub-plan-tree on demand.

Another idea would be to add the core support for 
initializing/closing/rescanning the outerplan tree when the tree is given.

> Remaining portions are as previous version. ExecScanFetch is revised
> to call recheckMtd always when scanrelid==0, then FDW driver can get
> control using RecheckForeignScan callback.
> It allows FDW driver to handle (1) EPQ recheck on underlying scan nodes,
> (2) reconstruction of joined tuple, and (3) EPQ recheck on join clauses,
> by its preferable implementation - including execution of an alternative
> sub-plan.

@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot 
*slot)
     ResetExprContext(econtext);

+    /*
+     * FDW driver has to recheck visibility of EPQ tuple towards
+     * the scan qualifiers once it gets pushed down.
+     * In addition, if this node represents a join sub-tree, not
+     * a scan, FDW driver is also responsible to reconstruct
+     * a joined tuple according to the primitive EPQ tuples.
+     */
+    if (fdwroutine->RecheckForeignScan)
+    {
+        if (!fdwroutine->RecheckForeignScan(node, slot))
+            return false;
+    }

Maybe I'm missing something, but I think we should let FDW do the work 
if scanrelid==0, not just if fdwroutine->RecheckForeignScan is given. 
(And if scanrelid==0 and fdwroutine->RecheckForeignScan is not given, we 
should abort the transaction.)

Another thing I'm concerned about is

@@ -347,8 +355,26 @@ ExecScanReScan(ScanState *node)     {         Index        scanrelid = ((Scan *)
node->ps.plan)->scanrelid;

-        Assert(scanrelid > 0);
+        if (scanrelid > 0)
+            estate->es_epqScanDone[scanrelid - 1] = false;
+        else
+        {
+            Bitmapset  *relids;
+            int            rtindex = -1;
+
+            if (IsA(node->ps.plan, ForeignScan))
+                relids = ((ForeignScan *) node->ps.plan)->fs_relids;
+            else if (IsA(node->ps.plan, CustomScan))
+                relids = ((CustomScan *) node->ps.plan)->custom_relids;
+            else
+                elog(ERROR, "unexpected scan node: %d",
+                     (int)nodeTag(node->ps.plan));

-        estate->es_epqScanDone[scanrelid - 1] = false;
+            while ((rtindex = bms_next_member(relids, rtindex)) >= 0)
+            {
+                Assert(rtindex > 0);
+                estate->es_epqScanDone[rtindex - 1] = false;
+            }
+        }     }

That seems the outerplan's business to me, so I think it'd be better to 
just return, right before the assertion, as I said before.  Seen from 
another angle, ISTM that FDWs that don't use a local join execution plan 
wouldn't need to be aware of handling the es_epqScanDone flags.  (Do you 
think that such FDWs should do something like what ExecScanFtch is doing 
about the flags, in their RecheckForeignScans?  If so, I think we need 
docs for that.)

>> There seems to be no changes to make_foreignscan.  Is that OK?

> create_foreignscan_path(), not only make_foreignscan().

OK

> This patch is not tested by actual FDW extensions, so it is helpful
> to enhance postgres_fdw to run the alternative sub-plan on EPQ recheck.

Will do.

Best regards,
Etsuro Fujita




pgsql-hackers by date:

Previous
From: Rushabh Lathia
Date:
Subject: Re: Getting sorted data from foreign server for merge join
Next
From: Magnus Hagander
Date:
Subject: Re: pg_stat_replication log positions vs base backups