Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 56679932.9000206@lab.ntt.co.jp
Whole thread Raw
In response to Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2015/12/09 1:13, Robert Haas wrote:
> On Tue, Dec 8, 2015 at 5:49 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> I'd like to discuss the next
>> thing about his patch.  As I mentioned in [1], the following change in
>> the patch will break the EXPLAIN output.
>>
>> @@ -205,6 +218,11 @@ ExecInitForeignScan(ForeignScan *node, EState
>> *estate, int eflags)
>>          scanstate->fdwroutine = fdwroutine;
>>          scanstate->fdw_state = NULL;
>>
>> +       /* Initialize any outer plan. */
>> +       if (outerPlanState(scanstate))
>> +               outerPlanState(scanstate) =
>> +                       ExecInitNode(outerPlan(node), estate, eflags);
>> +
>>
>> As pointed out by Horiguchi-san, that's not correct, though; we should
>> initialize the outer plan if outerPlan(node) != NULL, not
>> outerPlanState(scanstate) != NULL.  Attached is an updated version of
>> his patch.

>> I'm also attaching an updated version of the postgres_fdw
>> join pushdown patch.

> Is that based on Ashutosh's version of the patch, or are the two of
> you developing independent of each other?  We should avoid dueling
> patches if possible.

That's not based on his version.  I'll add to his patch changes I've 
made.  IIUC, his version is an updated version of Hanada-san's original 
patches that I've modified, so I guess that I could do that easily. 
(I've added a helper function for creating a local join execution plan 
for a given foreign join, but that is a rush work.  So, I'll rewrite that.)

>> You can find the breaking examples by doing the
>> regression tests in the postgres_fdw patch.  Please apply the patches in
>> the following order:
>>
>> epq-recheck-v6-efujita (attached)
>> usermapping_matching.patch in [2]
>> add_GetUserMappingById.patch in [2]
>> foreign_join_v16_efujita2.patch (attached)
>>
>> As I proposed upthread, I think we could fix that by handling the outer
>> plan as in the patch [3]; a) the core initializes the outer plan and
>> stores it into somewhere in the ForeignScanState node, not the lefttree
>> of the ForeignScanState node, during ExecInitForeignScan, and b) when
>> the RecheckForeignScan routine gets called, the FDW extracts the plan
>> from the given ForeignScanState node and executes it.  What do you think
>> about that?

> I think the actual regression test outputs are fine, and that your
> desire to suppress part of the plan tree from showing up in the
> EXPLAIN output is misguided.  I like it just the way it is.  To
> prevent user confusion, I think that when we add support to
> postgres_fdw for this we might also want to add some documentation
> explaining how to interpret this EXPLAIN output, but I don't think
> there's any problem with the output itself.

I'm not sure that that's a good idea.  one reason for that is I think 
that that would be more confusing to users when more than two foreign 
tables are involved in a foreign join as shown in the following example.  Note that the outer plans will be shown
recursively. Another reason 
 
is there is no consistency between the costs of the outer plans and that 
of the main plan.

postgres=# explain verbose select * from foo, bar, baz where foo.a = 
bar.a and bar.a = baz.a for update;
                                                             QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------
LockRows (cost=100.00..100.45 rows=15 width=96)   Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*   ->  Foreign Scan
(cost=100.00..100.30rows=15 width=96)         Output: foo.a, bar.a, baz.a, foo.*, bar.*, baz.*         Relations:
((public.foo)INNER JOIN (public.bar)) INNER JOIN 
 
(public.baz)         Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1, r.a2 FROM 
(SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT l.a9, ROW(l.a9) FROM (SELECT 
a a9 FROM p
ublic.foo FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT r.a9, ROW(r.a9) 
FROM (SELECT a a9 FROM public.bar FOR UPDATE) r) r (a1, a2) ON ((l.a1 = 
r.a1))) l (a1, a2, a3, a4) INNER JOIN (SELECT r.a9, ROW(r.a9) FROM (SELECT a a9 
FROM public.baz FOR UPDATE) r) r (a1, a2) ON ((l.a1 = r.a1))         ->  Hash Join  (cost=272.13..272.69 rows=15
width=96)              Output: foo.a, foo.*, bar.a, bar.*, baz.a, baz.*               Hash Cond: (foo.a = baz.a)
      ->  Foreign Scan  (cost=100.00..100.04 rows=2 width=64)                     Output: foo.a, foo.*, bar.a, bar.*
                Relations: (public.foo) INNER JOIN (public.bar)                     Remote SQL: SELECT l.a1, l.a2,
r.a1,r.a2 FROM 
 
(SELECT l.a9, ROW(l.a9) FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) 
l (a1, a2)
INNER JOIN (SELECT r.a9, ROW(r.a9) FROM (SELECT a a9 FROM public.bar FOR 
UPDATE) r) r (a1, a2) ON ((l.a1 = r.a1))                     ->  Nested Loop  (cost=200.00..202.18 rows=2 width=64)
                     Output: foo.a, foo.*, bar.a, bar.*                           Join Filter: (foo.a = bar.a)
                ->  Foreign Scan on public.foo 
 
(cost=100.00..101.06 rows=2 width=32)                                 Output: foo.a, foo.*
  Remote SQL: SELECT a FROM public.foo 
 
FOR UPDATE                           ->  Materialize  (cost=100.00..101.07 rows=2 
width=32)                                 Output: bar.a, bar.*                                 ->  Foreign Scan on
public.bar
 
(cost=100.00..101.06 rows=2 width=32)                                       Output: bar.a, bar.*
              Remote SQL: SELECT a FROM 
 
public.bar FOR UPDATE               ->  Hash  (cost=153.86..153.86 rows=1462 width=32)                     Output:
baz.a,baz.*                     ->  Foreign Scan on public.baz 
 
(cost=100.00..153.86 rows=1462 width=32)                           Output: baz.a, baz.*
RemoteSQL: SELECT a FROM public.baz FOR UPDATE
 
(29 rows)

Best regards,
Etsuro Fujita





pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Using quicksort for every external sort run
Next
From: Peter Geoghegan
Date:
Subject: Re: Using quicksort for every external sort run