Re: Push down more full joins in postgres_fdw - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Push down more full joins in postgres_fdw
Date
Msg-id c6a68ef1-3e97-807c-b381-087d109739eb@lab.ntt.co.jp
Whole thread Raw
In response to Re: Push down more full joins in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Push down more full joins in postgres_fdw  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On 2016/11/24 21:45, Etsuro Fujita wrote:
> On 2016/11/22 18:28, Ashutosh Bapat wrote:
>> The comments should explain why is the assertion true.
>> +        /* Shouldn't be NIL */
>> +        Assert(tlist != NIL);

> I noticed that I was wrong; in the Assertion the tlist can be empty.  An
> example for such a case is:
>
> SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL
> JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE);
>
> In this example any columns of the relations ft4 and ft5 wouldn't be
> needed for the join or the final output, so both the tlists for the
> relations created in deparseRangeTblRef would be empty.  Attached is an
> updated version fixing this bug.  Changes are:
>
> * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo
> and added a new member "is_subquery_rel" to fpinfo, as a flag on whether
> to deparse the relation as a subquery.
> * I modified deparseRangeTblRef, deparseSelectSql, and is_subquery_var
> to see the is_subquery_rel, not
> make_outerrel_subquery/make_innerrel_subquery.
> * I modified appendSubselectAlias to handle the case where ncols = 0
> properly.
> * I renamed subquery_rels in fpinfo to lower_subquery_rels, to make it
> easy to distinguish this from is_subquery_rel clearly.
> * I added regression tests for that.

I noticed that the above changes are not adequate; the updated patch
breaks EXPLAIN outputs for EvalPlanQual subplans of foreign joins in
which foreign tables are full joined and in the remote join query the
foreign tables are deparsed as subqueries.  Consider:

postgres=# explain verbose select * from test, ((select * from ft1 where
b between 1 and 3) t1 full join (select * from ft2 where b between 1 and
3) t2
  on (t1.a = t2.a)) ss for update of test;


QUERY PLA
N

-------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------------------
  LockRows  (cost=100.00..287.33 rows=6780 width=94)
    Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid,
ft1.*, ft2.*
    ->  Nested Loop  (cost=100.00..219.53 rows=6780 width=94)
          Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid,
ft1.*, ft2.*
          ->  Seq Scan on public.test  (cost=0.00..32.60 rows=2260 width=14)
                Output: test.a, test.b, test.ctid
          ->  Materialize  (cost=100.00..102.19 rows=3 width=80)
                Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
                ->  Foreign Scan  (cost=100.00..102.17 rows=3 width=80)
                      Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
                      Relations: (public.ft1) FULL JOIN (public.ft2)
                      Remote SQL: SELECT s5.c1, s5.c2, s5.c3, s6.c1,
s6.c2, s6.c3 FROM ((SELECT a, b, ROW(a, b) FROM public.t1 WHERE ((b >=
1)) AND ((b
<= 3))) s5(c1, c2, c3) FULL JOIN (SELECT a, b, ROW(a, b) FROM public.t2
WHERE ((b >= 1)) AND ((b <= 3))) s6(c1, c2, c3) ON (((s5.c1 = s6.c1))))
                      ->  Hash Full Join  (cost=201.14..202.29 rows=3
width=80)
                            Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
                            Hash Cond: (ft1.a = ft2.a)
                            ->  Foreign Scan on public.ft1
(cost=100.00..101.11 rows=3 width=40)
                                  Output: ft1.a, ft1.b, ft1.*
                                  Remote SQL: SELECT NULL FROM public.t1
WHERE ((b >= 1)) AND ((b <= 3))
                            ->  Hash  (cost=101.11..101.11 rows=3 width=40)
                                  Output: ft2.a, ft2.b, ft2.*
                                  ->  Foreign Scan on public.ft2
(cost=100.00..101.11 rows=3 width=40)
                                        Output: ft2.a, ft2.b, ft2.*
                                        Remote SQL: SELECT NULL FROM
public.t2 WHERE ((b >= 1)) AND ((b <= 3))
(23 rows)

The SELECT clauses of the remote queries for the Foreign Scans in the
EPQ subplan are deparsed incorrectly into "SELECT NULL".  The reason for
that is that since the foreign tables' fpinfo->is_subquery_rel has been
set true at the time those clauses are deparsed (due to that the foreign
tables have been deparsed as subqueries in the main remote join query),
deparseSelectSql calls deparseExplicitTargetList with an empty tlist, to
construct the clauses, so it deparses the tlist into "NULL".  This is
harmless because we don't use the remote queries for the Foreign Scans
during an EPQ recheck, but that would be confusing for users, so I
modified the patch a bit further to make the EXPLAIN output as-is.
Attached is a new version of the patch.

> I also revised code and comments a bit, just for consistency.

Also, I renamed appendSubselectAlias to appendSubqueryAlias, because we
use the term "subquery" for other places.

> I will update another patch for PHVs on top of the attached one.

Done.  I also revised some code and comments.  I'm also attaching the
updated version of the patch for PHVs.

Best regards,
Etsuro Fujita

Attachment

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Radix tree for character conversion
Next
From: Masahiko Sawada
Date:
Subject: Re: Physical append-only tables