Re: Oddity in EXPLAIN for foreign/custom join pushdown plans - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Oddity in EXPLAIN for foreign/custom join pushdown plans |
Date | |
Msg-id | edd1ee88-d7c9-583e-6c0a-bc7d50e40522@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Oddity in EXPLAIN for foreign/custom join pushdown plans (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: Oddity in EXPLAIN for foreign/custom join pushdown plans
Re: Oddity in EXPLAIN for foreign/custom join pushdown plans |
List | pgsql-hackers |
On 2016/07/29 13:05, Etsuro Fujita wrote: > In a foreign-join case, > however, we can't see such relations from the EXPLAIN printed *by core*. > postgres_fdw avoids this issue by adding such relations to the EXPLAIN > using ExplainForeignScan as shown in the below example, but since such > relations are essential, I think that information should be shown by > core itself. > > postgres=# explain select * from (select ft1.a from ft1 left join ft2 on > ft1.a = ft2.a where ft1.b = 1) ss1(a) full join (select ft3.a from ft3 > left join ft4 on ft3.a = ft4.a where ft3.b = 1) ss2(a) on ss1.a = ss2.a; > QUERY PLAN > ---------------------------------------------------------------- > Hash Full Join (cost=202.06..204.12 rows=1 width=8) > Hash Cond: (ft1.a = ft3.a) > -> Foreign Scan (cost=100.00..102.05 rows=1 width=4) > Relations: (public.ft1) LEFT JOIN (public.ft2) > -> Hash (cost=102.05..102.05 rows=1 width=4) > -> Foreign Scan (cost=100.00..102.05 rows=1 width=4) > Relations: (public.ft3) LEFT JOIN (public.ft4) > (7 rows) > > From the Relations line shown by postgres_fdw, we can see which foreign > join joins which foreign tables, but if no such lines, we couldn't. I thought about the Relations line a bit more and noticed that there are cases where the table reference names for joining relations in the Relations line are printed incorrectly. Here is an example: postgres=# explain verbose select * from (select t1.a, t2.a from ft1 t1, ft2 t2 where t1.a = t2.a union select t1.a, t2.a from ft1 t1, ft2 t2 where t1.a = t2.a) as t(t1a, t2a); QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Unique (cost=204.12..204.13 rows=2 width=8) Output: t1.a, t2.a -> Sort (cost=204.12..204.12 rows=2 width=8) Output: t1.a, t2.a Sort Key: t1.a, t2.a -> Append (cost=100.00..204.11 rows=2 width=8) -> Foreign Scan (cost=100.00..102.04 rows=1 width=8) Output: t1.a, t2.a Relations:(public.ft1 t1) INNER JOIN (public.ft2 t2) Remote SQL: SELECT r1.a, r2.a FROM (public.t1 r1 INNER JOIN public.t2 r2 ON (((r1.a = r2.a)))) -> Foreign Scan (cost=100.00..102.04 rows=1 width=8) Output: t1_1.a, t2_1.a Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2) Remote SQL: SELECT r1.a, r2.a FROM (public.t1 r1 INNER JOIN public.t2 r2 ON (((r1.a = r2.a)))) (14 rows) The table reference names for ft1 and ft2 in the Relations line for the second Foreign Scan should be t1_1 and t2_1 respectively. Another concern about the Relations line is, that represents just an internal representation of a pushed-down join, so that would not necessarily match a deparsed query shown in the Remote SQL line. Here is an example, which I found when working on supporting pushing down full outer join a lot more, by improving the deparsing logic so that postgres_fdw can build a remote query that involves subqueries [1], which I'll work on for 10.0: + -- full outer join with restrictions on the joining relations + EXPLAIN (COSTS false, VERBOSE) + SELECT t1.c1, t2.c1 FROM (SELECT c1 FROM ft4 WHERE c1 BETWEEN 50 AND 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 BETWEEN 50 AND 60) t2 ON (t1.c1 = t2.c1) ORDER BY t1.c1, t2.c1; + QUERY PLAN + --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan + Output: ft4.c1, ft5.c1 + Relations: (public.ft4) FULL JOIN (public.ft5) + Remote SQL: SELECT ss1.c1, ss2.c1 FROM ((SELECT c1 FROM "S 1"."T 3" WHERE ((c1 >= 50)) AND ((c1 <= 60))) ss1(c1) FULL JOIN (SELECT c1 FROM "S 1"."T 4" WHERE ((c1 >= 50)) AND ((c1 <= 60))) ss2(c1) ON (((ss1.c1 = ss2.c1)))) ORDER BY ss1.c1 ASC NULLS LAST, ss2.c1 ASC NULLS LAST + (4 rows) "(public.ft4) FULL JOIN (public.ft5)" in the Relations line does not exactly match the deparsed query in the Remote SQL line, which I think would be rather confusing for users. (We may be able to print more exact information in the Relations line so as to match the depaserd query, but I think that that would make the Relations line redundant.) Would we really need the Relations line? If joining relations are printed by core like "Foreign Join on public.ft1 t1_1, public.ft2 t2_1" as proposed upthread, we can see those relations from that, not the Relations line. Also we can see the join tree structure from the deparsed query in the Remote SQL line. The Relations line seems to be not that useful anymore, then. What do you think about that? Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5710D7E2.7010302%40lab.ntt.co.jp
pgsql-hackers by date: