Re: Add semi-join pushdown to postgres_fdw - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Add semi-join pushdown to postgres_fdw |
Date | |
Msg-id | CAExHW5u+wy7YAp61o-QJ=p8CMjZCZbcin=_xh7bYjeyJV1a1pg@mail.gmail.com Whole thread Raw |
In response to | Add semi-join pushdown to postgres_fdw (Alexander Pyhalov <a.pyhalov@postgrespro.ru>) |
Responses |
Re: Add semi-join pushdown to postgres_fdw
|
List | pgsql-hackers |
Hi Alexander, Thanks for working on this. It's great to see FDW join pushdown scope being expanded to more complex cases. I am still figuring out the implementation. It's been a while I have looked at join push down code. But following change strikes me odd -- subquery using immutable function (can be sent to remote) PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1; EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20); - QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ - Sort + QUERY PLAN +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Sort Key: t1.c1 - -> Nested Loop Semi Join - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Join Filter: (t1.c3 = t2.c3) - -> Foreign Scan on public.ft1 t1 - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) - -> Materialize - Output: t2.c3 - -> Foreign Scan on public.ft2 t2 - Output: t2.c3 - Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date)) -(14 rows) + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3)))) ORDER BY r1."C 1" ASC NULLS LAST +(4 rows) date_in | s | 1 | [0:0]={cstring} date_in which will be used to cast a test to date is not immutable. So the query should't be pushed down. May not be a problem with your patch. Can you please check? On Wed, Aug 24, 2022 at 12:55 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote: > > Hi. > > It's possible to extend deparsing in postgres_fdw, so that we can push > down semi-joins, which doesn't refer to inner reltarget. This allows > us to push down joins in queries like > > SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND t1.c3 IN (SELECT c3 FROM ft2 > t2 WHERE date(c5) = '1970-01-17'::date); > > > EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND > t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE date(c5) = '1970-01-17'::date); > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Foreign Scan > Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 > Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) > Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, > r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 10)) AND (EXISTS > (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((date(r3.c5) = > '1970-01-17'::date)) AND ((r1.c3 = r3.c3)))) > Thanks for working on this. It's great to see FDW join pushdown scope being expanded to more complex cases. I am still figuring out the implementation. It's been a while I have looked at join push down code. But following change strikes me odd -- subquery using immutable function (can be sent to remote) PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) = '1970-01-17'::date) ORDER BY c1; EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20); - QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ - Sort + QUERY PLAN +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Foreign Scan Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Sort Key: t1.c1 - -> Nested Loop Semi Join - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Join Filter: (t1.c3 = t2.c3) - -> Foreign Scan on public.ft1 t1 - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) - -> Materialize - Output: t2.c3 - -> Foreign Scan on public.ft2 t2 - Output: t2.c3 - Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date)) -(14 rows) + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3)))) ORDER BY r1."C 1" ASC NULLS LAST +(4 rows) date_in | s | 1 | [0:0]={cstring} date_in which will be used to cast a test to date is not immutable. So the query should't be pushed down. May not be a problem with your patch. Can you please check? -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: