Re: Add semi-join pushdown to postgres_fdw - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Add semi-join pushdown to postgres_fdw
Date
Msg-id CAExHW5u+wy7YAp61o-QJ=p8CMjZCZbcin=_xh7bYjeyJV1a1pg@mail.gmail.com
Whole thread Raw
In response to Add semi-join pushdown to postgres_fdw  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
Responses Re: Add semi-join pushdown to postgres_fdw
List pgsql-hackers
Hi Alexander,
Thanks for working on this. It's great to see FDW join pushdown scope
being expanded to more complex cases.

I am still figuring out the implementation. It's been a while I have
looked at join push down code.

But following change strikes me odd
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
'1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
- Sort
+

QUERY PLAN

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
-         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-         Join Filter: (t1.c3 = t2.c3)
-         ->  Foreign Scan on public.ft1 t1
-               Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
-         ->  Materialize
-               Output: t2.c3
-               ->  Foreign Scan on public.ft2 t2
-                     Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE
(("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
-(14 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3)))) ORDER BY
r1."C 1" ASC NULLS LAST
+(4 rows)

date_in              | s           |        1 | [0:0]={cstring}
date_in which will be used to cast a test to date is not immutable. So
the query should't be pushed down. May not be a problem with your
patch. Can you please check?

On Wed, Aug 24, 2022 at 12:55 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
>
> Hi.
>
> It's possible to extend deparsing in postgres_fdw, so that we can push
> down semi-joins, which doesn't refer to inner reltarget. This allows
> us to push down joins in queries like
>
> SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND t1.c3 IN (SELECT c3 FROM ft2
> t2 WHERE date(c5) = '1970-01-17'::date);
>
>
> EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND
> t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE date(c5) = '1970-01-17'::date);
>
>                                          QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Foreign Scan
>     Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>     Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
>     Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 10)) AND (EXISTS
> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((date(r3.c5) =
> '1970-01-17'::date)) AND ((r1.c3 = r3.c3))))
>

Thanks for working on this. It's great to see FDW join pushdown scope
being expanded to more complex cases.

I am still figuring out the implementation. It's been a while I have
looked at join push down code.

But following change strikes me odd
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
'1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
- Sort
+

QUERY PLAN

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
-         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-         Join Filter: (t1.c3 = t2.c3)
-         ->  Foreign Scan on public.ft1 t1
-               Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
-         ->  Materialize
-               Output: t2.c3
-               ->  Foreign Scan on public.ft2 t2
-                     Output: t2.c3
-                     Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE
(("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
-(14 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3)))) ORDER BY
r1."C 1" ASC NULLS LAST
+(4 rows)

date_in              | s           |        1 | [0:0]={cstring}
date_in which will be used to cast a test to date is not immutable. So
the query should't be pushed down. May not be a problem with your
patch. Can you please check?




--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Reducing the chunk header sizes on all memory context types
Next
From: Robert Haas
Date:
Subject: Re: replacing role-level NOINHERIT with a grant-level option