postgres_fdw planning issue: EquivalenceClass changes confuse it - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | postgres_fdw planning issue: EquivalenceClass changes confuse it |
Date | |
Msg-id | 1691374.1671659838@sss.pgh.pa.us Whole thread Raw |
List | pgsql-hackers |
I discovered that if you do this: diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 2e6f7f4852..2ae231fd90 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -366,7 +366,7 @@ CREATE FUNCTION postgres_fdw_abs(int) RETURNS int AS $$ BEGIN RETURN abs($1); END -$$ LANGUAGE plpgsql IMMUTABLE; +$$ LANGUAGE plpgsql IMMUTABLE STRICT; CREATE OPERATOR === ( LEFTARG = int, RIGHTARG = int, one of the plan changes that you get (attached) is that this query: EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM local_tbl LEFT JOIN (SELECT ft1.* FROM ft1 INNER JOIN ft2 ON (ft1.c1 = ft2.c1 AND ft1.c1 < 100 AND ft1.c1 =postgres_fdw_abs(ft2.c2))) ss ON (local_tbl.c3 = ss.c3) ORDER BY local_tbl.c1 FOR UPDATE OF local_tbl; can no longer do the join as a foreign join. There are some other changes that are more explicable, because strictness of the function allows the planner to strength-reduce full joins to left joins, but what happened here? The answer is that once postgres_fdw_abs() is marked strict, the EquivalenceClass machinery will group these clauses as an EquivalenceClass: ft1.c1 = ft2.c1 AND ft1.c1 = postgres_fdw_abs(ft2.c2) which it will then choose to implement as a restriction clause on ft2 ft2.c1 = postgres_fdw_abs(ft2.c2) followed by a join clause ft1.c1 = ft2.c1 This is a good and useful transformation, because it can get rid of ft2 rows at the scan level instead of waiting for them to be joined. However, because we are treating postgres_fdw_abs() as non-shippable in this particular test case, that means that ft2 now has a non-shippable restriction clause, causing foreign_join_ok to give up here: /* * If joining relations have local conditions, those conditions are * required to be applied before joining the relations. Hence the join can * not be pushed down. */ if (fpinfo_o->local_conds || fpinfo_i->local_conds) return false; In the other formulation, "ft1.c1 = postgres_fdw_abs(ft2.c2)" is a non-shippable join clause, which foreign_join_ok knows how to cope with. So this seems like a fairly nasty asymmetry. I ran into this while experimenting with the next phase in my outer-join-vars patch set, in which the restriction that below-outer-join Equivalence classes contain only strict members will go away. So that breaks this test, and I need to either fix postgres_fdw or change the test case. I experimented with teaching foreign_join_ok to pull up the child rels' local_conds to be join local_conds if the join is an inner join, which seems like a legal transformation. I ran into a couple of issues though, the hardest of which to solve is that in DML queries we get "variable not found in subplan target lists" failures while trying to build some EPQ queries. That's because the pulled-up condition uses a variable that we didn't think we'd need at the join level. That could possibly be fixed by handling these conditions differently for the transmitted query than the EPQ query, but I'm not sufficiently familiar with the postgres_fdw code to want to take point on coding that. In any case, this line of thought would lead to several other plan changes in the postgres_fdw regression tests, and I'm not sure if any of those would be breaking the intent of the test cases. Or I could just hack this one test so that it continues to not be an EquivalenceClass case. Thoughts? regards, tom lane diff -U3 /home/postgres/pgsql/contrib/postgres_fdw/expected/postgres_fdw.out /home/postgres/pgsql/contrib/postgres_fdw/results/postgres_fdw.out --- /home/postgres/pgsql/contrib/postgres_fdw/expected/postgres_fdw.out 2022-12-21 16:28:13.601965240 -0500 +++ /home/postgres/pgsql/contrib/postgres_fdw/results/postgres_fdw.out 2022-12-21 16:30:20.152135336 -0500 @@ -923,7 +923,7 @@ BEGIN RETURN abs($1); END -$$ LANGUAGE plpgsql IMMUTABLE; +$$ LANGUAGE plpgsql IMMUTABLE STRICT; CREATE OPERATOR === ( LEFTARG = int, RIGHTARG = int, @@ -1828,28 +1828,44 @@ -- full outer join + WHERE clause with shippable extensions set EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.c1, t2.c2, t1.c3 FROM ft1 t1 FULL JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE postgres_fdw_abs(t1.c1) > 0 OFFSET 10 LIMIT10; - QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - Foreign Scan + QUERY PLAN +---------------------------------------------------------------------------------------------------------- + Limit Output: t1.c1, t2.c2, t1.c3 - Relations: (public.ft1 t1) FULL JOIN (public.ft2 t2) - Remote SQL: SELECT r1."C 1", r2.c2, r1.c3 FROM ("S 1"."T 1" r1 FULL JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1"))))WHERE ((public.postgres_fdw_abs(r1."C 1") > 0)) LIMIT 10::bigint OFFSET 10::bigint -(4 rows) + -> Hash Left Join + Output: t1.c1, t2.c2, t1.c3 + Hash Cond: (t1.c1 = t2.c1) + -> Foreign Scan on public.ft1 t1 + Output: t1.c1, t1.c3 + Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1" WHERE ((public.postgres_fdw_abs("C 1") > 0)) + -> Hash + Output: t2.c2, t2.c1 + -> Foreign Scan on public.ft2 t2 + Output: t2.c2, t2.c1 + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" +(13 rows) ALTER SERVER loopback OPTIONS (DROP extensions); -- full outer join + WHERE clause with shippable extensions not set EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.c1, t2.c2, t1.c3 FROM ft1 t1 FULL JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE postgres_fdw_abs(t1.c1) > 0 OFFSET 10 LIMIT10; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------- Limit Output: t1.c1, t2.c2, t1.c3 - -> Foreign Scan + -> Hash Left Join Output: t1.c1, t2.c2, t1.c3 - Filter: (postgres_fdw_abs(t1.c1) > 0) - Relations: (public.ft1 t1) FULL JOIN (public.ft2 t2) - Remote SQL: SELECT r1."C 1", r2.c2, r1.c3 FROM ("S 1"."T 1" r1 FULL JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C1")))) -(7 rows) + Hash Cond: (t1.c1 = t2.c1) + -> Foreign Scan on public.ft1 t1 + Output: t1.c1, t1.c3 + Filter: (postgres_fdw_abs(t1.c1) > 0) + Remote SQL: SELECT "C 1", c3 FROM "S 1"."T 1" + -> Hash + Output: t2.c2, t2.c1 + -> Foreign Scan on public.ft2 t2 + Output: t2.c2, t2.c1 + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" +(14 rows) ALTER SERVER loopback OPTIONS (ADD extensions 'postgres_fdw'); -- join two tables with FOR UPDATE clause @@ -2514,8 +2530,8 @@ ALTER SERVER loopback OPTIONS (ADD fdw_startup_cost '10000.0'); EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM local_tbl LEFT JOIN (SELECT ft1.* FROM ft1 INNER JOIN ft2 ON (ft1.c1 = ft2.c1 AND ft1.c1 < 100 AND ft1.c1= postgres_fdw_abs(ft2.c2))) ss ON (local_tbl.c3 = ss.c3) ORDER BY local_tbl.c1 FOR UPDATE OF local_tbl; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------------------------- LockRows Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, local_tbl.ctid,ft1.*, ft2.* -> Nested Loop Left Join @@ -2525,30 +2541,19 @@ Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, local_tbl.ctid -> Materialize Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*, ft2.* - -> Foreign Scan + -> Hash Join Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*, ft2.* - Filter: (ft1.c1 = postgres_fdw_abs(ft2.c2)) - Relations: (public.ft1) INNER JOIN (public.ft2) - Remote SQL: SELECT r4."C 1", r4.c2, r4.c3, r4.c4, r4.c5, r4.c6, r4.c7, r4.c8, CASE WHEN (r4.*)::textIS NOT NULL THEN ROW(r4."C 1", r4.c2, r4.c3, r4.c4, r4.c5, r4.c6, r4.c7, r4.c8) END, CASE WHEN (r5.*)::text ISNOT NULL THEN ROW(r5."C 1", r5.c2, r5.c3, r5.c4, r5.c5, r5.c6, r5.c7, r5.c8) END, r5.c2 FROM ("S 1"."T 1" r4 INNER JOIN"S 1"."T 1" r5 ON (((r5."C 1" = r4."C 1")) AND ((r4."C 1" < 100)))) ORDER BY r4.c3 ASC NULLS LAST - -> Sort - Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*, ft2.*, ft2.c2 - Sort Key: ft1.c3 - -> Merge Join - Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*, ft2.*, ft2.c2 - Merge Cond: ((ft1.c1 = (postgres_fdw_abs(ft2.c2))) AND (ft1.c1 = ft2.c1)) - -> Sort - Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.* - Sort Key: ft1.c1 - -> Foreign Scan on public.ft1 - Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.* - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE(("C 1" < 100)) - -> Sort - Output: ft2.*, ft2.c1, ft2.c2, (postgres_fdw_abs(ft2.c2)) - Sort Key: (postgres_fdw_abs(ft2.c2)), ft2.c1 - -> Foreign Scan on public.ft2 - Output: ft2.*, ft2.c1, ft2.c2, postgres_fdw_abs(ft2.c2) - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDERBY "C 1" ASC NULLS LAST -(32 rows) + Hash Cond: (ft2.c1 = ft1.c1) + -> Foreign Scan on public.ft2 + Output: ft2.*, ft2.c1, ft2.c2 + Filter: (ft2.c1 = postgres_fdw_abs(ft2.c2)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" ORDER BY "C 1" ASC NULLSLAST + -> Hash + Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.* + -> Foreign Scan on public.ft1 + Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.* + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) +(21 rows) ALTER SERVER loopback OPTIONS (DROP fdw_startup_cost); ALTER SERVER loopback OPTIONS (ADD extensions 'postgres_fdw');
pgsql-hackers by date: