Thread: Non-trivial condition is only propagated to one side of JOIN
Hi, using `PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit`, I've observed the following behavior: – keep in mind that this example is as simplified as possible, the original query involves foreign tables, and the failure to propagate / push down the condition results in a query plan that basically tries to download the complete foreign table, which is not a feasible execution strategy: Setup: CREATE TABLE tbl1 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id INTEGER NOT NULL, data TEXT); CREATE TABLE tbl2 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id INTEGER NOT NULL, data TEXT); CREATE INDEX ON tbl1 (site_id); CREATE INDEX ON tbl2 (site_id); Working queries: SELECT * FROM tbl1 WHERE tbl1.site_id = 1; -- "trivial condition" SELECT * FROM tbl2 WHERE tbl2.site_id = 1; SELECT * FROM tbl1 WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; -- "non-trivial condition" SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL; 1) Exemplary Query Plan: # EXPLAIN SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL; QUERY PLAN ------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl2 (cost=8.40..19.08 rows=12 width=40) Recheck Cond: ((site_id = 1) OR (site_id IS NULL)) -> BitmapOr (cost=8.40..8.40 rows=12 width=0) -> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id = 1) -> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id IS NULL) (7 rows) The key takeaway is, that the index can be used, because the condition is propagated deep enough. 2) Still working example: # EXPLAIN SELECT * FROM tbl1 LEFT JOIN tbl2 ON tbl2.site_id = tbl1.site_id WHERE tbl1.site_id = 1; QUERY PLAN ------------------------------------------------------------------------------------------- Nested Loop Left Join (cost=8.40..27.80 rows=36 width=80) -> Bitmap Heap Scan on tbl1 (cost=4.20..13.67 rows=6 width=40) Recheck Cond: (site_id = 1) -> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id = 1) -> Materialize (cost=4.20..13.70 rows=6 width=40) -> Bitmap Heap Scan on tbl2 (cost=4.20..13.67 rows=6 width=40) Recheck Cond: (site_id = 1) -> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id = 1) (10 rows) The condition is propagated into BOTH branches of the join. The join could also be an INNER join and might also be realized as a Merge Join or Hash Join: they all behave the same. 3) Problematic example: # EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; QUERY PLAN ------------------------------------------------------------------------------------------------- Hash Join (cost=19.23..46.45 rows=72 width=80) Hash Cond: (tbl2.site_id = tbl1.site_id) -> Seq Scan on tbl2 (cost=0.00..22.00 rows=1200 width=40) -> Hash (cost=19.08..19.08 rows=12 width=40) -> Bitmap Heap Scan on tbl1 (cost=8.40..19.08 rows=12 width=40) Recheck Cond: ((site_id = 1) OR (site_id IS NULL)) -> BitmapOr (cost=8.40..8.40 rows=12 width=0) -> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id = 1) -> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20 rows=6 width=0) Index Cond: (site_id IS NULL) (11 rows) Now, a full seq scan used for tbl2, the condition is only pushed down on ONE side of the JOIN! (with `WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL`, the Seq Scan would have been on tbl1... [not so easily demostrated w/ LEFT JOINs]). Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help, The weird thing is: The subqueries on both sides of the join are perfectly capable of accepting/using the "non-trivial" condition, as demonstrated in 1), and JOINs are generally able to propagate conditions to both sides, as demonstrated in 2). Is there a magic knob to force postgres to do the right thing, or is this basically a bug in the query planner? Tobias
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help,
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Sunday, August 25, 2024, Tobias Hoffmann <ldev-list@thax.hardliners.org> > wrote: >> 3) Problematic example: >> >> # EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id >> WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; > The “is null” predicate in this query is doing nothing as your next comment > alludes to; you will produce no rows out of the join with a null site_id > due to the use of the equals operator in the join. Indeed. This WHERE clause might be useful with a left join to tbl1, but then the IS NULL means something totally different and can *not* be pushed down. > Others may correct me but I’m guessing that indeed the optimizer has a gap > here that could be filled in, it’s just it feels like adding code to deal > with broken queries so isn’t overly motivated to work on. The short answer is that we expend quite a lot of effort to deduce implied equality clauses from combinations of equality clauses; that is, given "WHERE A = B AND B = C" we can deduce "A = C" as well. No such deductions can be made from equality clauses that are buried under an OR, because they might not be true for every join row. Maybe some machinery could be built that would do something useful with an OR clause of this form, but I doubt it would be useful often enough to justify the development effort and additional planner cycles. An important point here is that "WHERE A = B AND p(A)" does not permit us to deduce "p(B)" for arbitrary conditions p(), because we have some equality operators that will return true for values that sort equal but are distinguishable in other ways. (Handy example: in float8 arithmetic, zero and minus zero compare equal, as required by the IEEE float spec.) We presently don't assume that this works for anything other than transitive equality across equality operators of a single btree operator family. In particular, I don't think we could assume it for the given example, because "WHERE A = B AND A IS NULL" is tautologically false. You can deduce anything from a falsehood, so I'm not entirely sure that the proposed optimization is even logically sound. > Joining using > distinct instead of equality is uncommon, since nearly all models join > primary keys to foreign keys and both of those are almost always non-null. Yeah. I'll concede that we probably should work harder on building out planner and executor support for IS NOT DISTINCT FROM. But again, it just seems like a case that is not worth spending large amounts of development time on. regards, tom lane
On Sunday, August 25, 2024, Tobias Hoffmann <ldev-list@thax.hardliners.org> wrote:
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;The “is null” predicate in this query is doing nothing as your next comment alludes to; you will produce no rows out of the join with a null site_id due to the use of the equals operator in the join.
Well, that's why I said: "keep in mind that this example is as simplified as possible"...
Even though `tbl1.site_id = 1` is – in this case – completely equivalent to `tbl1.site_id = 1 OR tbl1.site_id IS NULL` – the first one is completely pushed down, but the second is not.
A more complete example might look more like this:
CREATE VIEW "subview1" AS
SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id WHERE ...;
CREATE VIEW "view1" AS
SELECT site_id, ... FROM subview1 -- maybe even: WHERE site_id IS NOT NULL
UNION ALL
SELECT null, ...;
SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL);
The reason, why the outer query would have a more complicated condition might have nothing to do with the subquery containing the JOIN.
(This is also not a `UNION ALL` special case: `site_id IS NULL` could also be generated by a LEFT JOIN, e.g.)
But not pushing down the condition has the grave impact, that those - otherwise working VIEWs (i.e. subview1) become "unfixably" broken, for certain WHERE-conditions on the outside.
Another reason why I said `site_id INTEGER NOT NULL` and `IS NOT DISTINCT FROM`, is that there might be some mathematical reason I'm not yet aware of where the propagation would not be sound, but which would not apply for arbitrary site_id [nullable, ...].
My primary goal is to find at least *some* way to get the condition pushed further in to avoid the full table scan, and to not have to completely rewrite all the VIEWs into a single big query, where I could inject the site_id parameter (e.g. "$1") in multiple places as needed...
Tobias
Tobias Hoffmann <ldev-list@thax.hardliners.org> writes: > A more complete example might look more like this: > CREATE VIEW "subview1" AS > SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id > WHERE ...; > CREATE VIEW "view1" AS > SELECT site_id, ... FROM subview1 -- maybe even: WHERE site_id IS > NOT NULL > UNION ALL > SELECT null, ...; > SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL); For this particular case, you could probably get somewhere by writing SELECT * FROM view1 WHERE site_id = 1 UNION ALL SELECT * FROM view1 WHERE site_id IS NULL; since the sets of rows satisfying those two WHERE conditions must be disjoint. (I recall working on a patch that essentially tried to do that transformation automatically, but it eventually failed because things get too messy if the row sets might not be disjoint.) regards, tom lane
On 25/08/2024 19:28, Tom Lane wrote: > For this particular case, you could probably get somewhere by > writing > > SELECT * FROM view1 WHERE site_id = 1 > UNION ALL > SELECT * FROM view1 WHERE site_id IS NULL; > Thank you for your suggestion, Tom. Unfortunately, as I now understand, nothing *except* `var = const` can ever be propagated to the second branch of the join. In particular, even just `WHERE site_id IS NULL` no longer propagates like `WHERE site_id = 1` does. Other cases, that do not propagate, include `WHERE site_id IN (1, 2)`. I'll probably have to find another workaround to my current problem. ---- More generally, I think that the currently possible set is very restrictive and affects not just edge-cases; my SQL-Engine-implementation-fu is far from good enough for the necessary changes, though. Here are some more thoughts: > Maybe some machinery could be built that would do something useful > with an OR clause of this form, > > [...] > An important point here is that "WHERE A = B AND p(A)" does not permit > us to deduce "p(B)" for arbitrary conditions p(), IMO the relevant equality should be the `ON tbl2.site_id IS NOT DISTINCT FROM tbl1.site_id` (resp. `ON tbl2.site_id = tbl1.site_id`, but nulls probably need special care), which should allow any predicate `p` only depending on `tbl1.site_id` (i.e.`WHERE p(tbl1.site_id)`, from "outside") to be pulled "inside" the INNER JOIN or LEFT JOIN, because no row which does not satisfy `p(tbl1.site_id)`, and, via equivalence, `p(tbl2.site_id)` could ever be part of the result. More specifically, given a WHERE clause in CNF (i.e. `p0(...) AND p1(...) AND (p2a(...) OR p2b(...)) AND ...`), every top-level term which only uses variables which are deemed equivalent, should be allowed to propagate. If this is too difficult, not just single constants (`var = const`), but sets of constants (`var = ANY(...)`) and/or especially `var IS NULL`) should be considered. Just my 2ct... Tobias
Re: Non-trivial condition is only propagated to one side of JOIN
You must use a where clause on the FDW table or you get a full load/SEQ scan of that table, per documentation.
Select * is not recommended for FDW tables.
From: Tobias Hoffmann <ldev-list@thax.hardliners.org>
Date: Sunday, August 25, 2024 at 8:10 AM
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Subject: Non-trivial condition is only propagated to one side of JOIN
Hi, using `PostgreSQL 16. 2 (Debian 16. 2-1. pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12. 2. 0-14) 12. 2. 0, 64-bit`, I've observed the following behavior: – keep in mind that this example is as simplified as possible, the original
Hi,
using `PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit`, I've observed the
following behavior:
– keep in mind that this example is as simplified as possible, the
original query involves foreign tables, and the failure to propagate /
push down the condition results in a query plan that basically tries to
download the complete foreign table, which is not a feasible execution
strategy:
Setup:
CREATE TABLE tbl1 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE TABLE tbl2 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE INDEX ON tbl1 (site_id);
CREATE INDEX ON tbl2 (site_id);
Working queries:
SELECT * FROM tbl1 WHERE tbl1.site_id = 1; -- "trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1;
SELECT * FROM tbl1 WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; --
"non-trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
1) Exemplary Query Plan:
# EXPLAIN SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on tbl2 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id IS NULL)
(7 rows)
The key takeaway is, that the index can be used, because the condition
is propagated deep enough.
2) Still working example:
# EXPLAIN SELECT * FROM tbl1 LEFT JOIN tbl2 ON tbl2.site_id =
tbl1.site_id WHERE tbl1.site_id = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=8.40..27.80 rows=36 width=80)
-> Bitmap Heap Scan on tbl1 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Materialize (cost=4.20..13.70 rows=6 width=40)
-> Bitmap Heap Scan on tbl2 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
(10 rows)
The condition is propagated into BOTH branches of the join. The join
could also be an INNER join and might also be realized as a Merge Join
or Hash Join: they all behave the same.
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id
WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Hash Join (cost=19.23..46.45 rows=72 width=80)
Hash Cond: (tbl2.site_id = tbl1.site_id)
-> Seq Scan on tbl2 (cost=0.00..22.00 rows=1200 width=40)
-> Hash (cost=19.08..19.08 rows=12 width=40)
-> Bitmap Heap Scan on tbl1 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id IS NULL)
(11 rows)
Now, a full seq scan used for tbl2, the condition is only pushed down on
ONE side of the JOIN!
(with `WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL`, the Seq Scan
would have been on tbl1... [not so easily demostrated w/ LEFT JOINs]).
Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help,
The weird thing is: The subqueries on both sides of the join are
perfectly capable of accepting/using the "non-trivial" condition, as
demonstrated in 1), and JOINs are generally able to propagate conditions
to both sides, as demonstrated in 2).
Is there a magic knob to force postgres to do the right thing, or is
this basically a bug in the query planner?
Tobias