Re: Non-trivial condition is only propagated to one side of JOIN - Mailing list pgsql-hackers

From Tobias Hoffmann
Subject Re: Non-trivial condition is only propagated to one side of JOIN
Date
Msg-id 13ebf00f-2991-43c5-5d11-a8216cb3217a@thax.hardliners.org
Whole thread Raw
In response to Re: Non-trivial condition is only propagated to one side of JOIN  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Non-trivial condition is only propagated to one side of JOIN
List pgsql-hackers
On 25/08/2024 17:35, David G. Johnston wrote:
On Sunday, August 25, 2024, Tobias Hoffmann <ldev-list@thax.hardliners.org> wrote:

3) Problematic example:

# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; 

The “is null” predicate in this query is doing nothing as your next comment alludes to; you will produce no rows out of the join with a null site_id due to the use of the equals operator in the join.


Well, that's why I said: "keep in mind that this example is as simplified as possible"...
Even though `tbl1.site_id = 1` is – in this case – completely equivalent to `tbl1.site_id = 1 OR tbl1.site_id IS NULL` – the first one is completely pushed down, but the second is not.

A more complete example might look more like this:

CREATE VIEW "subview1" AS
  SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id WHERE ...; 

CREATE VIEW "view1" AS
  SELECT site_id, ... FROM subview1  -- maybe even: WHERE site_id IS NOT NULL
  UNION ALL
  SELECT null, ...;

SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL);

The reason, why the outer query would have a more complicated condition might have nothing to do with the subquery containing the JOIN.
(This is also not a `UNION ALL` special case: `site_id IS NULL` could also be generated by a LEFT JOIN, e.g.)

But not pushing down the condition has the grave impact, that those - otherwise working VIEWs (i.e. subview1) become "unfixably" broken, for certain WHERE-conditions on the outside.

Another reason why I said `site_id INTEGER NOT NULL` and `IS NOT DISTINCT FROM`, is that there might be some mathematical reason I'm not yet aware of where the propagation would not be sound, but which would not apply for arbitrary site_id [nullable, ...].
My primary goal is to find at least *some* way to get the condition pushed further in to avoid the full table scan, and to not have to completely rewrite all the VIEWs into a single big query, where I could inject the site_id parameter (e.g. "$1") in multiple places as needed...

  Tobias

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Non-trivial condition is only propagated to one side of JOIN
Next
From: Greg Sabino Mullane
Date:
Subject: Re: Better error message when --single is not the first arg to postgres executable