Thread: Incorrect result in certain expressions of anti-join

Incorrect result in certain expressions of anti-join

From
Torbjörn Gannholm
Date:
To reproduce:

create table foo (a text);
Insert into foo values ('foo');
create table bar (a text);
insert into bar values ('foo'), ('bar');

Now the following expression of anti-join gives the correct result:
select * from bar LEFT JOIN foo ON foo.a = bar.a WHERE foo.a IS NULL;
  a  | a
-----+---
 bar |
(1 row)

However, a slight change and the anti-condition is simply ignored and the unexpected (IMO incorrect) result is returned. I would expect it either to work or to cause an error:
select * from bar LEFT JOIN foo ON foo.a = bar.a AND foo.a IS NULL;
  a  | a
-----+---
 foo |
 bar |
(2 rows)

/tobe

Re: Incorrect result in certain expressions of anti-join

From
Pantelis Theodosiou
Date:


On Tuesday, January 7, 2025, Torbjörn Gannholm <torbjorn.gannholm@gmail.com> wrote:
To reproduce:

create table foo (a text);
Insert into foo values ('foo');
create table bar (a text);
insert into bar values ('foo'), ('bar');

Now the following expression of anti-join gives the correct result:
select * from bar LEFT JOIN foo ON foo.a = bar.a WHERE foo.a IS NULL;
  a  | a
-----+---
 bar |
(1 row)

However, a slight change and the anti-condition is simply ignored and the unexpected (IMO incorrect) result is returned. I would expect it either to work or to cause an error:
select * from bar LEFT JOIN foo ON foo.a = bar.a AND foo.a IS NULL;
  a  | a
-----+---
 foo |
 bar |
(2 rows)


The result is correct. The two queries are not equivalent. The join condition on the second query evaluates to false for all combinations as foo.a has no null values.
 

Re: Incorrect result in certain expressions of anti-join

From
"David G. Johnston"
Date:
On Tuesday, January 7, 2025, Torbjörn Gannholm <torbjorn.gannholm@gmail.com> wrote:

select * from bar LEFT JOIN foo ON foo.a = bar.a WHERE foo.a IS NULL;
select * from bar LEFT JOIN foo ON foo.a = bar.a AND foo.a IS NULL;


The where clause is evaluated after the join and thus the join sees all rows from foo and bar.

When you place the expression in the ON clause it gets evaluated before/during the join, and thus effectively pushed down and the rows of foo are filtered before the join happens.

Separately, I would advise using NOT EXISTS for an anti-join and not mention foo in the from clause at all.  That is the canonical form.

David J.