AW: A fine point about OUTER JOIN semantics - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: A fine point about OUTER JOIN semantics
Date
Msg-id 11C1E6749A55D411A9670001FA687963368065@sdexcsrv1.f000.d0188.sd.spardat.at
Whole thread Raw
Responses Re: AW: A fine point about OUTER JOIN semantics
List pgsql-hackers
I can tell you the results Informix produces:

> Am I right in thinking that the WHERE clause of a query must logically
> be applied *after* any joins specified in the FROM clause?
> 
> For example, suppose that we have table t1 (x int) containing the
> values 1, 2, 3, 4, and table t2 (y int) containing the values 1, 2, 4.
> It's clear that the result of
>     SELECT * FROM t1 LEFT JOIN t2 ON (x = y);
> should be
>     x    y
> 
>     1    1
>     2    2
>     3    NULL
>     4    4

same

> 
> But suppose we make the query
>     SELECT * FROM t1 LEFT JOIN t2 ON (x = y) WHERE y <> 2;
> It seems to me this should yield
>     x    y
> 
>     1    1
>     3    NULL
>     4    4
> 
> and not
>     x    y
> 
>     1    1
>     2    NULL
>     3    NULL
>     4    4
         x           y
         1           1         4           4

> 
> which is what you'd get if the y=2 tuple were filtered out before
> reaching the left-join stage.  Does anyone read the spec differently,
> or get the latter result from another implementation?
> 
> The reason this is interesting is that this example breaks a rather
> fundamental assumption in our planner/optimizer, namely that WHERE
> conditions can be pushed down to the lowest level at which all the
> variables they mention are available.  Thus the planner would normally
> apply "y <> 2" during its bottom-level scan of t2, which 
> would cause the
> LEFT JOIN to decide that x = 2 is an unmatched value, and thus produce
> a "2 NULL" output row.
> 
> An even more interesting example is
>     SELECT * FROM t1 FULL JOIN t2 ON (x = y AND y <> 2);
> My interpretation is that this should produce
>     x    y
> 
>     1    1
>     2    NULL
>     NULL    2
>     3    NULL
>     4    4
         x           y
         1           1         4           4

> since both t1's x=2 and t2's y=2 tuple will appear "unmatched".
> This is *not* the same output you'd get from
>     SELECT * FROM t1 FULL JOIN t2 ON (x = y) WHERE y <> 2;
> which I think should yield
>     x    y
> 
>     1    1
>     3    NULL
>     4    4
> This shows that JOIN/ON conditions for outer joins are not 
> semantically
> interchangeable with WHERE conditions.
         x           y
         1           1         4           4

> 
> This is going to be a bit of work to fix, so I thought I'd better
> confirm that I'm reading the spec correctly before I dive into it.

No idea if they interpret correctly, but seems they hand it interchangeably.
Someone want to check Oracle and MS Sql ?

Andreas


pgsql-hackers by date:

Previous
From: Thomas Lockhart
Date:
Subject: Re: A fine point about OUTER JOIN semantics
Next
From: Tom Lane
Date:
Subject: RPMs and symlinks (was Re: [NOVICE] C++ library probs)