On 7/1/24 16:17, Andrei Lepikhov wrote:
> On 10/12/23 14:52, Andy Fan wrote:
>> Here the sublink can't be pulled up because of its reference to
>> the LHS of left join, the original logic is that no matter the 'b.t
>> in ..'
>> returns the true or false, the rows in LHS will be returned. If we
>> pull it up to LHS, some rows in LHS will be filtered out, which
>> breaks its original semantics.
> Hi,
> I spent some time trying to understand your sentence.
> I mean the following case:
>
> SELECT * FROM t1 LEFT JOIN t2
> ON t2.x IN (SELECT y FROM t3 WHERE t1.x=t3.x);
>
> I read [1,2,3], but I am still unsure why it is impossible in the case
> of OUTER JOIN. By setting the LATERAL clause, we forbid any clauses from
> the RTE subquery to bubble up as a top-level clause and filter tuples
> from LHS, am I wrong? Does it need more research or you can show some
> case to support your opinion - why this type of transformation must be
> disallowed?
>
> [1] https://www.postgresql.org/message-id/6531.1218473967%40sss.pgh.pa.us
> [2]
> https://www.postgresql.org/message-id/BANLkTikGFtGnAaXVh5%3DntRdN%2B4w%2Br%3DNPuw%40mail.gmail.com
> [3] https://www.vldb.org/conf/1992/P091.PDF
>
I delved into it a bit more. After reading [4,5] I invented query that
is analogue of the query above, but with manually pulled-up sublink:
EXPLAIN (COSTS OFF)
SELECT * FROM t1 LEFT JOIN t2 JOIN LATERAL
(SELECT t1.x AS x1, y,x FROM t3) q1 ON (t2.x=q1.y AND q1.x1=q1.x) ON true;
And you can see the plan:
Nested Loop Left Join
-> Seq Scan on t1
-> Hash Join
Hash Cond: (t2.x = t3.y)
-> Seq Scan on t2
-> Hash
-> Seq Scan on t3
Filter: (t1.x = x)
Just for fun, I played with MSSQL Server and if I read its explain
correctly, it also allows pulls-up sublink which mentions LHS:
-------------------------------------
Nested Loops(Left Outer Join, OUTER REFERENCES:(t1.x))
Table Scan(OBJECT:(t1))
Hash Match(Right Semi Join, HASH:(t3.y)=(t2.x),
RESIDUAL:(t2.x=t3.y))
Table Scan(OBJECT:(t3), WHERE:(t1.x=t3.x))
Table Scan(OBJECT:(t2))
-------------------------------------
(I cleaned MSSQL explain a little bit for clarity).
So, may we allow references to LHS in such sublink?
[4]
https://www.postgresql.org/message-id/flat/15523.1372190410%40sss.pgh.pa.us
[5]
https://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local
--
regards, Andrei Lepikhov