Ignored join clause - Mailing list pgsql-bugs

From Andreas Karlsson
Subject Ignored join clause
Date
Msg-id f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
Whole thread Raw
Responses Re: Ignored join clause
List pgsql-bugs
Hi,

It seems to me like PostgreSQL incorrectly removes a join clause when 
planning some queries. I discovered this while debugging a large query, 
which I below have simplified as much as I could. I suspect the bug may 
be related to the lateral join but I am not sure.

This bug appears in both 10.3 and master (at least).

This works:

CREATE TABLE t AS SELECT 1 AS x, '{10,20}'::int[] AS ys;

SELECT *
FROM t
JOIN (VALUES (1, 10), (2, 20)) AS q1 (x, y) ON q1.x = t.x
LEFT JOIN unnest(ys) q2 (y) ON q2.y = q1.y;

  x |   ys    | x | y  | y
---+---------+---+----+----
  1 | {10,20} | 1 | 10 | 10
(1 row)

This does not:

SELECT *
FROM t
LEFT JOIN (VALUES (1, 10), (2, 20)) AS q1 (x, y) ON q1.x = t.x
LEFT JOIN unnest(ys) q2 (y) ON q2.y = q1.y;

  x |   ys    | x | y  | y
---+---------+---+----+----
  1 | {10,20} | 1 | 10 | 10
  1 | {10,20} | 2 | 20 |
(2 rows)

I expect both these queries to return the same data on this data set,. 
And the second row of the result violates "q1.x = t.x".

JOIN plan:

                                    QUERY PLAN 

---------------------------------------------------------------------------------
  Nested Loop Left Join  (cost=0.05..56.90 rows=13 width=48)
    Output: t.x, t.ys, "*VALUES*".column1, "*VALUES*".column2, q2.y
    Join Filter: (q2.y = "*VALUES*".column2)
    ->  Hash Join  (cost=0.05..27.64 rows=13 width=44)
          Output: t.x, t.ys, "*VALUES*".column1, "*VALUES*".column2
          Hash Cond: (t.x = "*VALUES*".column1)
          ->  Seq Scan on public.t  (cost=0.00..22.70 rows=1270 width=36)
                Output: t.x, t.ys
          ->  Hash  (cost=0.03..0.03 rows=2 width=8)
                Output: "*VALUES*".column1, "*VALUES*".column2
                ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 
width=8)
                      Output: "*VALUES*".column1, "*VALUES*".column2
    ->  Function Scan on pg_catalog.unnest q2  (cost=0.00..1.00 rows=100 
width=4)
          Output: q2.y
          Function Call: unnest(t.ys)
(15 rows)

LEFT JOIN plan:

                                       QUERY PLAN 

---------------------------------------------------------------------------------------
  Nested Loop Left Join  (cost=0.05..1826.15 rows=1270 width=48)
    Output: t.x, t.ys, "*VALUES*".column1, "*VALUES*".column2, q2.y
    ->  Seq Scan on public.t  (cost=0.00..22.70 rows=1270 width=36)
          Output: t.x, t.ys
    ->  Hash Right Join  (cost=0.05..1.45 rows=2 width=12)
          Output: "*VALUES*".column1, "*VALUES*".column2, q2.y
          Hash Cond: (q2.y = "*VALUES*".column2)
          Join Filter: ("*VALUES*".column1 = t.x)
          ->  Function Scan on pg_catalog.unnest q2  (cost=0.00..1.00 
rows=100 width=4)
                Output: q2.y
                Function Call: unnest(t.ys)
          ->  Hash  (cost=0.03..0.03 rows=2 width=8)
                Output: "*VALUES*".column1, "*VALUES*".column2
                ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 
width=8)
                      Output: "*VALUES*".column1, "*VALUES*".column2
(15 rows)

Andreas


pgsql-bugs by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: BUG #15159: Duplicate records for same primary key
Next
From: Andrew Gierth
Date:
Subject: Re: Ignored join clause