Thread: BUG #3588: coalesce not working in join
The following bug has been logged online: Bug reference: 3588 Logged by: Richard Harris Email address: richard_haris@adp.com PostgreSQL version: 8.2.4 Operating system: Linux Description: coalesce not working in join Details: /* The queries, query1 and query2, below are identical except that query1 has a 'left join' where query2 has a 'join'. Both queries return three rows. However query2 (with the 'join') returns all non-null values in column t1b_pkt1 where query 1 with the left join returns some null values. This is the behavior when the queries are run on PG 8.2.4. When the queries are run on PG 8.0.3, both queries return the same results as query2 with the 'join'. Is PG 8.2.4 behavior correct? */ -- Create and populate tables create table t1 (pkt1 int); create table t2 (pkt2 int, pkt1 int); insert into t1 (pkt1) values (1); insert into t1 (pkt1) values (2); insert into t1 (pkt1) values (3); insert into t2 (pkt2, pkt1) values (101, 1); insert into t2 (pkt2, pkt1) values (201, 2); insert into t2 (pkt2, pkt1) values (301, NULL); -- query1 select t2.pkt2 as t2_pkt2, t2.pkt1 as t2_pkt1, t2a.pkt2 as t2a_pkt2, t1a.pkt1 as t1a_pkt1, t1b.pkt1 as t1b_pkt1 from t2 left join (SELECT pkt2, pkt1 FROM t2 where pkt2 = 101 ) as t2a ON t2.pkt2 = t2a.pkt2 left join t1 t1a on t2a.pkt1 = t1a.pkt1 left join t1 t1b on coalesce (t1a.pkt1, 1) = t1b.pkt1 -- query2 select t2.pkt2 as t2_pkt2, t2.pkt1 as t2_pkt1, t2a.pkt2 as t2a_pkt2, t1a.pkt1 as t1a_pkt1, t1b.pkt1 as t1b_pkt1 from t2 left join (SELECT pkt2, pkt1 FROM t2 where pkt2 = 101 ) as t2a ON t2.pkt2 = t2a.pkt2 left join t1 t1a on t2a.pkt1 = t1a.pkt1 join t1 t1b on coalesce (t1a.pkt1, 1) = t1b.pkt1
"Richard Harris" <richard_haris@adp.com> writes: > Is PG 8.2.4 behavior correct? Nope :-( ... looks like some sorta bug in the rules for when outer joins can be rearranged. It's doing the t1a/t1b join first, and then taking that as the right side of the join to t2, which means that the t1b column goes to null if there's not a match to t2.pkt2. Needs some study ... regards, tom lane
"Richard Harris" <richard_haris@adp.com> writes: > The queries, query1 and query2, below are identical except that query1 has a > 'left join' where query2 has a 'join'. Both queries return three rows. > However query2 (with the 'join') returns all non-null values in column > t1b_pkt1 where query 1 with the left join returns some null values. > This is the behavior when the queries are run on PG 8.2.4. When the queries > are run on PG 8.0.3, both queries return the same results as query2 with the > 'join'. After poking at this for a bit I have concluded that the manipulations done in make_outerjoininfo() have a fundamental error: when it concludes that two outer joins are unsafe to commute, it enforces this by adding the lower OJ's min_lefthand+min_righthand relation sets to the min_lefthand of the upper OJ. This is inadequate when more than 2 OJs are involved: in this example, it leaves us thinking that we can legally apply the t2/t2a outer join after the t1a/t1b join, because the t2a/t1a join's min_lefthand excludes t2 (since it *can* commute with the t2/t2a join). It looks to me like we need to add the full *syntactic* extent of the lower OJ to the upper min_lefthand. This cannot be done with the OuterJoinInfo data structure as it stands, because it doesn't track the syntactic extent only the minimum relsets. (I had hoped we didn't need to store that info throughout planning, but that was clearly overoptimistic.) Fortunately this data structure is just a planner local and isn't stored on disk, so we can fix it without initdb. It's possible that the syntactic-extent rule is stronger than necessary and some intermediate compromise could be found. However, non-strict join conditions are the exception not the rule, so it's probably not worth expending a whole lot of skull sweat on this case. The other half of the testing, where we decide that it's OK to commute with something in our RHS, should probably be using syntactic extent not minimum relsets as well. It strikes me also that the bug we fixed awhile back with intervening non-strict conditions in the RHS-test is probably lurking in the LHS-test too ... I had thought that it wasn't an issue there but now I can't reconstruct an argument why not. Comments anyone? regards, tom lane