Thread: BUG #3588: coalesce not working in join

BUG #3588: coalesce not working in join

From
"Richard Harris"
Date:
The following bug has been logged online:

Bug reference:      3588
Logged by:          Richard Harris
Email address:      richard_haris@adp.com
PostgreSQL version: 8.2.4
Operating system:   Linux
Description:        coalesce not working in join
Details:

/*

The queries, query1 and query2, below are identical except that query1 has a
'left join' where query2 has a 'join'. Both queries return three rows.
However query2 (with the 'join') returns all non-null values in column
t1b_pkt1 where query 1 with the left join returns some null values.
This is the behavior when the queries are run on PG 8.2.4. When the queries
are run on PG 8.0.3, both queries return the same results as query2 with the
'join'.

Is PG 8.2.4 behavior correct?

*/
-- Create and populate tables

create table t1 (pkt1 int);
create table t2 (pkt2 int, pkt1 int);


insert into t1 (pkt1) values (1);
insert into t1 (pkt1) values (2);
insert into t1 (pkt1) values (3);

insert into t2 (pkt2, pkt1) values (101, 1);
insert into t2 (pkt2, pkt1) values (201, 2);
insert into t2 (pkt2, pkt1) values (301, NULL);

-- query1
select t2.pkt2 as t2_pkt2, t2.pkt1 as t2_pkt1, t2a.pkt2 as t2a_pkt2,
t1a.pkt1 as t1a_pkt1, t1b.pkt1 as t1b_pkt1
from t2
left join (SELECT pkt2, pkt1 FROM t2 where pkt2 = 101 ) as t2a ON t2.pkt2 =
t2a.pkt2
left join t1 t1a on t2a.pkt1 = t1a.pkt1
left join t1 t1b on coalesce (t1a.pkt1, 1) = t1b.pkt1

-- query2
select t2.pkt2 as t2_pkt2, t2.pkt1 as t2_pkt1, t2a.pkt2 as t2a_pkt2,
t1a.pkt1 as t1a_pkt1, t1b.pkt1 as t1b_pkt1
from t2
left join (SELECT pkt2, pkt1 FROM t2 where pkt2 = 101 ) as t2a ON t2.pkt2 =
t2a.pkt2
left join t1 t1a on t2a.pkt1 = t1a.pkt1
join t1 t1b on coalesce (t1a.pkt1, 1) = t1b.pkt1

Re: BUG #3588: coalesce not working in join

From
Tom Lane
Date:
"Richard Harris" <richard_haris@adp.com> writes:
> Is PG 8.2.4 behavior correct?

Nope :-(  ... looks like some sorta bug in the rules for when outer
joins can be rearranged.  It's doing the t1a/t1b join first, and
then taking that as the right side of the join to t2, which means
that the t1b column goes to null if there's not a match to t2.pkt2.
Needs some study ...

            regards, tom lane

Re: BUG #3588: coalesce not working in join

From
Tom Lane
Date:
"Richard Harris" <richard_haris@adp.com> writes:
> The queries, query1 and query2, below are identical except that query1 has a
> 'left join' where query2 has a 'join'. Both queries return three rows.
> However query2 (with the 'join') returns all non-null values in column
> t1b_pkt1 where query 1 with the left join returns some null values.
> This is the behavior when the queries are run on PG 8.2.4. When the queries
> are run on PG 8.0.3, both queries return the same results as query2 with the
> 'join'.

After poking at this for a bit I have concluded that the manipulations
done in make_outerjoininfo() have a fundamental error: when it concludes
that two outer joins are unsafe to commute, it enforces this by adding
the lower OJ's min_lefthand+min_righthand relation sets to the
min_lefthand of the upper OJ.  This is inadequate when more than 2 OJs
are involved: in this example, it leaves us thinking that we can legally
apply the t2/t2a outer join after the t1a/t1b join, because the t2a/t1a
join's min_lefthand excludes t2 (since it *can* commute with the t2/t2a
join).  It looks to me like we need to add the full *syntactic* extent
of the lower OJ to the upper min_lefthand.  This cannot be done with the
OuterJoinInfo data structure as it stands, because it doesn't track the
syntactic extent only the minimum relsets.  (I had hoped we didn't need
to store that info throughout planning, but that was clearly
overoptimistic.)  Fortunately this data structure is just a planner
local and isn't stored on disk, so we can fix it without initdb.

It's possible that the syntactic-extent rule is stronger than necessary
and some intermediate compromise could be found.  However, non-strict
join conditions are the exception not the rule, so it's probably not
worth expending a whole lot of skull sweat on this case.

The other half of the testing, where we decide that it's OK to commute
with something in our RHS, should probably be using syntactic extent not
minimum relsets as well.

It strikes me also that the bug we fixed awhile back with intervening
non-strict conditions in the RHS-test is probably lurking in the
LHS-test too ... I had thought that it wasn't an issue there but
now I can't reconstruct an argument why not.

Comments anyone?

            regards, tom lane