Thread: OUTER JOIN problem

OUTER JOIN problem

From
Zoltan Boszormenyi
Date:
Hi,

I have a problem with LEFT OUTER JOIN, not only in PostgreSQL
but in Informix, too. I thought someone can explain the "bug"
I am seeing. Let me qoute my psql session, forgive me, it's a
bit long:

$ LANG=C psql -h localhost -U postgres postgres
Welcome to psql 7.3.4-RH, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms       \h for help with SQL commands       \? for help on internal slash
commands      \g or terminate with semicolon to execute query       \q to quit
 

postgres=# create table a (i serial, t varchar(40));
JELZÉS:  CREATE TABLE will create implicit sequence 'a_i_seq' for SERIAL
column 'a.i'
CREATE TABLE
postgres=# create table b (i serial, t varchar(40));
JELZÉS:  CREATE TABLE will create implicit sequence 'b_i_seq' for SERIAL
column 'b.i'
CREATE TABLE
postgres=# create table c (i serial, a integer, b integer);
JELZÉS:  CREATE TABLE will create implicit sequence 'c_i_seq' for SERIAL
column 'c.i'
CREATE TABLE

... Insert some records into all three tables ...  (Actually table 'b' is not used in the SELECTs,   table 'c' would
storeconnections between 'a' and 'b'   in the application's broader context.)
 

postgres=# select * from a; i | t
---+--- 1 | 1 2 | 2 3 | 3 4 | 4
(4 rows)

postgres=# select * from b; i | t
---+--- 1 | 5 2 | 6 3 | 7 4 | 8
(4 rows)

postgres=# select * from c; i | a | b
---+---+--- 1 | 1 | 1 2 | 1 | 2 3 | 1 | 4 4 | 2 | 3 5 | 3 | 1 6 | 3 | 2 7 | 4 | 4
(7 rows)

postgres=# select * from a left outer join c on (a.i=c.a) where c.b is 
null or c.b=1; i | t | i | a | b
---+---+---+---+--- 1 | 1 | 1 | 1 | 1 3 | 3 | 5 | 3 | 1
(2 rows)

Let's try something:

postgres=# delete from c;
DELETE 7
postgres=# insert into c (a,b) values (1,1);
INSERT 18490 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is 
null or c.b=1; i | t | i | a | b
---+---+---+---+--- 1 | 1 | 8 | 1 | 1 2 | 2 |   |   | 3 | 3 |   |   | 4 | 4 |   |   |
(4 rows)

postgres=# insert into c (a,b) values (1,3);
INSERT 18491 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is 
null or c.b=1; i | t | i | a | b
---+---+---+---+--- 1 | 1 | 8 | 1 | 1 2 | 2 |   |   | 3 | 3 |   |   | 4 | 4 |   |   |
(4 rows)

postgres=# insert into c (a,b) values (3,1);
INSERT 18492 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is 
null or c.b=1; i | t | i  | a | b
---+---+----+---+--- 1 | 1 |  8 | 1 | 1 2 | 2 |    |   | 3 | 3 | 10 | 3 | 1 4 | 4 |    |   |
(4 rows)

Now I get the results I want. Let's insert some more data:

postgres=# insert into c (a,b) values (2,3);
INSERT 18494 1
postgres=# insert into c (a,b) values (3,1);
INSERT 18495 1
postgres=# insert into c (a,b) values (4,4);
INSERT 18496 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is 
null or c.b=1; i | t | i  | a | b
---+---+----+---+--- 1 | 1 |  8 | 1 | 1 3 | 3 | 10 | 3 | 1 3 | 3 | 13 | 3 | 1
(3 rows)

Again I don't get the data I want. I accidentally inserted duplicated
data, (a,b)=(3,1) twice. (Working without unique indexes may backfire.)
The original dataset at the beginning of my example did not contain
duplicated data.

I don't know how PostgreSQL works internally but this bug *must* be
conforming to some standard if two distinct SQL server products behave
(almost) the same. I said almost, I discovered the same annoyance today
on an Informix 9.21 running under SCO UNIX on a slightly larger dataset
with less than 70 rows. It just left out some arbitrary rows that had
NULLs from the right side table (i.e not existing rows).

The following (not exactly SQL conform) construct works in Informix 9.21
and always gives me all the rows I wanted and no more:

select * from a, outer b where a.i=c.a and (c.b is null or c.b=1);

I know I can substitute OUTER JOIN with a UNION of an INNER JOIN and
a SELECT from the left table WHERE NOT EXISTS (SELECT from the right-
side table WHERE condition). But that's the point of the OUTER JOIN,
isn't it? Now can someone tell me whether it is a real bug in BOTH SQL
servers? Or is it a conforming behaviour to some part of the SQL
standard? Then please, point me where to RTFM?

I am not on the list, please Cc: me.

Thanks in advance,
Zoltán Böszörményi




Re: OUTER JOIN problem

From
Stephan Szabo
Date:
On Wed, 23 Jun 2004, Zoltan Boszormenyi wrote:

> I don't know how PostgreSQL works internally but this bug *must* be
> conforming to some standard if two distinct SQL server products behave
> (almost) the same. I said almost, I discovered the same annoyance today
> on an Informix 9.21 running under SCO UNIX on a slightly larger dataset
> with less than 70 rows. It just left out some arbitrary rows that had
> NULLs from the right side table (i.e not existing rows).

It's not a bug.  The outerness of the join is defined over the join
conditions (the ON portion), not the where conditions.  In the first case
you got rows from the join that matched the join condition like (a.i=2,
c.a=1, c.b=2) which the where then filtered out.

I thinkselect * from a left outer join c on (a.i=c.a and c.b=1)
meets the standard and may give you what you want.