Re: Trouble with strange OUTER JOIN syntax - Mailing list pgsql-general

From Tom Lane
Subject Re: Trouble with strange OUTER JOIN syntax
Date
Msg-id 6268.990821407@sss.pgh.pa.us
Whole thread Raw
In response to Re: Trouble with strange OUTER JOIN syntax  (Farid Hajji <farid.hajji@ob.kamp.net>)
Responses Re: Trouble with strange OUTER JOIN syntax  (Farid Hajji <farid.hajji@ob.kamp.net>)
List pgsql-general
Farid Hajji <farid.hajji@ob.kamp.net> writes:
> The next examples seem to generalize this: two tables (tab3 and tab4)
> are being outer-joined to existing cartesian product tab1 x tab2.
> I'm not sure what the difference may be between:
>   OUTER (tab3, tab4)
> and
>   OUTER tab3, OUTER tab4.

I suspect that these correspond to the standard syntaxes

SELECT ... FROM
    (tab1 JOIN tab2 ON condition12)
    LEFT JOIN
    (tab3 JOIN tab4 ON condition34)
    ON condition1234;

and

SELECT ... FROM
    ((tab1 JOIN tab2 ON condition12)
    LEFT JOIN
    tab3 ON condition123)
    LEFT JOIN
    tab4 ON condition1234;

respectively.  In the first case, tab1 is inner-joined to tab2 and
tab3 is separately inner-joined to tab4, then the results are
outer-joined (with the tab3*tab4 product being the nullable side).
In the second case, tab1 is inner-joined to tab2, then tab3 is
outer-joined to this product, and finally tab4 is outer-joined to
the result.  Obviously these orderings can yield different results
because of null-row addition (whereas it wouldn't really matter if
all the joins were inner joins).

Note that the standard syntax makes you attach a join condition
(for example, "tab1.x = tab2.y") to each of these operations, rather
than intuiting which parts of the WHERE clause are to be taken as the
join condition.  Again, this wouldn't matter for inner joins but it
makes a big difference for outer joins.  Example:

select * from tab1 left join tab2 on (tab1.a = tab2.b and tab2.c = 0);

is not at all the same as

select * from tab1 left join tab2 on (tab1.a = tab2.b) where tab2.c = 0;

The WHERE clause is not the join condition, but is applied after the
join is done (and null rows are inserted).  So, for example, if tab1
contains just A=1 and tab2 contains just B=1, C=2, the first case
produces output 1,NULL,NULL because there are no tab2 rows that meet
the outer-join condition with tab1's row.  But the second case produces
no output rows at all --- the outer-join produces 1,1,2 which is then
removed by the WHERE filter.

The standard's syntax is rather verbose and ugly, but it has the great
virtue of handling outer joins unambiguously.  None of the vendor-
specific syntaxes I've seen are very clear about the implications of
an outer join condition.

BTW, you need PG 7.1 or later to work with outer joins.

            regards, tom lane

pgsql-general by date:

Previous
From: Shaun Thomas
Date:
Subject: Weird query execution paths, ignoring indexes...
Next
From: Feite Brekeveld
Date:
Subject: Re: pl-perl setup?