Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions - Mailing list pgsql-docs

From Kevin Grittner
Subject Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions
Date
Msg-id 4DF63F20020000250003E56D@gw.wicourts.gov
Whole thread Raw
In response to Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-docs
Robert Haas <robertmhaas@gmail.com> wrote:
> Grzegorz Szpetkowski  <gszpetkowski@gmail.com> wrote:

>> "The join condition specified with ON can also contain conditions
>> that do not relate directly to the join."

I think the trouble starts with that sentence, which I believe to be
completely false and misleading.

Simplifying a real-life instance of such confusion among our
programmers:

SELECT <Party columns>, Demographic.dob
  FROM Party LEFT JOIN Demographic
  ON (<Party.pkcols = Demographic.pkcols>
        AND Demographic.dod is NULL);

Which makes absence of date of death part of the outer join
criteria.  So you get all the parties, dead or alive; and only show
date of birth for those not known to be dead.  What they really
wanted to do was exclude parties known to be dead, and for those
parties listed, show date of birth if available.  So they wanted:

SELECT <Party columns>, Demographic.dob
  FROM Party LEFT JOIN Demographic
  ON (<Party.pkcols = Demographic.pkcols>)
  WHERE Demographic.dod is NULL;

Conditions in the ON clause *do* relate to the JOIN -- it's just
that the join might be on conditions other than primary key
equality.

Let's not contribute to muddy thinking by making incorrect
statements like that.

> I don't have a clear feeling for exactly what is needed.

I think the thing which is most likely to surprise people is that
the result can contain rows which are not in the Cartesian product
of joining the two relations.  We might want to point that out,
mention that it's an OUTER JOIN in *both* directions, and maybe give
an example which is half-way plausible as a use-case.  Maybe
something similar to:

test=# create table n_en (n int, word text);
CREATE TABLE
test=# create table n_de (n int, wort text);
CREATE TABLE
test=# insert into n_en values (1,'one'),(2,'two');
INSERT 0 2
test=# insert into n_de values (2, 'zwei'),(3,'drei');
INSERT 0 2
test=# select * from n_en full join n_de using (n);
 n | word | wort
---+------+------
 1 | one  |
 2 | two  | zwei
 3 |      | drei
(3 rows)

And that works to show the difference between:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n and n_de.n > 2);
 n | word | n | wort
---+------+---+------
 1 | one  |   |
 2 | two  |   |
   |      | 2 | zwei
   |      | 3 | drei
(4 rows)

and:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n) where n_de.n > 2;
 n | word | n | wort
---+------+---+------
   |      | 3 | drei
(1 row)

-Kevin

pgsql-docs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Improve warnings around CREATE INDEX CONCURRENTLY
Next
From: "Ross J. Reedstrom"
Date:
Subject: Re: Documentation and explanatory diagrams