Thread: Outer joins aren't working with views

Outer joins aren't working with views

From
pgsql-bugs@postgresql.org
Date:
Grzegorz Mucha (mucher@tigana.pl) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
Outer joins aren't working with views

Long Description
It seems outer joins are not working at all(they work as inner joins so far).
For example, see below:
(the result is identical for inner and outer join) - two rows fetched from db(as I recall, there should be one more row
havingt1.id=3) 

Sample Code
create table t1(id serial primary key);
create table t2(id2 serial primary key, id int);
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
insert into t2 (id) values(1);
insert into t2 (id) values(2);
select t1.*, t2.* from t1 natural left outer join t2;

No file was uploaded with this report

Re: Outer joins aren't working with views

From
Tom Lane
Date:
It works for me:

regression=# select t1.*, t2.* from t1 natural left outer join t2;
 id | id2 | id
----+-----+----
  1 |   1 |  1
  2 |   2 |  2
  3 |     |
(3 rows)

What version are you using?

            regards, tom lane

Re: Outer joins aren't working with views

From
Stephan Szabo
Date:
What version are you using?  The sample code works for me
on current sources, three rows with the last one as 3|null|null


Stephan Szabo
sszabo@bigpanda.com

On Fri, 15 Dec 2000 pgsql-bugs@postgresql.org wrote:

> Grzegorz Mucha (mucher@tigana.pl) reports a bug with a severity of 2
> The lower the number the more severe it is.
>
> Short Description
> Outer joins aren't working with views
>
> Long Description
> It seems outer joins are not working at all(they work as inner joins so far).
> For example, see below:
> (the result is identical for inner and outer join) - two rows fetched from db(as I recall, there should be one more
rowhaving t1.id=3) 
>
> Sample Code
> create table t1(id serial primary key);
> create table t2(id2 serial primary key, id int);
> insert into t1 values (1);
> insert into t1 values (2);
> insert into t1 values (3);
> insert into t2 (id) values(1);
> insert into t2 (id) values(2);
> select t1.*, t2.* from t1 natural left outer join t2;
>
> No file was uploaded with this report
>

Re: Outer joins aren't working with views

From
Thomas Lockhart
Date:
> It works for me:
> regression=# select t1.*, t2.* from t1 natural left outer join t2;
>  id | id2 | id
> ----+-----+----
>   1 |   1 |  1

My recollection is that SQL9x requires that the join result lose the
link to the original table names. That is,

  select id, id2 from t1 natural left outer join t2;

is legal, but

  select t1.id, ...

is not.

If one needs to label the join product, then one uses aliases, as

  select tx.* from (t1 natural left outer join t2) as tx;

or

  select tx.* from (t1 natural left outer join t2) as tx (c1, c2);

I could see allowing in the target list explicit reference to the
underlying tables as an extension when there is no ambiguity.

However, in this case should the natural join in the original example do
the join on the column "id", and not have two columns of name "id"
available after the join?

How do you read the spec and this example? My original reading was from
the Date and Darwen book, and the SQL99 spec we have is (to put it
nicely) a bit harder to follow. I'll write some of this up for the
syntax section of the user's guide once I'm clear on it...

ref:
ansi-iso-9075-2-1999.txt from the draft documents we found on the web
last year.

ISO/IEC 9075-2:1999 SQL - Part 2: SQL/Foundation-
September 23, 1999
[Compiled using SQL3_ISO option]
section 7.7 rule 7

                         - Thomas

Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> It works for me:
>> regression=# select t1.*, t2.* from t1 natural left outer join t2;
>> id | id2 | id
>> ----+-----+----
>> 1 |   1 |  1

> My recollection is that SQL9x requires that the join result lose the
> link to the original table names. That is,
>   select id, id2 from t1 natural left outer join t2;
> is legal, but
>   select t1.id, ...
> is not.

Hm.  This is one of the areas that I had put down on my personal TODO
list as needing a close look before release.  So, let's get to it.

My first scan of SQL92 looks like our current behavior is right.
I find these paras that seem to be relevant to the scope of a
<correlation name> (ie, a table alias):

5.4 Names and identifiers, syntax rule 12:
        12)An <identifier> that is a <correlation name> is associated with           a table within a particular scope.
Thescope of a <correlation           name> is either a <select statement: single row>, <subquery>, or           <query
specification>(see Subclause 6.3, "<table reference>").           Scopes may be nested. In different scopes, the same
<correlation          name> may be associated with different tables or with the same           table.
 

6.3 <table reference>, syntax rule 2:
        2) Case:
           a) If a <table reference> TR is contained in a <from clause> FC             with no intervening <derived
table>,then the scope clause             SC of TR is the <select statement: single row> or innermost             <query
specification>that contains FC. The scope clause of             the exposed <correlation name> or exposed <table name>
ofTR             is the <select list>, <where clause>, <group by clause>, and             <having clause> of SC,
togetherwith the <join condition> of             all <joined table>s contained in SC that contains TR.
 
           b) Otherwise, the scope clause SC of TR is the outermost <joined             table> that contains TR with no
intervening<derived table>.             The scope of the exposed <correlation name> or exposed <table             name>
ofTR is the <join condition> of SC and of all <joined             table>s contained in SC that contain TR.
 

(Note that <derived table> means subselect-in-FROM, cf 6.3 and 7.11.)

The first and second items here seem to be perfectly clear that the
names t1 and t2 have scope across the whole SELECT statement and are not
hidden within the <joined table> formed by the OUTER JOIN clause.

On the other hand, the third item leaves me confused again.  I don't
see how it applies at all, ie, when is the "If" of 2(a) ever false?
How is it *possible* to have a <table reference> that's not directly
contained in a <from clause>?  The business about a <derived table>
seems like horsepucky, because a table ref inside a subselect would be
contained in the subselect's from-clause and its scope would be that
subselect.  Where in the spec does it allow a table reference that's
not in a from-clause?  (Our PostQuel extensions do not count ;-))

It'd be useful to check the above example against Oracle and other
implementations, but the parts of the spec that I can follow seem
to say that we've got the right behavior now.
        regards, tom lane


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Thomas Lockhart
Date:
> The first and second items here seem to be perfectly clear that the
> names t1 and t2 have scope across the whole SELECT statement and are not
> hidden within the <joined table> formed by the OUTER JOIN clause.

You are right. If there is a "correlation name", then those underlying
table names become invisible, but that was not in the example here.
Rereading my Date and Darwen clarified this for me. However, there are
*some* columns for which this explicit table qualification is not
allowed, including in the example of NATURAL JOIN.

Date and Darwen, 4th ed, pp 142-144 discuss various aspects of join
scope and behavior. For NATURAL JOIN, the columns with common names
forming the join columns *lose* their underlying table name, since they
can't be traced back to a column from a specific table (the table of
origin is ambiguous). And for a NATURAL JOIN, it is impossible to get
back two columns with the same name, since those columns were unified by
the join process.

The process is required to join on the columns with names in common, and
to swallow one of each pair in the result. How should you refer to the
column that remains?
 create table t1 (id int, id2 int); create table t2 (id int, name text); select * from t1 natural left outer join t2;

must return something from the set of columns (id, id2, name), and two
columns of name "id" will not be visible. Also, column "id" cannot be
qualified with a table name. So
 select t1.id from t1 natural join t2;

is not legal (though perhaps could be justified as an extension). The
columns *not* involved in the join operation, id2 and name, *can* be
qualified by the underlying table name, but the only way to get the same
for "id" after the natural join is to use a correlation name. e.g.
 select tx.id from (t1 natural join t2) as tx; select t1.id2 from t1 natural join t2;

are both legal.

> It'd be useful to check the above example against Oracle and other
> implementations, but the parts of the spec that I can follow seem
> to say that we've got the right behavior now.

Oracle does not support SQL9x join syntax, so we can't ask it for an
example. Not sure about the others.

Comments?
                      - Thomas


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> The first and second items here seem to be perfectly clear that the
>> names t1 and t2 have scope across the whole SELECT statement and are not
>> hidden within the <joined table> formed by the OUTER JOIN clause.

> You are right. If there is a "correlation name", then those underlying
> table names become invisible, but that was not in the example here.

Right, either the table's real name or its alias ("correlation name") is
introduced into the query's scope, not both.  AFAICT the scope rules
are the same for either one, though.

> Rereading my Date and Darwen clarified this for me. However, there are
> *some* columns for which this explicit table qualification is not
> allowed, including in the example of NATURAL JOIN.

I disagree on that.  The table's real/alias name is certainly supposed
to be accessible, and I see nothing in the spec that says that only some
of its columns are accessible via qualification.  What the spec does say
is that the *output* of the join has only one copy of the joined column.
In other words, given table A with columns ID and CA, and table B with
columns ID and CB, I believe the correct behavior is

SELECT * FROM (A NATURAL JOIN B) J        produces ID, CA, CB

SELECT J.* FROM (A NATURAL JOIN B) J        produces ID, CA, CB

SELECT A.* FROM (A NATURAL JOIN B) J        produces ID, CA

SELECT B.* FROM (A NATURAL JOIN B) J        produces ID, CB

If it's an outer join then J.ID is subtly different from A.ID and/or
B.ID --- the spec defines the output column as COALESCE(A.ID,B.ID)
(cf SQL92 7.5 <joined table>, syntax rule 6.d) to get rid of introduced
nulls.  BTW, our implementation simplifies that to A.ID for an inner or
left join, or B.ID for a right join, and only uses the full COALESCE
expression for a full join.

Anyway, I believe it's true that you can't get at A.ID or B.ID in
this example except by qualifying the column name with the table name
--- but I don't see where it says that you shouldn't be able to get
at them at all.  If that were true then the definition in 7.5.6.d
wouldn't be legal, because that's exactly the syntax it uses to define
the joined column.

> Date and Darwen, 4th ed, pp 142-144 discuss various aspects of join
> scope and behavior. For NATURAL JOIN, the columns with common names
> forming the join columns *lose* their underlying table name, since they
> can't be traced back to a column from a specific table (the table of
> origin is ambiguous).

My reading is that the output columns are qualified with the JOIN
clause's correlation name, if any (J in my example).  If you didn't
bother to stick a correlation name on the join clause, you couldn't
refer to them with a qualified name.

In an example like

SELECT * FROM (A NATURAL LEFT JOIN (B NATURAL FULL JOIN C));

supposing that all three tables have a column ID, then the output ID
column of the B/C join has no qualified name, and it would indeed be
impossible to refer to it from the SELECT list.  The only IDs accessible
from the SELECT list are the also-qualified-name-less output of the
left join and A.ID, B.ID, C.ID, none of which are quite the same as
the output of the full join.  Perhaps what Date and Darwen are talking
about is cases like this? 
        regards, tom lane


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Thomas Lockhart
Date:
> I disagree on that.  The table's real/alias name is certainly supposed
> to be accessible, and I see nothing in the spec that says that only some
> of its columns are accessible via qualification.

Date and Darwen disagree circa 1997, and I believe that SQL99 does not
radically alter the spec in this regard. All of my interpretations below
are based on D&D, not the draft spec we have available (though I look to
that to support their interpretation, which imho it does).

> What the spec does say
> is that the *output* of the join has only one copy of the joined column.
> In other words, given table A with columns ID and CA, and table B with
> columns ID and CB, I believe the correct behavior is
> 
> SELECT * FROM (A NATURAL JOIN B) J              produces ID, CA, CB

Yes.

> SELECT J.* FROM (A NATURAL JOIN B) J            produces ID, CA, CB

Yes.

> SELECT A.* FROM (A NATURAL JOIN B) J            produces ID, CA

No, since there is a range variable J, no columns explicitly qualified
with A or B are visible. If the range variable J is omitted, then the
result will produce only CA. See one of the D&D cases I include below.

> SELECT B.* FROM (A NATURAL JOIN B) J            produces ID, CB

Same as for the previous case. B.* is not visible since a range variable
is specified, and if J is not there then B.* produces CB only.

> If it's an outer join then J.ID is subtly different from A.ID and/or
> B.ID --- the spec defines the output column as COALESCE(A.ID,B.ID)
> (cf SQL92 7.5 <joined table>, syntax rule 6.d) to get rid of introduced
> nulls.  BTW, our implementation simplifies that to A.ID for an inner or
> left join, or B.ID for a right join, and only uses the full COALESCE
> expression for a full join.

Right, the result is the same for these cases. The only issue is the
scoping on the name allowed for external reference.

> Anyway, I believe it's true that you can't get at A.ID or B.ID in
> this example except by qualifying the column name with the table name
> --- but I don't see where it says that you shouldn't be able to get
> at them at all.  If that were true then the definition in 7.5.6.d
> wouldn't be legal, because that's exactly the syntax it uses to define
> the joined column.

7.7.7.d seems to define SLCC pretty clearly, without a table name
qualification. I think that this is consistant with D&D's
interpretation.

> > Date and Darwen, 4th ed, pp 142-144 discuss various aspects of join
> > scope and behavior. For NATURAL JOIN, the columns with common names
> > forming the join columns *lose* their underlying table name, since they
> > can't be traced back to a column from a specific table (the table of
> > origin is ambiguous).
> My reading is that the output columns are qualified with the JOIN
> clause's correlation name, if any (J in my example).  If you didn't
> bother to stick a correlation name on the join clause, you couldn't
> refer to them with a qualified name.

Sure. But without a correlation name, you are not allowed to qualify
with the underlying table name for "join columns" from NATURAL or JOIN
ON joins. See below...

> In an example like
> 
> SELECT * FROM (A NATURAL LEFT JOIN (B NATURAL FULL JOIN C));
> 
> supposing that all three tables have a column ID, then the output ID
> column of the B/C join has no qualified name, and it would indeed be
> impossible to refer to it from the SELECT list.  The only IDs accessible
> from the SELECT list are the also-qualified-name-less output of the
> left join and A.ID, B.ID, C.ID, none of which are quite the same as
> the output of the full join.  Perhaps what Date and Darwen are talking
> about is cases like this?

No, they are talking about simpler cases, and very clearly they disagree
with the current behavior of the PostgreSQL parser. Now, it may be that
SQL99 has changed the scoping rules for these cases, but instead I would
look for support for Date and Darwen's interpretation in the spec,
rather than reading the spec from first principles. Date and Darwen can
explain it in a couple of pages, and give examples, where the spec is
just way too convoluted for a clear reading istm.

Anyway, here are two cases discussed by D&D -- note that table sp has
columns (sno, pno, qty) and table s has columns (sno, sname, status,
city, primary):

(p142, after a discussion of other cases)
"One very counterintuitive consequence of this unorthodox scoping rule
is illustrated by the following example: The result of the expression
 select distinct sp.* from sp natural join s;

will include columns PNO and QTY but *not* column SNO, because --
believe it or not -- there is no column "SP.SNO" in the result of the
join expression (indeed specifying SP.SNO in the SELECT clause would be
a syntax error)."

The emphasis is D&D's, not mine ;) For natural joins, or other joins
where two columns are subsumed into one (anything with a USING clause?)
the scoping rules are clear, at least to D&D: it is not possible to
reference one of these columns by qualifying with the name of an
underlying table.


Another case (p143-144, following some simpler cases which show how
scoping progresses through more deeply nested joins):
"Now let us modify the example once again to introduce an explicit range
variable TC for the overall result:
 ( ( T1 JOIN T2 ON cond-1 ) AS TA  JOIN   ( T3 JOIN T4 ON cond-2 ) AS TB     ON cond-3 ) AS TC

The rules are now as follows:

cond-1 can reference T1 and T2 but not T3, T4, TA, TB, or TC
cond-2 can reference T3 and T4 but not T1, T2, TA, TB, or TC
cond-3 can reference TA and TB but not T1, T2, T3, T4, or TC

and (once again) if the overall expression appears as the operand of a
FROM clause, then the associated SELECT clause, WHERE clause, etc. can
reference TC but not T1, T2, T3, T4, TA, or TB."

So the two D&D cases cited above illustrate the "with range variables"
and "without range variables" expected behavior. Comments?
                           - Thomas


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> (p142, after a discussion of other cases)
> "One very counterintuitive consequence of this unorthodox scoping rule
> is illustrated by the following example: The result of the expression

>   select distinct sp.* from sp natural join s;

> will include columns PNO and QTY but *not* column SNO, because --
> believe it or not -- there is no column "SP.SNO" in the result of the
> join expression (indeed specifying SP.SNO in the SELECT clause would be
> a syntax error)."

> The emphasis is D&D's, not mine ;)

Hm.  After further digging in the spec, it seems that their
interpretation rests on SQL92's section 6.4 <column reference> syntax
rule 2.b.  Rule 2 in full is:
        2) If CR contains a <qualifier> Q, then CR shall appear within the           scope of one or more <table name>s
or<correlation name>s that           are equal to Q. If there is more than one such <table name> or
<correlationname>, then the one with the most local scope is           specified. Let T be the table associated with
Q.
           a) T shall include a column whose <column name> is CN.
           b) If T is a <table reference> in a <joined table> J, then CN             shall not be a common column name
inJ.
 
             Note: Common column name is defined in Subclause 7.5, "<joined             table>".

2.b strikes me as a completely unnecessary and counterintuitive
restriction.  Do D&D provide any justification for it?  I'm not
especially inclined to make our implementation substantially more
complex in order to enforce what seems a bogus restriction.

What's even more interesting is that I can find no equivalent
text in SQL99.
        regards, tom lane


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Thomas Lockhart
Date:
> > (p142, after a discussion of other cases)
> > "One very counterintuitive consequence of this unorthodox scoping rule
> > is illustrated by the following example: The result of the expression
> 
> >   select distinct sp.* from sp natural join s;
> 
> > will include columns PNO and QTY but *not* column SNO, because --
> > believe it or not -- there is no column "SP.SNO" in the result of the
> > join expression (indeed specifying SP.SNO in the SELECT clause would be
> > a syntax error)."
> 
> > The emphasis is D&D's, not mine ;)
> 
> Hm.  After further digging in the spec, it seems that their
> interpretation rests on SQL92's section 6.4 <column reference> syntax
> rule 2.b.  Rule 2 in full is:
> 
>          2) If CR contains a <qualifier> Q, then CR shall appear within the
>             scope of one or more <table name>s or <correlation name>s that
>             are equal to Q. If there is more than one such <table name> or
>             <correlation name>, then the one with the most local scope is
>             specified. Let T be the table associated with Q.
> 
>             a) T shall include a column whose <column name> is CN.
> 
>             b) If T is a <table reference> in a <joined table> J, then CN
>               shall not be a common column name in J.
> 
>               Note: Common column name is defined in Subclause 7.5, "<joined
>               table>".
> 
> 2.b strikes me as a completely unnecessary and counterintuitive
> restriction.  Do D&D provide any justification for it?  I'm not
> especially inclined to make our implementation substantially more
> complex in order to enforce what seems a bogus restriction.

Hmm. istm that the D&D interpretation is entirely clear, and that for
NATURAL and USING joins there is no other way to carry along join
results as intermediate "tables". If
 select * from t1 natural join t2;

produces, say, three columns, how can any other specification of the
target list using only wildcards produce *more* columns? In particular,
how can
 select t1.*, t2.* from t1 natural join t2;

produce columns from t1 and t2 which are *not present* in the join "t1
natural join t2"?

> What's even more interesting is that I can find no equivalent
> text in SQL99.

Of course. When they bloated the spec by a factor of three or four, they
had to leave out the clear parts to save space ;)

I'm pretty sure that the sections I quoted (in 7.7.7 in the draft
document I have --  hopefully the same as what you have available?)
cover this topic. In particular, NATURAL and USING joins are not the
same as other inner or outer joins in the resulting set of available
columns. So there are two issues here which I hope to clarify: scoping
on joins, and NATURAL and USING join column sets.
                       - Thomas


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> In particular, how can

>   select t1.*, t2.* from t1 natural join t2;

> produce columns from t1 and t2 which are *not present* in the join "t1
> natural join t2"?

Very easily ;-)

>> What's even more interesting is that I can find no equivalent
>> text in SQL99.

> Of course. When they bloated the spec by a factor of three or four, they
> had to leave out the clear parts to save space ;)

Or they realized they blew it the first time.

> I'm pretty sure that the sections I quoted (in 7.7.7 in the draft
> document I have --  hopefully the same as what you have available?)
> cover this topic. In particular, NATURAL and USING joins are not the
> same as other inner or outer joins in the resulting set of available
> columns.

There's no question about what happens as far as the output of the join
is concerned.  However, 7.7.7 does not say word one about what is
implied by direct access (ie, qualified-name access) to the component
tables of the join.

I've been through the SQL99 draft again, and there is quite clearly NOT
any restriction corresponding to the old 6.4.2.b; so under SQL99 it is
legal to refer to A.ID and B.ID.  However, they do still have the idea
that A.* should omit ID: 7.11 <query specification> syntax rule 7.g.i
(concerning expansion of qualified asterisks) says
             i) If the basis is a <table or query name> or <correlation                name>, then let TQ be the table
associatedwith the basis.                The <select sublist> is equivalent to a <value expression>
sequencein which each <value expression> is a column                reference CR that references a column of TQ that is
not               a common column of a <joined table>. Each column of TQ                that is not a referenced common
columnshall be referenced                exactly once. The columns shall be referenced in the                ascending
sequenceof their ordinal positions within TQ.
 

which is essentially taken from 7.9.4 of the old spec.  This is a mess;
I wonder if the discrepancy between qualified-name access and asterisk
expansion is deliberate?  (Perhaps they felt that allowing qualified
name access was an extension that wouldn't break old code, but that they
couldn't change the asterisk expansion rule without breaking backwards
compatibility?)  It'd be nice to see if this is still true in SQL99
final.

> So there are two issues here which I hope to clarify: scoping
> on joins, and NATURAL and USING join column sets.

Two issues?  I thought we were only arguing about the latter one.
        regards, tom lane


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
I said:
>                  The <select sublist> is equivalent to a <value expression>
>                  sequence in which each <value expression> is a column
>                  reference CR that references a column of TQ that is not
>                  a common column of a <joined table>.

> which is essentially taken from 7.9.4 of the old spec.  This is a mess;

In fact, after looking at it again, I realize that the quoted text is
*wrong*, because it does not say what they presumably intended.  As
written, it appears thatSELECT J.* FROM (A NATURAL JOIN B) J
should omit the common column(s).  They're common columns of a <joined
table>, aren't they?

A lawyer would probably point out that 7.7 does not define the phrase
"common column".  It defines "common column name".  Common column name
clearly applies to all three tables involved (both input tables and the
output table), but it's not so clear whether "common column" is intended
to do so.

One could also wonder about the intended behavior of multi-level joins.
Does a column of a base table become inaccessible if it is used as a
common column several JOIN levels up?

At best, this part of the spec is extremely poorly written.
        regards, tom lane


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Thomas Lockhart
Date:
> > So there are two issues here which I hope to clarify: scoping
> > on joins, and NATURAL and USING join column sets.
> Two issues?  I thought we were only arguing about the latter one.

Well, I prefer to consider it "discussing" ;)

And there are two issues. I'll bet lunch and dinner that SQL99 did *not*
make radical changes in the scoping rules for join syntax vis a vis
SQL92. Certainly something compatible with SQL92 should have a shot at
being also compatible under SQL99, and scoping rules would fall into
that category.

On the second topic, NATURAL and USING join column sets, I believe that
it *must* be true that the set of columns available in a natural join
result (e.g. the result of
 A NATURAL JOIN B

) is the complete set of columns available to a SELECT target list, to a
WHERE qualification, etc. D&D's description of the effects of this
"interpretation" are consistant and clear (where the spec is not). I'm
not sure how we can allow our interpretation to be at odds with the
SQL92 spec or with a reading of the SQL99 draft I have available. In
particular, the rules for forming join results seem to cover the cases
we are discussing, and I read them as being consistant with D&D's SQL92
discussion. btw, their appendix on the upcoming "SQL3" does not bring up
join results or join scoping as among the changes in the upcoming
standard, though of course that is not a definitive point.

Date and Darwen have imho a very clear description of the scoping
allowed in join syntax. That scoping discussion says very clearly that a
"range variable" (SQL9x "correlation name") becomes the only allowed
qualification to a column name in SELECT target lists, WHERE clauses,
etc etc. They have very specific examples to clarify the point. And they
deem that necessary because the spec is a PITA to wade through. I'd
rather leave it to them to do the wading ;)

Let's look for counterexamples in our other texts if you are really
uncomfortable with the SQL92 (and SQL99?) result in D&D. I have another
book or two, and will look through them tonight. Does anyone else want
to jump in, esp. if you have experience with the SQL9x conventions or
have access to a db which already implements it?
                    - Thomas


Re: Table name scope (was Re: Outer joins aren't working with views)

From
Tom Lane
Date:
>>>> So there are two issues here which I hope to clarify: scoping
>>>> on joins, and NATURAL and USING join column sets.

I've been looking some more at this business, and I have found one of
the reasons that I was confused.  The SQL92 spec says (6.3 syntax rule
2)
        2) Case:
           a) If a <table reference> TR is contained in a <from clause> FC             with no intervening <derived
table>,then the scope clause             SC of TR is the <select statement: single row> or innermost             <query
specification>that contains FC. The scope clause of             the exposed <correlation name> or exposed <table name>
ofTR             is the <select list>, <where clause>, <group by clause>, and             <having clause> of SC,
togetherwith the <join condition> of             all <joined table>s contained in SC that contains TR.
 
           b) Otherwise, the scope clause SC of TR is the outermost <joined             table> that contains TR with no
intervening<derived table>.             The scope of the exposed <correlation name> or exposed <table             name>
ofTR is the <join condition> of SC and of all <joined             table>s contained in SC that contain TR.
 

I mistakenly read this with the assumption that <derived table> means
a sub-SELECT.  It does mean that, but it also means a <joined table>,
*if and only if* that joined table is labeled with a <correlation name>.
The relevant productions are:
        <table reference> ::=               <table name> [ [ AS ] <correlation name>                   [ <left paren>
<derivedcolumn list> <right paren> ] ]             | <derived table> [ AS ] <correlation name>                   [
<leftparen> <derived column list> <right paren> ]             | <joined table>
 
        <derived table> ::= <table subquery>
        <table subquery> ::= <subquery>
        <subquery> ::= <left paren> <query expression> <right paren>
        <query expression> ::=               <non-join query expression>             | <joined table>

So "(<joined table>) AS foo" has a <subquery> but "<joined table>" doesn't.
AFAICT, this means that table references defined within the join are
invisible outside "(<joined table>) AS foo", but they are visible
outside a plain "<joined table>".  This is more than a tad bizarre
... but it explains the examples you quoted from Date and Darwen.

However, as long as a table reference is visible, I think that the
set of qualified column names available from it should not depend on
whether it came from inside a JOIN expression or not.  Comments?
        regards, tom lane