Thread: RE: [HACKERS] Enhancing PGSQL to be compatible with Informix SQL
>> > Yes, OUTER is an Informix-ism. Oracle uses *=. I think the first is >> > easier to add and makes more sense for us. *= could be defined by >> > someone as an operator, and overloading our already complex operator >> > code to do *= for OUTER may be too complex for people to understand. >> > >> > It would be: >> > >> > SELECT * >> > FROM tab1, OUTER tab2 >> > WHERE tab1.col1 = tab2.col2 >> >> What about >2 table joins? Wish I had my book here, but I though tyou >> could do multiple OUTER joins, no? Oracle uses a syntax which I quite like. The query above would become: SELECT * FROM tab, tab2 WHERE tab1.col1 = tab2.col2 (+) I've actually used queries something like this: SELECT blah, blah, blah FROM t1, t2, t3, t4 WHERE t1.start_date BETWEEN t2.start_date (+) AND t2.end_date (+) AND t1.y = t2.y (+) AND t3.x (+) = t1.x AND t3.y (+) = t1.y AND t4.x = t1.x; For example... I realise that this is not standard, but it's easy to read, and easy to develop. The problem with OUTER is: OUTER on which relationship? Does this matter? I haven't thought about it hugely, but it may not make sense when you try to do this: SELECT * FROM t1, OUTER t2, t3 WHERE t1.x = t2.x AND t2.y = t3.y Which is the OUTER join? Outer joining to t1 and inner joining to t3 gives (I think) a different result to inner joining to t1 and outer joining to t3. Then you have to start creating language rules to help determine which join becomes the outer join, and it becomes a bit of a mess. With Oracle's notation, it's pretty clear (I think anyway). Hope this adds some fuel to the process... MikeA
> SELECT blah, blah, blah > FROM t1, t2, t3, t4 > WHERE t1.start_date BETWEEN t2.start_date (+) AND t2.end_date (+) > AND t1.y = t2.y (+) > AND t3.x (+) = t1.x > AND t3.y (+) = t1.y > AND t4.x = t1.x; > > For example... > > I realise that this is not standard, but it's easy to read, and easy to > develop. > > The problem with OUTER is: OUTER on which relationship? Does this matter? > I haven't thought about it hugely, but it may not make sense when you try to > do this: > > SELECT * > FROM t1, OUTER t2, t3 > WHERE t1.x = t2.x > AND t2.y = t3.y > > Which is the OUTER join? Outer joining to t1 and inner joining to t3 gives > (I think) a different result to inner joining to t1 and outer joining to t3. > Then you have to start creating language rules to help determine which join > becomes the outer join, and it becomes a bit of a mess. With Oracle's > notation, it's pretty clear (I think anyway). This must be why the ANSI standard requires you to specify the join when doing outer. Thomas says we are going only with ANSI syntax, and I can see now why OUTER is just looking for problems. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
"Ansley, Michael" wrote: > > >> > Yes, OUTER is an Informix-ism. Oracle uses *=. I think the first is > >> > easier to add and makes more sense for us. *= could be defined by > >> > someone as an operator, and overloading our already complex operator > >> > code to do *= for OUTER may be too complex for people to understand. > >> > > >> > It would be: > >> > > >> > SELECT * > >> > FROM tab1, OUTER tab2 > >> > WHERE tab1.col1 = tab2.col2 > >> > >> What about >2 table joins? Wish I had my book here, but I though tyou > >> could do multiple OUTER joins, no? > > Oracle uses a syntax which I quite like. The query above would become: > > SELECT * > FROM tab, tab2 > WHERE tab1.col1 = tab2.col2 (+) > > I've actually used queries something like this: > > SELECT blah, blah, blah > FROM t1, t2, t3, t4 > WHERE t1.start_date BETWEEN t2.start_date (+) AND t2.end_date (+) > AND t1.y = t2.y (+) > AND t3.x (+) = t1.x > AND t3.y (+) = t1.y > AND t4.x = t1.x; > > For example... > > I realise that this is not standard, but it's easy to read, and easy to > develop. I completely agree that Oracle has got it in a very clear, readable and understandable way. When I used MS Access (supposedly ANSI) I always created the outer join queries using the graphical tool and also had to examine it using said tool, because all these LEFT OUTER JOIN ON .... introduced too much line noise for me to be able to understand what was actually meant. OTOH, just marking the "outer" side with (+) was easy both to to read and write. So I would very much like to have the Oracle syntax for outer joins as well. IMHO the ANSI standard (as anything designed by a committee) is not always the best way to do things. -------------- Hannu
At 11:08 PM 1/6/00 +0200, Ansley, Michael wrote: >>> What about >2 table joins? Wish I had my book here, but I though tyou >>> could do multiple OUTER joins, no? > >Oracle uses a syntax which I quite like. The query above would become: > >SELECT * >FROM tab, tab2 >WHERE tab1.col1 = tab2.col2 (+) > >I've actually used queries something like this: > >SELECT blah, blah, blah >FROM t1, t2, t3, t4 >WHERE t1.start_date BETWEEN t2.start_date (+) AND t2.end_date (+) >AND t1.y = t2.y (+) >AND t3.x (+) = t1.x >AND t3.y (+) = t1.y >AND t4.x = t1.x; Good...you saved me the trouble of digging out some examples from the code I'm porting, which occasionally due similar things :) I think the ANSI SQL 92 equivalent is something like: select ... from t1 inner join t4 on t1.x=t4.x, t2 left outer join t1 on t2.y=t1.y and (t1.start_date between t2.start_dateand t1.start_date), t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; I've never used an ANSI SQL 92 compliant RDBMS, I'm not sure if t2/t1 become ambiguous and need to be given different names using "as foo" in each case, etc. Actually, you would in order to build the target list unambiguously I guess... But that's the general gist. I think - Thomas, am I at all close? Of course, you can continue to write the inner join in the old way: select ... from t1 inner join t2 on t1.x=t2.x; and select ... from t1,t2 where t1.x=t2.x; where the last form of the inner join might be considered an optimization of a cross-join restricted by a boolean expression in the where clause rather than a proper inner join. In other words, the two queries return the same rows and one would be very disappointed if the second form formed the cartesian product of t1 and t2 and then filtered the resulting rows rather than do an inner join! - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 01:47 AM 1/7/00 +0200, Hannu Krosing wrote: >IMHO the ANSI standard (as anything designed by a committee) is not always the >best >way to do things. Well, generally standards committees don't design features. They're usually designed by one or two people, then submitted to the committee for discussion and eventual adoption or rejection. My understanding from reading Date is that one reason for not adopting the common vendor hacks for outer joins is that outer joins aren't associative and the result of complicated expressions in some cases will depend on the order in which the RDBMS' optimizer chooses to execute them. Putting on my compiler-writer hat, I can see where having joins explicitly declared in the "from" clauses rather than derived from an analysis of the "from" and "where" clauses might well simplify the development of a new SQL 92 RDBMS if one were to start from scratch. It's cleaner, IMO. This doesn't apply to Postgres, since the outer joins are being shoe-horned into existing data structures. Of course, I speak as someone without a lot of experience writing Oracle or Informix SQL. If you're used to Oracle, then it's not surprising you find its means of specifying an outer join the most natural and easiest to understand... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> select ... > from t1 inner join t4 on t1.x=t4.x, > t2 left outer join t1 > on t2.y=t1.y and > (t1.start_date between t2.start_date and t1.start_date), > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; Let's be honest, folks. This is almost unreadable. I think we will need some simpler way to access _outer_ in addition to the ANSI way. I can't imagine how I would answer a question: "How do I do an ANSI outer join". It would need its own FAQ page. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Thu, 6 Jan 2000, Bruce Momjian wrote: > > select ... > > from t1 inner join t4 on t1.x=t4.x, > > t2 left outer join t1 > > on t2.y=t1.y and > > (t1.start_date between t2.start_date and t1.start_date), > > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; > > Let's be honest, folks. This is almost unreadable. I think we will > need some simpler way to access _outer_ in addition to the ANSI way. > > I can't imagine how I would answer a question: "How do I do an ANSI > outer join". It would need its own FAQ page. How do the "books" talk about JOINs? What is the semi-standard syntax that is generally used in samples?
> >SELECT blah, blah, blah > >FROM t1, t2, t3, t4 > >WHERE t1.start_date BETWEEN t2.start_date (+) AND t2.end_date (+) > >AND t1.y = t2.y (+) > >AND t3.x (+) = t1.x > >AND t3.y (+) = t1.y > >AND t4.x = t1.x; > I think the ANSI SQL 92 equivalent is something like: > select ... > from t1 inner join t4 on t1.x=t4.x, > t2 left outer join t1 > on t2.y=t1.y and > (t1.start_date between t2.start_date and t1.start_date), > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; Hmm. I'm not sure what the Oracle example actually gives as a result, and I find the syntax as confusing as others find SQL92 syntax ;) > I've never used an ANSI SQL 92 compliant RDBMS, I'm not sure > if t2/t1 become ambiguous and need to be given different names > using "as foo" in each case, etc. Actually, you would in > order to build the target list unambiguously I guess... Once two tables are mentioned in an "outer join", then individual columns can no longer be qualified by the original table names. Instead, you are allowed to put table and column aliases on the join expression: select a, b, c, z from (t1 left join t2 using (x)) as j1 (a, b, c) right join t3 on (j1.a = t3.y); (I think I have this right; I'm doing it from memory and have been away from it for a little while). - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
> > select ... > > from t1 inner join t4 on t1.x=t4.x, > > t2 left outer join t1 > > on t2.y=t1.y and > > (t1.start_date between t2.start_date and t1.start_date), > > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; > Let's be honest, folks. This is almost unreadable. I think we will > need some simpler way to access _outer_ in addition to the ANSI way. Nonsense! Especially since this isn't quite SQL92. Here is an SQL92 query (I think ;) : select a, b, c from (t1 left join t2 using (x)) as j1 (a, b) right join t3 on (j1.a = t3.y); So you do a left join with t1 and t2, name the resulting intermediate table and columns, and then do a right join of the result with t3. I can't see other syntaxes being very much more obvious, particularly wrt predicting the actual result. Just because a query looks simpler doesn't necessarily mean that the syntax alway produces a more robust query. > I can't imagine how I would answer a question: "How do I do an ANSI > outer join". It would need its own FAQ page. Well, *you're* the one writing the book :)) - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
On Fri, 7 Jan 2000, Thomas Lockhart wrote: > > > select ... > > > from t1 inner join t4 on t1.x=t4.x, > > > t2 left outer join t1 > > > on t2.y=t1.y and > > > (t1.start_date between t2.start_date and t1.start_date), > > > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; > > Let's be honest, folks. This is almost unreadable. I think we will > > need some simpler way to access _outer_ in addition to the ANSI way. > > Nonsense! Especially since this isn't quite SQL92. Here is an SQL92 > query (I think ;) : > > select a, b, c > from (t1 left join t2 using (x)) as j1 (a, b) > right join t3 on (j1.a = t3.y); > > So you do a left join with t1 and t2, name the resulting intermediate > table and columns, and then do a right join of the result with t3. I > can't see other syntaxes being very much more obvious, particularly > wrt predicting the actual result. Just because a query looks simpler > doesn't necessarily mean that the syntax alway produces a more robust > query. > This always strikes me as very much an each-to-his-own situation. I generally prefer the oracle syntax myself; whilst there are potential ambiguities (which oracle gets around by not executing ambiguous queries), it's cleaner to write. That said I don't particularly like SQL itself; if I wanted to program COBOL I'd get a COBOL compiler:). The SQL92 syntax is more of an SQLism than anything else, and the extra "english" words actually tend to obscure the details of the join. It certainly makes sense to use the SQL92 syntax; it is most important to be compatible with the standards that anything else, but I would still argue that a more straightforward syntax in parallel is probably worthwhile. > > I can't imagine how I would answer a question: "How do I do an ANSI > > outer join". It would need its own FAQ page. > > Well, *you're* the one writing the book :)) > I'd have thought this gave him justtification to complain about your horrible syntax then:) > - Thomas > > -- > Thomas Lockhart lockhart@alumni.caltech.edu > South Pasadena, California > .............................Rod +-----------------------------------------------------------------------------+ | Rod Chamberlin | rod@querix.com Tel +44 1703 232345 | | Software Engineer | Mob +44 7803 295406 | | QueriX | Fax +44 1703 399685 | +-----------------------------------------------------------------------------+ | The views expressed in this document do not necessarily represent those of | | the management of QueriX (UK) Ltd. | +-----------------------------------------------------------------------------+
> > > I can't imagine how I would answer a question: "How do I do an ANSI > > > outer join". It would need its own FAQ page. > > > > Well, *you're* the one writing the book :)) > > > > I'd have thought this gave him justtification to complain about your > horrible syntax then:) The big problem is that is no Thomas's syntax, but the ANSI syntax, and there doesn't seem to be any vendor-neutral solution for outer joins other than the ANSI one. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
At 12:00 AM 1/7/00 -0400, The Hermit Hacker wrote: >On Thu, 6 Jan 2000, Bruce Momjian wrote: > >> > select ... >> > from t1 inner join t4 on t1.x=t4.x, >> > t2 left outer join t1 >> > on t2.y=t1.y and >> > (t1.start_date between t2.start_date and t1.start_date), >> > t3 left outer join t1 on t3.x=t1.x and t3.y = t1.y; >> >> Let's be honest, folks. This is almost unreadable. I think we will >> need some simpler way to access _outer_ in addition to the ANSI way. Well...it took a minute to digest the Oracle version, too. Most joins are far simpler than the example. >How do the "books" talk about JOINs? What is the semi-standard syntax >that is generally used in samples? "SQL for smarties" gives examples of vendor-specific syntax then talks about outer joins more abstractly. It also points out that the existing vendor solutions have weaknesses. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 06:56 AM 1/7/00 +0000, Thomas Lockhart wrote: >Hmm. I'm not sure what the Oracle example actually gives as a result, >and I find the syntax as confusing as others find SQL92 syntax ;) Me too :) As I pointed out in an earlier message, fortunately most of the outer join examples I've seen are simpler, and more readable in either style. Thanks, BTW, for the status update, it's about what I gathered from looking at the code. >Once two tables are mentioned in an "outer join", then individual >columns can no longer be qualified by the original table names. >Instead, you are allowed to put table and column aliases on the join >expression: > >select a, b, c, z > from (t1 left join t2 using (x)) as j1 (a, b, c) > right join t3 on (j1.a = t3.y); > >(I think I have this right; I'm doing it from memory and have been >away from it for a little while). Yeah, I think this is right, I'd seen in the syntax where a general table reference can be a join and hadn't thought about being able to table alias the entire result. This is useful, actually. Without the column aliases something like: select j1.a, j1.b, j2.foo ... makes it clear as to which join a column comes from. This clarity's often lacking in the Oracle-style queries, as I've noticed when I decipher them during my port-to-Postgres work. You need to unwind what comes from where, and often have to look at the data model to figure it out if the names are unique to the different tables and not fully qualified as "table_name.column_name". - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.