Thread: GROUP BY + join regression in 7.3
I have just noticed that 7.3 and CVS tip reject a query that was accepted in earlier releases: regression=# create table t1(f1 int, f2 int); CREATE TABLE regression=# create table t2(f2 int, f3 int); CREATE TABLE regression=# select t1.f1 from t1 join t2 on (t1.f2=t2.f2) group by f1; ERROR: Attribute t1.f1 must be GROUPed or used in an aggregate function The difficulty here is that in 7.3, the unqualified name "f1" is treated as a Var of the unnamed JOIN relation, which is not equal() to "t1.f1". It's possible that we could avoid associating Vars with the JOIN relation in the case of JOINs that have no alias, but I'm not convinced this would work; it took a couple of iterations to get to the present representation of JOIN variables and I no longer recall all the reasons why simpler representations failed :-( What I'm thinking of doing is applying flatten_join_alias_vars() in parse_agg.c before checking for invalid grouping --- that replaces "f1" with "t1.f1" so that the above works. But first, I thought I'd ask if anyone thinks the above error message is correct behavior? The SQL spec seems quite unspecific about this --- it says that the SELECT-list item must be "a reference to a grouping column", but I couldn't find anywhere that specifically said whether "f1" and "t1.f1" are to be considered the same column in this scenario. There are related cases in which they are definitely not the same thing, egselect ... from t1 right join t2 using (f1) "f1" will never be NULL in the result of this FROM clause, but "t1.f1" could be NULL. ("f1" would really be "t2.f1" here.) In the plain inner-join case, there isn't any semantic difference between "f1" and "t1.f1", but does that mean we should treat them as the same grouping column? I'm not sure. Anyone want to try these cases on other DBMSes? regards, tom lane
Tom Lane wrote: > I have just noticed that 7.3 and CVS tip reject a query that was > accepted in earlier releases: > > regression=# create table t1(f1 int, f2 int); > CREATE TABLE > regression=# create table t2(f2 int, f3 int); > CREATE TABLE > regression=# select t1.f1 from t1 join t2 on (t1.f2=t2.f2) group by f1; > ERROR: Attribute t1.f1 must be GROUPed or used in an aggregate function [...snip...] > In the plain inner-join case, there isn't any semantic difference > between "f1" and "t1.f1", but does that mean we should treat them > as the same grouping column? I'm not sure. Anyone want to try these > cases on other DBMSes? > I did the above on MSSQL2000 -- it works with no error. I haven't been able to get at a development Oracle db, but I'll try to tomorrow. Joe
Joe Conway <mail@joeconway.com> writes: > I did the above on MSSQL2000 -- it works with no error. Just out of curiosity --- does MSSQL treat "f1" and "t1.f1" as different in the RIGHT JOIN variant case I mentioned? Previous reports have led me to have a very low estimate of their [understanding of/willingness to enforce] subtle distinctions ... regards, tom lane
Tom Lane wrote: > Just out of curiosity --- Here's what I get: select * from t1 f1 f2 ----------- ----------- 1 1 2 2 (2 row(s) affected) select * from t2 f2 f3 ----------- ----------- 1 3 (1 row(s) affected) select t1.f2 from t2 right join t1 on t1.f2 = t2.f2 group by f2 Server: Msg 209, Level 16, State 1, Line 1 Ambiguous column name 'f2'. select f2 from t2 right join t1 on t1.f2 = t2.f2 group by t1.f2 Server: Msg 209, Level 16, State 1, Line 1 Ambiguous column name 'f2'. select f2 from t2 right join t1 on t1.f2 = t2.f2 group by f2 Server: Msg 209, Level 16, State 1, Line 1 Ambiguous column name 'f2'. Server: Msg 209, Level 16, State 1, Line 1 Ambiguous column name 'f2'. select t1.f2 from t2 right join t1 on t1.f2 = t2.f2 group by t1.f2 f2 ----------- 1 2 (2 row(s) affected) Any other variations you want to see? Joe
On Tue, 01 Apr 2003 00:29:46 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Just out of curiosity --- does MSSQL treat "f1" and "t1.f1" as different >in the RIGHT JOIN variant case I mentioned? MSSQL7: SELECT t1.f1 FROM t1 {INNER | LEFT | RIGHT} JOIN t2 ON (t1.f2=t2.f2) GROUP BY f1 all run without an error. ORACLE7: JOIN syntax not available, but SELECT t1.f1 FROM t1, t2 WHERE (t1.f2=t2.f2) GROUP BY f1; SELECT t1.f1 FROM t1, t2WHERE t1.f2=t2.f2(+) GROUP BY f1; SELECT t1.f1 FROM t1, t2 WHERE t1.f2(+)=t2.f2 GROUP BY f1; all work. ServusManfred
Manfred Koizar <mkoi-pg@aon.at> writes: > MSSQL7: > SELECT t1.f1 > FROM t1 {INNER | LEFT | RIGHT} JOIN t2 ON (t1.f2=t2.f2) > GROUP BY f1 > all run without an error. > ORACLE7: JOIN syntax not available, but > SELECT t1.f1 FROM t1, t2 WHERE (t1.f2=t2.f2) GROUP BY f1; > SELECT t1.f1 FROM t1, t2 WHERE t1.f2=t2.f2(+) GROUP BY f1; > SELECT t1.f1 FROM t1, t2 WHERE t1.f2(+)=t2.f2 GROUP BY f1; > all work. Okay, that seems to pretty much prove the point. I'll fix parse_agg. regards, tom lane