Re: Regression from 7.3 to 7.4 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Regression from 7.3 to 7.4
Date
Msg-id 13157.1081185350@sss.pgh.pa.us
Whole thread Raw
In response to Regression from 7.3 to 7.4  (Dennis Bjorklund <db@zigo.dhs.org>)
Responses Re: Regression from 7.3 to 7.4  (Dennis Bjorklund <db@zigo.dhs.org>)
Re: Regression from 7.3 to 7.4  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Dennis Bjorklund <db@zigo.dhs.org> writes:
> This testcase works in 7.3 but not in 7.4:

> create table t1 (a int);
> create table t2 (b int);
> select * from t1, (select b as a from t2 group by a) as foo;

> ERROR:  column "t2.b" must appear in the GROUP BY clause or be used in an 
> aggregate function

> I don't know if it's supposed to work to do a group by of an alias, but 
> why not and it used to work.

This example strikes me as a good reason why we ought to deprecate and
eventually remove the capability for GROUP BY to reference output-list
aliases.  This is not legal per SQL spec, but we have accepted it for a
few versions now (I think since about 6.5).

The reason for the change in behavior is that 7.4 is trying to be
helpful for some related errors.  The subselect "foo" is not allowed to
reference columns of t1.  In 7.3 you'd get a plain "not found" error:

regression=# select * from t1, (select a from t2) as foo;
ERROR:  Attribute "a" not found

where now you get

regression=# select * from t1, (select a from t2) as foo;
ERROR:  subquery in FROM may not refer to other relations of same query level

The problem is that this check is made after parsing the subquery, so if
the subquery is internally inconsistent you'll get the subquery-local
error first.  In your example the subquery thinks it has a "GROUP BY
outer-reference" and so the check for ungrouped columns fails.

This change was made partly because it seemed to yield more useful error
messages, and partly in anticipation of supporting the SQL99 LATERAL
feature, in which such references *are* legal.  So while reverting the
change is one possible answer, it doesn't seem forward-looking.

Another tweak we could make is to cause findTargetlistEntry() to look
only for local variable names before looking for targetlist alias
matches.  This would effectively change the precedence for resolving
"GROUP BY x" to be (1) x as a local variable, (2) x as a targetlist
alias, (3) x as an outer variable; whereas the present search order is
(1), (3), (2).  AFAICS this does not break compatibility with either
SQL92 or SQL99 because both of them allow only case (1).  However this
could break existing queries that are relying on the non-aliased
behavior.  (Offhand I can't think of a really good reason to GROUP BY
an outer reference, but maybe there is one.)

Or we could bite the bullet and deprecate/remove the alias-reference
feature.  I think I was the one who put it in originally for GROUP BY,
but in hindsight it was a terrible idea --- it's caused way more
confusion than it's worth.

Comments?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: Socket communication for contrib
Next
From: Jeff
Date:
Subject: Re: Socket communication for contrib