Thread: names in WHERE and HAVING

names in WHERE and HAVING

From
Zachary Beane
Date:
I was reading the docs about SELECT and I came across this:

name

      Specifies another name for a column or an expression using the
      AS clause.  This name is primarily used to label the column for
      display. It can also be used to refer to the column's value in
      ORDER BY and GROUP BY clauses. But the name cannot be used in
      the WHERE or HAVING clauses; write out the expression instead.

What is the reason for this restriction? It would be a great
convenience if it were lifted.

I have several queries against another database system that do
something like this:

   select name, count(*) as score
   from mytable
   group by name
   having score > 10

PostgreSQL doesn't like them...

Zach
--
xach@xach.com     Zachary Beane     http://www.xach.com/

Re: names in WHERE and HAVING

From
Tom Lane
Date:
Zachary Beane <xach@xach.com> writes:
>       AS clause.  This name is primarily used to label the column for
>       display. It can also be used to refer to the column's value in
>       ORDER BY and GROUP BY clauses. But the name cannot be used in
>       the WHERE or HAVING clauses; write out the expression instead.

> What is the reason for this restriction?

According to the letter of the SQL spec, ORDER BY is the *only* one
of these clauses where output-column names can be used; we're violating
the spec even to allow output columns in GROUP BY.  The spec is written
that way because it has a very clear model of computation for SELECT:

    1. Select raw rows from source tables (possibly joined)
    2. Eliminate rows that fail the WHERE clause
    3. Perform grouping/aggregation, if any is specified
    4. Eliminate (grouped) rows that fail the HAVING clause
    5. Compute output expressions (SELECT list)
    6. Sort according to ORDER BY, if any
    7. Emit result

According to the spec, output expressions can't be used in WHERE, GROUP
BY, or HAVING because they haven't been computed yet.

We actually extend the spec somewhat for ORDER BY and GROUP BY, because
we accept either a bare output column name or an expression using input
column names for both of 'em; the spec countenances *only* bare output
column names for ORDER BY, *only* bare input column names for GROUP BY.

This extension already creates confusion and ambiguity --- which has to
be resolved differently in the two clauses to be compatible with the
cases that are required by the spec.  For example, if table foo has
column bar, consider
    SELECT trunc(-bar/2) AS bar, count(*)
    FROM foo GROUP BY bar ORDER BY bar
The spec says that this query is legal and must be interpreted as
grouping by the source column bar (which produces different results
than grouping by the output column named bar) and then ordered by the
output column bar (again, not the same result as the other choice).

Extending this ambiguity even further to allow expressions involving
output column names would be a mistake IMHO.  I'm not that eager to
propagate it into WHERE and HAVING, either --- but there's relatively
little use in allowing output-column references in WHERE and HAVING
unless they can be inside expressions.

> I have several queries against another database system that do
> something like this:

I'd be interested to know whether your unnamed other database meets
the letter of the spec on ambiguous cases like the above...

            regards, tom lane