Thread: BUG #4173: Illogical error message with aggregates + order by

BUG #4173: Illogical error message with aggregates + order by

From
"Chris West (Faux)"
Date:
The following bug has been logged online:

Bug reference:      4173
Logged by:          Chris West (Faux)
Email address:      pfx-psql@goeswhere.com
PostgreSQL version: 8.3.1
Operating system:   linux (debian lenny)
Description:        Illogical error message with aggregates + order by
Details:

-- Prelim SQL:

  create temporary table foo ( bar integer );
  insert into foo (bar) values (3),(1),(2);

-- Simple example:

Consider the following SQL:

  select count(bar) from foo order by bar;

For this input, both postgresql-8.2* and -8.3.* output:

ERROR:  column "foo.bar" must appear in the GROUP BY clause or be used in an
aggregate function

This is misleading as COUNT() is clearly[0] an aggregate function, and I
believe the SQL to be valid (although I could not provide a standard
reference); MySQL and SQLite (I'm sorry) both accept it. The situation
probably does not arise in these servers, however.

-- Motivating example:

Using the aggregate array_accum from the documentation[1]:

  CREATE AGGREGATE array_accum (
    sfunc = array_append,
    basetype = anyelement,
    stype = anyarray,
    initcond = '{}'
  );

This creates an array of all the values accumulated into it; for instance:

  select array_accum(bar) from foo

Produces:

  {3,1,2}

The following is invalid, as above, but it is the code that I would expect
to run to recieve {1,2,3}:

  select array_accum(bar) from foo order by bar;

The result of using group by, just in case it's not immediately obvious to
everyone (ha ha):

  select array_accum(bar) from foo group by bar order by bar;

Is:

  {1}
  {2}
  {3}

i.e. Not intended (but correct).

-- Workaround:

The only alternative implementation of this I can think of that works in the
general case (the array above, for example, could be externally sorted; this
would not work if the output array were to be sorted in relation to a
different column from "foo"), is something of the form:

  select array_accum(bar) from
    (select bar from foo order by bar) as pony

This is slow (~5 times slower on my real data) when the subquery will then
require a WHERE clause.

-- Summary

In summary, at least the error message is wrong, and I strongly believe that
the error condition itself is wrong.

--

gpg@goeswhere.com: 0xA482EE24; fingerprint:
34F5 5032 D173 76AA 0412 6117 7835 5BD4 A482 EE24

[0] http://www.postgresql.org/docs/8.3/static/functions-aggregate.html

[1] http://www.postgresql.org/docs/8.3/static/xaggr.html

Re: BUG #4173: Illogical error message with aggregates + order by

From
Tom Lane
Date:
"Chris West (Faux)" <pfx-psql@goeswhere.com> writes:
> Consider the following SQL:

>   select count(bar) from foo order by bar;

> For this input, both postgresql-8.2* and -8.3.* output:

> ERROR:  column "foo.bar" must appear in the GROUP BY clause or be used in an
> aggregate function

It's complaining about the occurrence of "bar" in ORDER BY, which is not
within an aggregate function.

I'm not sure that we can do much with the wording of the message.
What would be more helpful here is a syntax error pointer, to keep
you from thinking that the occurrence of bar over in the count()
has got anything to do with it.

> I believe the SQL to be valid (although I could not provide a standard
> reference);

It is not.  I believe the relevant restriction in SQL99 is 14.1
syntax rule 18.f.i.2.A.I: "T shall not be a grouped table" if
you are trying to use an ORDER BY key that isn't the name of an
output column of the SELECT list.  The point here really is that
there is no well-defined value of bar to order the single output row
with.  The fact that the ordering is pretty much a no-op doesn't
excuse a semantically nonsensical ORDER BY specification.

            regards, tom lane
On Thu, 15 May 2008 17:34:11 +0100, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> It's complaining about the occurrence of "bar" in ORDER BY, which is not
> within an aggregate function.
>
> I'm not sure that we can do much with the wording of the message.
> What would be more helpful here is a syntax error pointer, to keep
> you from thinking that the occurrence of bar over in the count()
> has got anything to do with it.

Yeah, that'd be nice. :)

Thanks anyway, Tom. :)


--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/