Thread: Why does aggregate query allow select of non-group by or aggregate values?
Why does aggregate query allow select of non-group by or aggregate values?
From
Jack Christensen
Date:
CREATE TABLE people( id serial PRIMARY KEY, name varchar NOT NULL ); INSERT INTO people(name) VALUES('Adam'), ('Adam'), ('Adam'), ('Bill'), ('Sam'), ('Joe'), ('Joe'); SELECT name, count(*), random() FROM people GROUP BY name; I would expect this query to cause an error because of random(). I ran into this using an array produced by a subquery as a column in the select of an aggregate query, but I was able to boil it down to this contrived example. Shouldn't any expression that is not in the group by or an aggregate function be rejected? What am I not understanding? Thanks. -- Jack Christensen jackc@hylesanderson.edu
Re: Why does aggregate query allow select of non-group by or aggregate values?
From
Henry Drexler
Date:
On Fri, Dec 9, 2011 at 5:48 PM, Jack Christensen <jackc@hylesanderson.edu> wrote:
CREATE TABLE people(
id serial PRIMARY KEY,
name varchar NOT NULL
);
INSERT INTO people(name) VALUES('Adam'), ('Adam'), ('Adam'), ('Bill'), ('Sam'), ('Joe'), ('Joe');
SELECT name, count(*), random()
FROM people
GROUP BY name;
I would expect this query to cause an error because of random(). I ran into this using an array produced by a subquery as a column in the select of an aggregate query, but I was able to boil it down to this contrived example. Shouldn't any expression that is not in the group by or an aggregate function be rejected?
What am I not understanding?
Thanks.
--
Jack Christensen
jackc@hylesanderson.edu
I don't know the answer, but I would guess that as random() is not known before hand , it has no reason being in the grouped by.
Re: Why does aggregate query allow select of non-group by or aggregate values?
From
"David Johnston"
Date:
-----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Jack Christensen Sent: Friday, December 09, 2011 5:48 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Why does aggregate query allow select of non-group by or aggregate values? CREATE TABLE people( id serial PRIMARY KEY, name varchar NOT NULL ); INSERT INTO people(name) VALUES('Adam'), ('Adam'), ('Adam'), ('Bill'), ('Sam'), ('Joe'), ('Joe'); SELECT name, count(*), random() FROM people GROUP BY name; I would expect this query to cause an error because of random(). I ran into this using an array produced by a subquery as a column in the select of an aggregate query, but I was able to boil it down to this contrived example. Shouldn't any expression that is not in the group by or an aggregate function be rejected? What am I not understanding? Thanks. ---------------------------------------------------------------------------- ------- Functions are evaluated once for each row that it generated by the surrounding query. This is particularly useful if the function in question takes an aggregate as an input: SELECT col1, array_processing_function( ARRAY_AGG( col2 ) ) FROM table GROUP BY col1; Without this particular behavior you would need to sub-query. From a layman's perspective the reason why you cannot use non-aggregates outside of GROUP BY it that it is ambiguous as to what value to output; with an uncorrelated function call that is not the case. David J.
Re: Why does aggregate query allow select of non-group by or aggregate values?
From
Adrian Klaver
Date:
On 12/09/2011 02:48 PM, Jack Christensen wrote: > CREATE TABLE people( > id serial PRIMARY KEY, > name varchar NOT NULL > ); > > INSERT INTO people(name) VALUES('Adam'), ('Adam'), ('Adam'), ('Bill'), > ('Sam'), ('Joe'), ('Joe'); > > SELECT name, count(*), random() > FROM people > GROUP BY name; > > > I would expect this query to cause an error because of random(). I ran > into this using an array produced by a subquery as a column in the > select of an aggregate query, but I was able to boil it down to this > contrived example. Shouldn't any expression that is not in the group by > or an aggregate function be rejected? > > What am I not understanding? http://www.postgresql.org/docs/9.0/interactive/sql-select.html#SQL-GROUPBY " Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column." My guess, random() does not refer to a column, so it falls outside the above criteria. > > Thanks. > -- Adrian Klaver adrian.klaver@gmail.com
Re: Why does aggregate query allow select of non-group by or aggregate values?
From
Jack Christensen
Date:
On 12/9/2011 4:57 PM, David Johnston wrote: > Functions are evaluated once for each row that it generated by the > surrounding query. This is particularly useful if the function in question > takes an aggregate as an input: > > SELECT col1, array_processing_function( ARRAY_AGG( col2 ) ) > FROM table > GROUP BY col1; > > Without this particular behavior you would need to sub-query. > > > From a layman's perspective the reason why you cannot use non-aggregates > outside of GROUP BY it that it is ambiguous as to what value to output; with > an uncorrelated function call that is not the case. > > David J. > > > > Thanks. This makes sense now. I also went back to the original query that provoked this question. It had a correlated subquery in the select statement. I thought that this could yield ambiguous results. But when I examined it closely, all the correlated fields were included in the group by of the outer query, and when I tried to use a non-grouped column from the outer query I correctly got a ERROR: subquery uses ungrouped column "foo" from outer query Thanks again. -- Jack Christensen jackc@hylesanderson.edu