Thread: SELECT, GROUP BY, and aggregates
Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP BY as implicitly grouping by all the columns that aren't part of an aggregate? If I do this, Postgres throws an exception that I cannot SELECT a series of columns including an aggregate without a corresponding GROUP BY clause. But it knew to throw the error, right? It must have some method of knowing which columns aren't part of an aggregate. Or is it that a column might not have an aggregate, but still be hard to figure out how to group by it? But how would that happen? If I omit something from GROUP BY, it throws another exception. If I put something there that doesn't belong, I get a different exception. So it already knows how to do this! :P -- Regards, Ryan Delaney ryan.delaney@gmail.com https://github.com/rpdelaney GPG ID: 4096R/311C 10F2 26E0 14E3 8BA4 3B06 B634 36F1 C9E7 771B
Attachment
Ryan Delaney <ryan.delaney@gmail.com> writes: > Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP > BY as implicitly grouping by all the columns that aren't part of an aggregate? Per SQL standard, a SELECT with aggregates but no GROUP BY is supposed to give exactly one row. What you suggest would not do that. In general we're not that much into assigning made-up semantics to cases that are specifically disallowed by the spec. It's usually not too obvious what the result "should be", so we run the risk that the SQL committee might someday make a contrary decision. More, this would lose error detection, and reduce interoperability with other DBMSes that follow the spec more faithfully. regards, tom lane
> Ryan Delaney <ryan.delaney@gmail.com> writes: > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP > > BY as implicitly grouping by all the columns that aren't part of an aggregate? I'm Mr. Curious today ... Why would you think that such a thing is necessary or desirable? Simply add the columns to the GROUP BY clause and make the request unambiguous. -- Bill Moran
On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com> wrote:
> Ryan Delaney <ryan.delaney@gmail.com> writes:
> > Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP
> > BY as implicitly grouping by all the columns that aren't part of an aggregate?
I'm Mr. Curious today ...
Why would you think that such a thing is necessary or desirable? Simply add the
columns to the GROUP BY clause and make the request unambiguous.
Where would the ambiguity be?
I waste an inordinate amount of time retyping select lists over into the group by list, or copying and pasting and then deleting the aggregate clauses. It is an entirely pointless exercise. I can't fault PostgreSQL for following the standard, but its too bad the standards aren't more sensible.
Cheers,
Jeff
On Fri, 13 Feb 2015 10:48:13 -0800 Jeff Janes <jeff.janes@gmail.com> wrote: > On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com> > wrote: > > > > Ryan Delaney <ryan.delaney@gmail.com> writes: > > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits > > the GROUP > > > > BY as implicitly grouping by all the columns that aren't part of an > > aggregate? > > > > I'm Mr. Curious today ... > > > > Why would you think that such a thing is necessary or desirable? Simply > > add the > > columns to the GROUP BY clause and make the request unambiguous. > > Where would the ambiguity be? With a large, complex query, trying to visually read through a list of column selections to figure out which ones _aren't_ aggregated and will be auto-GROUP-BYed would be ... tedious and error prone at best. You're right, though, it wouldn't be "ambiguous" ... that was a poor choice of words on my part. > I waste an inordinate amount of time retyping select lists over into the > group by list, or copying and pasting and then deleting the aggregate > clauses. Interesting ... I've never kept accurate track of the time I spend doing things like that, but "inordinate" seems like quite a lot. In my case, I'm a developer so I would tend toward creating code on the client side that automatically compiled the GROUP BY clause if I found that scenarios like you describe were happening frequently. Of course, that doesn't help a data anaylyst who's just writing queries > It is an entirely pointless exercise. I can't fault PostgreSQL > for following the standard, but its too bad the standards aren't more > sensible. I can't speak to the standard and it's reasons for doing this, but there are certainly some whacko things in the standard. Thanks for the response. -- Bill Moran
To lower the amount of time spent copy pasting aggregate column names, it's probably worth noting Postgres will allow you to short cut that with the column position. For example: select long_column_name_A, long_column_name_b, count(1) from foo group by 1,2 order by 1,2 This works just fine. It's not in the spec, but postgres supports it. I'll leave it to others to argue about it being a best practice or not. On Fri, Feb 13, 2015 at 1:57 PM, Bill Moran <wmoran@potentialtech.com> wrote: > On Fri, 13 Feb 2015 10:48:13 -0800 > Jeff Janes <jeff.janes@gmail.com> wrote: > >> On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com> >> wrote: >> >> > > Ryan Delaney <ryan.delaney@gmail.com> writes: >> > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits >> > the GROUP >> > > > BY as implicitly grouping by all the columns that aren't part of an >> > aggregate? >> > >> > I'm Mr. Curious today ... >> > >> > Why would you think that such a thing is necessary or desirable? Simply >> > add the >> > columns to the GROUP BY clause and make the request unambiguous. >> >> Where would the ambiguity be? > > With a large, complex query, trying to visually read through a list of > column selections to figure out which ones _aren't_ aggregated and will > be auto-GROUP-BYed would be ... tedious and error prone at best. > > You're right, though, it wouldn't be "ambiguous" ... that was a poor > choice of words on my part. > >> I waste an inordinate amount of time retyping select lists over into the >> group by list, or copying and pasting and then deleting the aggregate >> clauses. > > Interesting ... I've never kept accurate track of the time I spend doing > things like that, but "inordinate" seems like quite a lot. > > In my case, I'm a developer so I would tend toward creating code on the > client side that automatically compiled the GROUP BY clause if I found > that scenarios like you describe were happening frequently. Of course, > that doesn't help a data anaylyst who's just writing queries > >> It is an entirely pointless exercise. I can't fault PostgreSQL >> for following the standard, but its too bad the standards aren't more >> sensible. > > I can't speak to the standard and it's reasons for doing this, but there > are certainly some whacko things in the standard. > > Thanks for the response. > > -- > Bill Moran > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
-----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Dunavant Sent: Friday, February 13, 2015 2:11 PM To: Bill Moran Cc: Jeff Janes; Ryan Delaney; pgsql-general@postgresql.org Subject: Re: [GENERAL] SELECT, GROUP BY, and aggregates To lower the amount of time spent copy pasting aggregate column names, it's probably worth noting Postgres will allow youto short cut that with the column position. For example: select long_column_name_A, long_column_name_b, count(1) from foo group by 1,2 order by 1,2 This works just fine. It's not in the spec, but postgres supports it. I'll leave it to others to argue about it being a best practice or not. ----------------------- I use this feature a lot. Igor Neyman
I will take a bit of a contrarian position from the OP. I, personally, prefer that computer _languages_ do exactly and only what _I_ tell them to do. I do _NOT_ want them to things for me. IMO, that is why many programs are unreliable. They make an assumption which is not what the original programmer really wanted. Of course, I _do_ like having an good IDE which will help me with suggestions which are based on what I have already typed in and what else is possible. In the OP's case, that would be something which would look at the SQL I have already typed in, and have a "highlighted" set of column names in the GROUP BY based on the names in the SELECT. This would complicate the editor, but (again in my opinion) this is where the help should be available. Basically, I want the _application_ programmer to be responsible for the SELECT, not the SQL engine programmer.
On Fri, Feb 13, 2015 at 1:11 PM, Brian Dunavant <brian@omniti.com> wrote:
To lower the amount of time spent copy pasting aggregate column names,
it's probably worth noting Postgres will allow you to short cut that
with the column position. For example:
select long_column_name_A, long_column_name_b, count(1)
from foo
group by 1,2
order by 1,2
This works just fine. It's not in the spec, but postgres supports it.
I'll leave it to others to argue about it being a best practice or
not.
On Fri, Feb 13, 2015 at 1:57 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Fri, 13 Feb 2015 10:48:13 -0800
> Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com>
>> wrote:
>>
>> > > Ryan Delaney <ryan.delaney@gmail.com> writes:
>> > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits
>> > the GROUP
>> > > > BY as implicitly grouping by all the columns that aren't part of an
>> > aggregate?
>> >
>> > I'm Mr. Curious today ...
>> >
>> > Why would you think that such a thing is necessary or desirable? Simply
>> > add the
>> > columns to the GROUP BY clause and make the request unambiguous.
>>
>> Where would the ambiguity be?
>
> With a large, complex query, trying to visually read through a list of
> column selections to figure out which ones _aren't_ aggregated and will
> be auto-GROUP-BYed would be ... tedious and error prone at best.
>
> You're right, though, it wouldn't be "ambiguous" ... that was a poor
> choice of words on my part.
>
>> I waste an inordinate amount of time retyping select lists over into the
>> group by list, or copying and pasting and then deleting the aggregate
>> clauses.
>
> Interesting ... I've never kept accurate track of the time I spend doing
> things like that, but "inordinate" seems like quite a lot.
>
> In my case, I'm a developer so I would tend toward creating code on the
> client side that automatically compiled the GROUP BY clause if I found
> that scenarios like you describe were happening frequently. Of course,
> that doesn't help a data anaylyst who's just writing queries
>
>> It is an entirely pointless exercise. I can't fault PostgreSQL
>> for following the standard, but its too bad the standards aren't more
>> sensible.
>
> I can't speak to the standard and it's reasons for doing this, but there
> are certainly some whacko things in the standard.
>
> Thanks for the response.
>
> --
> Bill Moran
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone
John McKown
On 2/13/15 1:48 PM, Jeff Janes wrote: > I waste an inordinate amount of time retyping select lists over into the > group by list, or copying and pasting and then deleting the aggregate > clauses. It is an entirely pointless exercise. I can't fault > PostgreSQL for following the standard, but its too bad the standards > aren't more sensible. An extension like GROUP BY ALL might be useful, without breaking much. Also note that you can group by primary key only.