Thread: SELECT, GROUP BY, and aggregates

SELECT, GROUP BY, and aggregates

From
Ryan Delaney
Date:
Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP
BY as implicitly grouping by all the columns that aren't part of an aggregate?

If I do this, Postgres throws an exception that I cannot SELECT a series of
columns including an aggregate without a corresponding GROUP BY clause. But it
knew to throw the error, right?  It must have some method of knowing which
columns aren't part of an aggregate.  Or is it that a column might not have an
aggregate, but still be hard to figure out how to group by it?

But how would that happen?

If I omit something from GROUP BY, it throws another exception. If I put
something there that doesn't belong, I get a different exception. So it already
knows how to do this! :P
--
Regards,
Ryan Delaney
ryan.delaney@gmail.com
https://github.com/rpdelaney
GPG ID: 4096R/311C 10F2 26E0 14E3 8BA4  3B06 B634 36F1 C9E7 771B

Attachment

Re: SELECT, GROUP BY, and aggregates

From
Tom Lane
Date:
Ryan Delaney <ryan.delaney@gmail.com> writes:
> Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP
> BY as implicitly grouping by all the columns that aren't part of an aggregate?

Per SQL standard, a SELECT with aggregates but no GROUP BY is supposed to
give exactly one row.  What you suggest would not do that.

In general we're not that much into assigning made-up semantics to cases
that are specifically disallowed by the spec.  It's usually not too
obvious what the result "should be", so we run the risk that the SQL
committee might someday make a contrary decision.  More, this would lose
error detection, and reduce interoperability with other DBMSes that follow
the spec more faithfully.

            regards, tom lane


Re: SELECT, GROUP BY, and aggregates

From
Bill Moran
Date:
> Ryan Delaney <ryan.delaney@gmail.com> writes:
> > Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP
> > BY as implicitly grouping by all the columns that aren't part of an aggregate?

I'm Mr. Curious today ...

Why would you think that such a thing is necessary or desirable? Simply add the
columns to the GROUP BY clause and make the request unambiguous.

--
Bill Moran


Re: SELECT, GROUP BY, and aggregates

From
Jeff Janes
Date:
On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com> wrote:

> Ryan Delaney <ryan.delaney@gmail.com> writes:
> > Why couldn't an RDBMS such as postgres interpret a SELECT that omits the GROUP
> > BY as implicitly grouping by all the columns that aren't part of an aggregate?

I'm Mr. Curious today ...

Why would you think that such a thing is necessary or desirable? Simply add the
columns to the GROUP BY clause and make the request unambiguous.


Where would the ambiguity be?

I waste an inordinate amount of time retyping select lists over into the group by list, or copying and pasting and then deleting the aggregate clauses.  It is an entirely pointless exercise.  I can't fault PostgreSQL for following the standard, but its too bad the standards aren't more sensible.

Cheers,

Jeff

Re: SELECT, GROUP BY, and aggregates

From
Bill Moran
Date:
On Fri, 13 Feb 2015 10:48:13 -0800
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com>
> wrote:
>
> > > Ryan Delaney <ryan.delaney@gmail.com> writes:
> > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits
> > the GROUP
> > > > BY as implicitly grouping by all the columns that aren't part of an
> > aggregate?
> >
> > I'm Mr. Curious today ...
> >
> > Why would you think that such a thing is necessary or desirable? Simply
> > add the
> > columns to the GROUP BY clause and make the request unambiguous.
>
> Where would the ambiguity be?

With a large, complex query, trying to visually read through a list of
column selections to figure out which ones _aren't_ aggregated and will
be auto-GROUP-BYed would be ... tedious and error prone at best.

You're right, though, it wouldn't be "ambiguous" ... that was a poor
choice of words on my part.

> I waste an inordinate amount of time retyping select lists over into the
> group by list, or copying and pasting and then deleting the aggregate
> clauses.

Interesting ... I've never kept accurate track of the time I spend doing
things like that, but "inordinate" seems like quite a lot.

In my case, I'm a developer so I would tend toward creating code on the
client side that automatically compiled the GROUP BY clause if I found
that scenarios like you describe were happening frequently. Of course,
that doesn't help a data anaylyst who's just writing queries

> It is an entirely pointless exercise.  I can't fault PostgreSQL
> for following the standard, but its too bad the standards aren't more
> sensible.

I can't speak to the standard and it's reasons for doing this, but there
are certainly some whacko things in the standard.

Thanks for the response.

--
Bill Moran


Re: SELECT, GROUP BY, and aggregates

From
Brian Dunavant
Date:
To lower the amount of time spent copy pasting aggregate column names,
it's probably worth noting Postgres will allow you to short cut that
with the column position.  For example:

select long_column_name_A, long_column_name_b, count(1)
from foo
group by 1,2
order by 1,2

This works just fine.  It's not in the spec, but postgres supports it.
I'll leave it to others to argue about it being a best practice or
not.


On Fri, Feb 13, 2015 at 1:57 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Fri, 13 Feb 2015 10:48:13 -0800
> Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com>
>> wrote:
>>
>> > > Ryan Delaney <ryan.delaney@gmail.com> writes:
>> > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits
>> > the GROUP
>> > > > BY as implicitly grouping by all the columns that aren't part of an
>> > aggregate?
>> >
>> > I'm Mr. Curious today ...
>> >
>> > Why would you think that such a thing is necessary or desirable? Simply
>> > add the
>> > columns to the GROUP BY clause and make the request unambiguous.
>>
>> Where would the ambiguity be?
>
> With a large, complex query, trying to visually read through a list of
> column selections to figure out which ones _aren't_ aggregated and will
> be auto-GROUP-BYed would be ... tedious and error prone at best.
>
> You're right, though, it wouldn't be "ambiguous" ... that was a poor
> choice of words on my part.
>
>> I waste an inordinate amount of time retyping select lists over into the
>> group by list, or copying and pasting and then deleting the aggregate
>> clauses.
>
> Interesting ... I've never kept accurate track of the time I spend doing
> things like that, but "inordinate" seems like quite a lot.
>
> In my case, I'm a developer so I would tend toward creating code on the
> client side that automatically compiled the GROUP BY clause if I found
> that scenarios like you describe were happening frequently. Of course,
> that doesn't help a data anaylyst who's just writing queries
>
>> It is an entirely pointless exercise.  I can't fault PostgreSQL
>> for following the standard, but its too bad the standards aren't more
>> sensible.
>
> I can't speak to the standard and it's reasons for doing this, but there
> are certainly some whacko things in the standard.
>
> Thanks for the response.
>
> --
> Bill Moran
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: SELECT, GROUP BY, and aggregates

From
Igor Neyman
Date:

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Dunavant
Sent: Friday, February 13, 2015 2:11 PM
To: Bill Moran
Cc: Jeff Janes; Ryan Delaney; pgsql-general@postgresql.org
Subject: Re: [GENERAL] SELECT, GROUP BY, and aggregates

To lower the amount of time spent copy pasting aggregate column names, it's probably worth noting Postgres will allow
youto short cut that with the column position.  For example:
 

select long_column_name_A, long_column_name_b, count(1) from foo group by 1,2 order by 1,2

This works just fine.  It's not in the spec, but postgres supports it.
I'll leave it to others to argue about it being a best practice or not.

-----------------------

I use this feature a lot.

Igor Neyman

Re: SELECT, GROUP BY, and aggregates

From
John McKown
Date:
I will take a bit of a contrarian position from the OP. I, personally, prefer that computer _languages_ do exactly and only what _I_ tell them to do. I do _NOT_ want them to things for me. IMO, that is why many programs are unreliable. They make an assumption which is not what the original programmer really wanted. Of course, I _do_ like having an good IDE which will help me with suggestions which are based on what I have already typed in and what else is possible. In the OP's case, that would be something which would look at the SQL I have already typed in, and have a "highlighted" set of column names in the GROUP BY based on the names in the SELECT. This would complicate the editor, but (again in my opinion) this is where the help should be available. Basically, I want the _application_ programmer to be responsible for the SELECT, not the SQL engine programmer.

On Fri, Feb 13, 2015 at 1:11 PM, Brian Dunavant <brian@omniti.com> wrote:
To lower the amount of time spent copy pasting aggregate column names,
it's probably worth noting Postgres will allow you to short cut that
with the column position.  For example:

select long_column_name_A, long_column_name_b, count(1)
from foo
group by 1,2
order by 1,2

This works just fine.  It's not in the spec, but postgres supports it.
I'll leave it to others to argue about it being a best practice or
not.


On Fri, Feb 13, 2015 at 1:57 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> On Fri, 13 Feb 2015 10:48:13 -0800
> Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> On Fri, Feb 13, 2015 at 10:26 AM, Bill Moran <wmoran@potentialtech.com>
>> wrote:
>>
>> > > Ryan Delaney <ryan.delaney@gmail.com> writes:
>> > > > Why couldn't an RDBMS such as postgres interpret a SELECT that omits
>> > the GROUP
>> > > > BY as implicitly grouping by all the columns that aren't part of an
>> > aggregate?
>> >
>> > I'm Mr. Curious today ...
>> >
>> > Why would you think that such a thing is necessary or desirable? Simply
>> > add the
>> > columns to the GROUP BY clause and make the request unambiguous.
>>
>> Where would the ambiguity be?
>
> With a large, complex query, trying to visually read through a list of
> column selections to figure out which ones _aren't_ aggregated and will
> be auto-GROUP-BYed would be ... tedious and error prone at best.
>
> You're right, though, it wouldn't be "ambiguous" ... that was a poor
> choice of words on my part.
>
>> I waste an inordinate amount of time retyping select lists over into the
>> group by list, or copying and pasting and then deleting the aggregate
>> clauses.
>
> Interesting ... I've never kept accurate track of the time I spend doing
> things like that, but "inordinate" seems like quite a lot.
>
> In my case, I'm a developer so I would tend toward creating code on the
> client side that automatically compiled the GROUP BY clause if I found
> that scenarios like you describe were happening frequently. Of course,
> that doesn't help a data anaylyst who's just writing queries
>
>> It is an entirely pointless exercise.  I can't fault PostgreSQL
>> for following the standard, but its too bad the standards aren't more
>> sensible.
>
> I can't speak to the standard and it's reasons for doing this, but there
> are certainly some whacko things in the standard.
>
> Thanks for the response.
>
> --
> Bill Moran
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown

Re: SELECT, GROUP BY, and aggregates

From
Peter Eisentraut
Date:
On 2/13/15 1:48 PM, Jeff Janes wrote:
> I waste an inordinate amount of time retyping select lists over into the
> group by list, or copying and pasting and then deleting the aggregate
> clauses.  It is an entirely pointless exercise.  I can't fault
> PostgreSQL for following the standard, but its too bad the standards
> aren't more sensible.

An extension like GROUP BY ALL might be useful, without breaking much.

Also note that you can group by primary key only.