Thread: Re: [HACKERS] No: implied sort with group by

Re: [HACKERS] No: implied sort with group by

From
darrenk@insightdist.com (Darren King)
Date:
>
> > Does the SQL standard say anything about an implied sort when
> > grouping or is it up to the user to include an ORDER BY clause?
> > darrenk
>
> Without order by the result set is never guaranteed to be ordered in a
> specific way (standard speak). The order is dependent on the chosen
> query path, which changes from query to query.
>
> Andreas

Postgres should then do an internal sort before grouping.  In the second
of your examples, I take the above to mean that either row could be
returned first.

In order to get that result set though, the data needs to be sorted before
getting to the group by node in the executor.  The order of that internal
sort is purely arbitrary, it just has to be done.

This is what I think is missing or broken right now.

> > select * from t1;
>          a b  c
>          1    x
>          2    x
>          3    z
>          2    x
>
> 4 row(s) retrieved.
> > select b,c,sum(a) from t1 group by b,c;
> b  c             (sum)
>
>    x                 5
>    z                 3
>
> 2 row(s) retrieved.

darrenk



Re: [HACKERS] No: implied sort with group by

From
"Thomas G. Lockhart"
Date:
> > > Does the SQL standard say anything about an implied sort when
> > > grouping or is it up to the user to include an ORDER BY clause?

Up to the user. SQL is a set-oriented language. The fact that many/most/all
implementations order results to then do grouping is an implementation
detail, not a language definition.


> This is what I think is missing or broken right now.
>
> > > select * from t1;
> >          a b  c
> >          1    x
> >          2    x
> >          3    z
> >          2    x
> >
> > 4 row(s) retrieved.
> > > select b,c,sum(a) from t1 group by b,c;
> > b  c             (sum)
> >
> >    x                 5
> >    z                 3
> >> 2 row(s) retrieved.

Sorry, I've lost the thread. What is broken? I get this same result, and
(assuming that column "b" is full of nulls) I think this the correct result.

                                                        - Tom


Re: [HACKERS] No: implied sort with group by

From
Bruce Momjian
Date:
>
> > > > Does the SQL standard say anything about an implied sort when
> > > > grouping or is it up to the user to include an ORDER BY clause?
>
> Up to the user. SQL is a set-oriented language. The fact that many/most/all
> implementations order results to then do grouping is an implementation
> detail, not a language definition.
>
>
> > This is what I think is missing or broken right now.
> >
> > > > select * from t1;
> > >          a b  c
> > >          1    x
> > >          2    x
> > >          3    z
> > >          2    x
> > >
> > > 4 row(s) retrieved.
> > > > select b,c,sum(a) from t1 group by b,c;
> > > b  c             (sum)
> > >
> > >    x                 5
> > >    z                 3
> > >> 2 row(s) retrieved.
>
> Sorry, I've lost the thread. What is broken? I get this same result, and
> (assuming that column "b" is full of nulls) I think this the correct result.

At one point, it was thought that NULLs shouldn't be grouped, but I
backed out the patch.  There is a problem with GROUP BY on large
datasets, and Vadim knows the cause, and will work on it later.

--
Bruce Momjian
maillist@candle.pha.pa.us