Re: GROUP BY vs DISTINCT - Mailing list pgsql-performance

From Peter Childs
Subject Re: GROUP BY vs DISTINCT
Msg-id a2de01dd0612200316g92cc189jf3369ccedf1b8c12@mail.gmail.com
In response to Re: GROUP BY vs DISTINCT  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
Responses Re: GROUP BY vs DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On 20/12/06, Steinar H. Gunderson <sgunderson@bigfoot.com> wrote:
> On Tue, Dec 19, 2006 at 11:19:39PM -0800, Brian Herlihy wrote:
> > Actually, I think I answered my own question already.  But I want to
> > confirm - Is the GROUP BY faster because it doesn't have to sort results,
> > whereas DISTINCT must produce sorted results?  This wasn't clear to me from
> > the documentation.  If it's true, then I could save considerable time by
> > using GROUP BY where I have been using DISTINCT in the past.  Usually I
> > simply want a count of the distinct values, and there is no need to sort
> > for that.
>
> You are right; at the moment, GROUP BY is more intelligent than DISTINCT,
> even if they have to compare the same columns. This is, as always, something
> that could be improved in a future release, TTBOMK.
>
> /* Steinar */

Oh, so that's why GROUP BY is nearly always quicker than DISTINCT. I
always thought DISTINCT was just shorthand for "GROUP BY the same columns
as I've just selected".
Is it actually in the SQL spec that DISTINCT must sort, or could we just
get the parser to rewrite DISTINCT into GROUP BY and hence remove the
extra code that a second way of doing the same thing must entail?
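
To make the "shorthand" idea concrete, here is a minimal sketch of the two
query forms side by side. The table "visits" and its rows are made up for
illustration, and SQLite (via Python's sqlite3 module) stands in for
PostgreSQL, so this only shows that both forms return the same set of rows
and the same count; it says nothing about what the planner does or which
form is faster.

```python
import sqlite3

# Hypothetical table and data, for illustration only. SQLite is a stand-in
# for PostgreSQL here, so this demonstrates result equivalence, not speed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (site TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("a", "uk"), ("a", "uk"), ("b", "no"), ("a", "au"), ("b", "no")],
)

# The DISTINCT form.
distinct_rows = conn.execute(
    "SELECT DISTINCT site, country FROM visits"
).fetchall()

# The GROUP BY form, grouping on the same columns that are selected.
grouped_rows = conn.execute(
    "SELECT site, country FROM visits GROUP BY site, country"
).fetchall()

# Row order is not guaranteed by either query, so compare as sets.
same_rows = set(distinct_rows) == set(grouped_rows)

# Counting the distinct values, which is the use case raised in the thread.
count_distinct = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT site, country FROM visits) AS d"
).fetchone()[0]
count_grouped = conn.execute(
    "SELECT count(*) FROM "
    "(SELECT site, country FROM visits GROUP BY site, country) AS g"
).fetchone()[0]

print(same_rows, count_distinct, count_grouped)
```

Neither query has an ORDER BY, so any ordering in the output is incidental
to how the rows were produced.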

Peter.
