Home > mailing lists

Re: How slow is DISTINCT? - Mailing list pgsql-sql

From	Josh Berkus
Subject	Re: How slow is DISTINCT?
Date	February 27, 2002 16:49:25
Msg-id	web-812107@davinci.ethosmedia.com Whole thread Raw
In response to	How slow is DISTINCT? (Wei Weng <wweng@kencast.com>)
List	pgsql-sql

Tree view

Wei Wang,

> How exactly slow is DISTINCT being processed in SQL engines? (not
> limited to postgresql, though comments on postgresql would be most
>  relevant)

I can only give you a relative result, based exlusively on my anecdotalexperience with 7.1:

Fast:  SELECT ...
Slower: SELECT ... GROUP BY x,y,z    or:  SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
SLowest: SELECT DISTINCT ...

The reason for this is that SELECT DISTINCT is effectively a GROUP BYon all result fields of the query, and if a few of
thearen't indexedthat requires a seq scan.
 

If performance is an issue, you may wish to consider restructuring yourqueries and/or data model to eliminate the
actualduplicate rows.
 

-Josh

pgsql-sql by date:

From: "postgresql"
Date: 27 February 2002, 16:15:42
Subject: Re: Large Objects

From: "Dan Langille"
Date: 27 February 2002, 17:10:25
Subject: Re: How slow is DISTINCT?

Re: How slow is DISTINCT? - Mailing list pgsql-sql

Previous

Next