Re: How slow is DISTINCT? - Mailing list pgsql-sql

From Josh Berkus
Subject Re: How slow is DISTINCT?
Date
Msg-id web-812107@davinci.ethosmedia.com
Whole thread Raw
In response to How slow is DISTINCT?  (Wei Weng <wweng@kencast.com>)
List pgsql-sql
Wei Wang,

> How exactly slow is DISTINCT being processed in SQL engines? (not
> limited to postgresql, though comments on postgresql would be most
>  relevant)

I can only give you a relative result, based exlusively on my anecdotalexperience with 7.1:

Fast:  SELECT ...
Slower: SELECT ... GROUP BY x,y,z    or:  SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
SLowest: SELECT DISTINCT ...

The reason for this is that SELECT DISTINCT is effectively a GROUP BYon all result fields of the query, and if a few of
thearen't indexedthat requires a seq scan.
 

If performance is an issue, you may wish to consider restructuring yourqueries and/or data model to eliminate the
actualduplicate rows.
 

-Josh


pgsql-sql by date:

Previous
From: "postgresql"
Date:
Subject: Re: Large Objects
Next
From: "Dan Langille"
Date:
Subject: Re: How slow is DISTINCT?