Speeding up select distinct - Mailing list pgsql-performance

From	Laurent Martelli
Subject	Speeding up select distinct
Date	March 16, 2005 17:59:01
Msg-id	87k6o74gms.fsf@stan.aopsys
Responses	Re: Speeding up select distinct Re: Speeding up select distinct
List	pgsql-performance

Consider this query:

SELECT DISTINCT owner FROM pictures;

 Unique  (cost=361.18..382.53 rows=21 width=4) (actual time=14.197..17.639 rows=21 loops=1)
   ->  Sort  (cost=361.18..371.86 rows=4270 width=4) (actual time=14.188..15.450 rows=4270 loops=1)
         Sort Key: "owner"
         ->  Seq Scan on pictures  (cost=0.00..103.70 rows=4270 width=4) (actual time=0.012..5.795 rows=4270 loops=1)
 Total runtime: 19.147 ms

I thought that 19 ms was a lot to return 21 rows out of a 4270-row
table, so I added an index:

CREATE INDEX pictures_owner ON pictures (owner);

It gives a slight improvement:

 Unique  (cost=0.00..243.95 rows=21 width=4) (actual time=0.024..10.293 rows=21 loops=1)
   ->  Index Scan using pictures_owner on pictures  (cost=0.00..233.27 rows=4270 width=4) (actual time=0.022..8.227 rows=4270 loops=1)
 Total runtime: 10.369 ms

But still, 10 ms is a lot for 21 rows. I looked at other types of
indexes, but they seem either to give no better performance or to be
irrelevant.

Any ideas, apart from more or less manually maintaining a list of
distinct owners in another table?
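
For reference, the manual approach mentioned above might look roughly
like the sketch below. It is only an illustration under assumptions:
the table picture_owners, the function picture_owners_add and the
trigger pictures_track_owner are made-up names, owner is assumed to be
an integer (the plans show width=4), and PL/pgSQL is assumed to be
installed in the database.

```sql
-- Hypothetical side table holding one row per distinct owner.
-- Assumption: pictures.owner is an integer column.
CREATE TABLE picture_owners (
    owner integer PRIMARY KEY
);

-- Seed the side table once from the existing data.
INSERT INTO picture_owners (owner)
SELECT DISTINCT owner FROM pictures WHERE owner IS NOT NULL;

-- Trigger function: record an owner the first time it shows up
-- in pictures. Already-known owners are skipped.
CREATE FUNCTION picture_owners_add() RETURNS trigger AS $$
BEGIN
    IF NEW.owner IS NOT NULL THEN
        INSERT INTO picture_owners (owner)
        SELECT NEW.owner
        WHERE NOT EXISTS (
            SELECT 1 FROM picture_owners WHERE owner = NEW.owner
        );
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fire the function for every inserted or updated picture row.
CREATE TRIGGER pictures_track_owner
    AFTER INSERT OR UPDATE ON pictures
    FOR EACH ROW EXECUTE PROCEDURE picture_owners_add();

-- The distinct list is then a scan of the small side table.
SELECT owner FROM picture_owners;
```

This sketch never removes an owner whose last picture is deleted or
reassigned, and two concurrent transactions inserting the same new
owner could collide on the primary key, so it would need more work
before real use.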

--
Laurent Martelli
laurent@aopsys.com                                Java Aspect Components
http://www.aopsys.com/                          http://jac.objectweb.org
