Speeding up select distinct - Mailing list pgsql-performance

From Laurent Martelli
Subject Speeding up select distinct
Msg-id 87k6o74gms.fsf@stan.aopsys
List pgsql-performance
Consider this query:

SELECT distinct owner from pictures;

 Unique  (cost=361.18..382.53 rows=21 width=4) (actual time=14.197..17.639 rows=21 loops=1)
   ->  Sort  (cost=361.18..371.86 rows=4270 width=4) (actual time=14.188..15.450 rows=4270 loops=1)
         Sort Key: "owner"
         ->  Seq Scan on pictures  (cost=0.00..103.70 rows=4270 width=4) (actual time=0.012..5.795 rows=4270 loops=1)
 Total runtime: 19.147 ms

I thought that 19ms was a lot to return 20 rows out of a 4000-row table,
so I added an index:

CREATE INDEX pictures_owner ON pictures (owner);

It gives a slight improvement:

 Unique  (cost=0.00..243.95 rows=21 width=4) (actual time=0.024..10.293 rows=21 loops=1)
   ->  Index Scan using pictures_owner on pictures  (cost=0.00..233.27 rows=4270 width=4) (actual time=0.022..8.227 rows=4270 loops=1)
 Total runtime: 10.369 ms

But still, it's a lot for 20 rows. I looked at other types of indexes,
but they seem either to give no better performance or to be irrelevant.

Any ideas, apart from more or less manually maintaining a list of
distinct owners in another table?
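
For illustration, one workaround sometimes suggested for this pattern is to
emulate a "loose index scan" with a recursive query, which steps from one
distinct owner to the next through the pictures_owner index instead of
reading all 4270 index entries. This is only a sketch under assumptions: it
needs a PostgreSQL version that supports WITH RECURSIVE (8.4 or later), it
was not used to produce the plans above, and whether the planner actually
picks the index for each step should be checked with EXPLAIN ANALYZE.

```sql
-- Sketch: emulate a loose index scan over pictures(owner).
-- Assumes PostgreSQL 8.4+ (WITH RECURSIVE) and the pictures_owner index.
WITH RECURSIVE t AS (
    -- Start with the smallest owner value.
    (SELECT owner FROM pictures ORDER BY owner LIMIT 1)
    UNION ALL
    -- Each step fetches the next owner strictly greater than the last one.
    -- The subquery yields NULL once no greater owner exists, which stops
    -- the recursion through the WHERE clause below.
    SELECT (SELECT owner
            FROM pictures
            WHERE owner > t.owner
            ORDER BY owner
            LIMIT 1)
    FROM t
    WHERE t.owner IS NOT NULL
)
-- Drop the terminating NULL row. Rows whose owner is NULL are not reported.
SELECT owner FROM t WHERE owner IS NOT NULL;
```

With about 20 distinct owners this would perform roughly 20 short index
probes rather than one pass over every row, so it should only help when
the number of distinct values is small compared to the table size.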

--
Laurent Martelli
laurent@aopsys.com                                Java Aspect Components
http://www.aopsys.com/                          http://jac.objectweb.org

