Re: select distinct and index usage - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: select distinct and index usage
Date
Msg-id 20080408010124.GG5095@alvh.no-ip.org
Whole thread Raw
In response to Re: select distinct and index usage  ("David Wilson" <david.t.wilson@gmail.com>)
List pgsql-general
David Wilson escribió:

> explain analyze select datestamp from vals group by datestamp;
>                                                              QUERY
> PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=1719740.40..1719783.03 rows=4263 width=4)
> (actual time=120192.018..120193.930 rows=4252 loops=1)
>    ->  Seq Scan on vals  (cost=0.00..1531261.72 rows=75391472 width=4)
> (actual time=17.441..66807.429 rows=75391476 loops=1)
>  Total runtime: 120195.144 ms

> Still doing the sequential scan on the table, but at least it's
> avoiding the expensive disk merge sort. It still seems as if I ought
> to be able to coax it into using an index for this type of query,
> though- especially since it's using one on the other table. Is there
> perhaps some way to reformulate the index in such a way as to make it
> more useful to the planner?

Hmm, why do you think an indexscan would be faster?  Since there's no
sort step involved, a seqscan as input for the HashAggregate is actually
better than an indexscan, because there's no need for the index entries
at all.

If you want to test, try SET enable_seqscan TO 0 and then rerun the
explain analyze.  My bet is that it will use the index, and it will take
longer.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

pgsql-general by date:

Previous
From: "David Wilson"
Date:
Subject: Re: select distinct and index usage
Next
From: Colin Wetherbee
Date:
Subject: Re: Most Occurring Value