Thread: select distinct

select distinct

From
"Timothy H. Keitt"
Date:
I've noticed that adding an index on a column has no effect on "select
distinct" (index or no index, the entire table is sequence scanned; see
example below).  Couldn't you just traverse the index to get the
distinct values?

Tim

keitt=# vacuum analyze test;
VACUUM
keitt=# select distinct a from test;a 
---abcdfsz
(7 rows)

keitt=# explain select distinct a from test;
NOTICE:  QUERY PLAN:

Unique  (cost=1.67..1.72 rows=2 width=12) ->  Sort  (cost=1.67..1.67 rows=21 width=12)       ->  Seq Scan on test
(cost=0.00..1.21rows=21 width=12)
 

EXPLAIN
keitt=# create index test_index on test (a);
CREATE
keitt=# explain select distinct a from test;
NOTICE:  QUERY PLAN:

Unique  (cost=1.67..1.72 rows=2 width=12) ->  Sort  (cost=1.67..1.67 rows=21 width=12)       ->  Seq Scan on test
(cost=0.00..1.21rows=21 width=12)
 

EXPLAIN


-- 
Timothy H. Keitt
National Center for Ecological Analysis and Synthesis
735 State Street, Suite 300, Santa Barbara, CA 93101
Phone: 805-892-2519, FAX: 805-892-2510
http://www.nceas.ucsb.edu/~keitt/


Re: select distinct

From
Tom Lane
Date:
"Timothy H. Keitt" <keitt@nceas.ucsb.edu> writes:
> Couldn't you just traverse the index to get the distinct values?

No, because the index doesn't contain commit status; you can't tell
which values are actually valid without visiting the main table.

7.0 does consider both indexscan | unique and seqscan | sort | unique
plans for this problem, but the explicit sort is usually faster
(according to the system's cost models, anyway).
        regards, tom lane