Thread: BUG #14011: select count(distinct column) does not utilizes indices to improve performance
BUG #14011: select count(distinct column) does not utilizes indices to improve performance
From
gtakahashi@palantir.com
Date:
The following bug has been logged on the website: Bug reference: 14011 Logged by: Glen Takahashi Email address: gtakahashi@palantir.com PostgreSQL version: 9.3.6 Operating system: Red Hat Enterprise Linux 6.7 Description: Doing a select count(distinct column) does not utilize indices and takes a very long time to compute. For example Create table test_table (number integer); Create index on test_table (number); Insert into test_table (number) select generate_series(1,100000); (I did this ~500 times to replicate the scale of data we have) Select count(distinct number) from test_table; Count 100000 Time: 63419.600ms Select count(1) from (select distinct number from test_table) a; Count 100000 Time: 10743.186 ms
Re: BUG #14011: select count(distinct column) does not utilizes indices to improve performance
From
Tom Lane
Date:
gtakahashi@palantir.com writes: > Doing a select count(distinct column) does not utilize indices Nope, it does not. Don't hold your breath waiting for that to change; it might sometime, but nobody is working on it that I know of. You'd have better luck with a "SELECT count(*) GROUP BY column" type of query, I expect. (Most likely that won't use an index either, BTW, but it'll be smarter than count(distinct).) regards, tom lane