Re: BUG #3979: SELECT DISTINCT slow even on indexed column - Mailing list pgsql-bugs

From	Jeff Davis
Subject	Re: BUG #3979: SELECT DISTINCT slow even on indexed column
Date	February 21, 2008 20:37:51
Msg-id	1203640662.7878.33.camel@dogma.ljc.laika.com
In response to	BUG #3979: SELECT DISTINCT slow even on indexed column ("David Lee" <david_lee@bigfix.com>)
List	pgsql-bugs

On Thu, 2008-02-21 at 23:34 +0000, David Lee wrote:
> Finally, I ran:
>  SELECT a, b FROM x GROUP BY a, b;
>
> But it was still the same.
>
> Next I created an index on ("a") and ran the query:
>  SELECT DISTINCT a FROM x
>
> but the same thing happened (first didn't use the index; after turning
> seq-scan off, was still slow; tried using GROUP BY, still slow).
>
> The columns "a" and "b" are NOT NULL and have 100 distinct values each. The
> indexes are all btree indexes.

If there are only 100 distinct values each, then that's only (at most)
10k distinct (a,b) pairs.

To me it sounds like it would be most efficient to use a HashAggregate,
which can only be used by the "GROUP BY" variant of the query you ran
(DISTINCT can't use that plan).

First, try to force a HashAggregate and see what the results are. If
that is faster, the planner is not choosing the right plan. Try ANALYZE
to update the statistics, and if that doesn't work, post EXPLAIN
results.
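
As a rough sketch of one way to run that test, assuming the table x(a, b)
from the report: the settings below are standard planner switches, but
whether they are enough to produce a HashAggregate on this data is an
assumption, so check the plan that EXPLAIN ANALYZE prints.

```sql
-- Refresh planner statistics for the table from the report.
ANALYZE x;

BEGIN;
-- Discourage sort-based grouping for this transaction only, which
-- pushes the planner toward a HashAggregate if one is possible.
SET LOCAL enable_sort = off;
-- On by default; stated here so the test does not depend on local config.
SET LOCAL enable_hashagg = on;

-- EXPLAIN ANALYZE executes the query and reports the plan with timings.
EXPLAIN ANALYZE SELECT a, b FROM x GROUP BY a, b;
ROLLBACK;
```

If the plan still groups over an index scan rather than hashing, setting
enable_indexscan to off in the same transaction may be worth trying. SET
LOCAL keeps the changes from leaking into the rest of the session.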

Also, this post is somewhat off-topic for -bugs; try posting this type of
question to -general or -performance.

Regards,
    Jeff Davis
