BUG #3979: SELECT DISTINCT slow even on indexed column - Mailing list pgsql-bugs

From David Lee
Subject BUG #3979: SELECT DISTINCT slow even on indexed column
Date
Msg-id 200802212334.m1LNYCHn084400@wwwmaster.postgresql.org
Whole thread Raw
Responses Re: BUG #3979: SELECT DISTINCT slow even on indexed column  (Jeff Davis <pgsql@j-davis.com>)
Re: BUG #3979: SELECT DISTINCT slow even on indexed column  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      3979
Logged by:          David Lee
Email address:      david_lee@bigfix.com
PostgreSQL version: 8.2.6
Operating system:   Ubuntu Feisty Server
Description:        SELECT DISTINCT slow even on indexed column
Details:

\d x:

   Column     |            Type             | Modifiers
--------------+-----------------------------+-----------
 a            | integer                     | not null
 b            | integer                     | not null
 time         | timestamp without time zone | not null
 remote_time  | timestamp without time zone | not null
 ip           | inet                        | not null

The table has 20 million rows.

The table "x" has an index on ("a", "b").

I first tried:
 SELECT DISTINCT a, b FROM x

but it was so slow.

I ran EXPLAIN and it showed that the path did not use the index, so I ran:

 SET enable_seqscan = off;

and ran the query again.

Although it used the index, the query was still very slow.

Finally, I ran:
 SELECT a, b FROM x GROUP BY a, b;

But it was still the same.

Next I created an index on ("a") and ran the query:
 SELECT DISTINCT a FROM x

but the same thing happened (first didn't use the index; after turning
seq-scan off, was still slow; tried using GROUP BY, still slow).

The columns "a" and "b" are NOT NULL and has 100 distinct values each. The
indexes are all btree indexes.

pgsql-bugs by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Incomplete docs for restore_command for hot standby
Next
From: Jeff Davis
Date:
Subject: Re: BUG #3979: SELECT DISTINCT slow even on indexed column