Thread: way to speed up a SELECT DISTINCT?

way to speed up a SELECT DISTINCT?

From

Seth Ladd

Date:

10 October 2003, 06:42:30

Hello,

I am running 7.3.2 RPMs on RH9, on a celeron 1.7 w/ 1gig ram.

I have a table that has 6.9 million rows, 2 columns, and an index on
each column.  When I run:

SELECT DISTINCT column1 FROM table

It is very, very slow (10-15 min to complete).  An EXPLAIN shows no
indexes are being used.

Is there any way to speed this up, or is that DISTINCT going to keep
hounding me?

I checked the mailing list, and didn't see anything like this.

Any tips or hints would be greatly appreciated.  Thanks for your help!
Seth

Re: way to speed up a SELECT DISTINCT?

From

Peter Childs

Date:

10 October 2003, 07:07:53

On Thu, 9 Oct 2003, Seth Ladd wrote:

> Hello,
>
> I am running 7.3.2 RPMs on RH9, on a celeron 1.7 w/ 1gig ram.
>
> I have a table that has 6.9 million rows, 2 columns, and an index on
> each column.  When I run:
>
> SELECT DISTINCT column1 FROM table
>
> It is very, very slow (10-15 min to complete).  An EXPLAIN shows no
> indexes are being used.
>
> Is there any way to speed this up, or is that DISTINCT going to keep
> hounding me?
>
> I checked the mailing list, and didn't see anything like this.
>
> Any tips or hints would be greatly appreciated.  Thanks for your help!
> Seth
>
>
    Try group by instead. I think this is an old bug its fixed in
7.3.2 which I'm using.

Peter Childs
`


peter@bernardo:express=# explain select distinct region from region;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Unique  (cost=0.00..4326.95 rows=9518 width=14)
   ->  Index Scan using regionview_region on region  (cost=0.00..4089.00
rows=95183 width=14)
(2 rows)

peter@bernardo:express=# explain select distinct region from region group
by region;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..4350.75 rows=952 width=14)
   ->  Group  (cost=0.00..4326.95 rows=9518 width=14)
         ->  Index Scan using regionview_region on region
(cost=0.00..4089.00 rows=95183 width=14)
(3 rows)

Re: way to speed up a SELECT DISTINCT?

From

Seth Ladd

Date:

10 October 2003, 07:51:40

>> Is there any way to speed this up, or is that DISTINCT going to keep
>> hounding me?
>>
>> I checked the mailing list, and didn't see anything like this.
>>
>> Any tips or hints would be greatly appreciated.  Thanks for your help!
>> Seth
>>
>>
>     Try group by instead. I think this is an old bug its fixed in
> 7.3.2 which I'm using.
>
> Peter Childs
> `
>
>
> peter@bernardo:express=# explain select distinct region from region;
>                                           QUERY PLAN
> -----------------------------------------------------------------------
> -----------------------
>  Unique  (cost=0.00..4326.95 rows=9518 width=14)
>    ->  Index Scan using regionview_region on region
> (cost=0.00..4089.00
> rows=95183 width=14)
> (2 rows)

Thanks for the tip, I'll give this a shot soon.  I am curious, your
example above does not use GROUP BY yet you have an INDEX SCAN.  I am
using a similar query, yet I get a full table scan.  I wonder how they
are different?

I'll try the group by anyway.

Thanks,
Seth

Re: way to speed up a SELECT DISTINCT?

From

Shridhar Daithankar

Date:

10 October 2003, 08:01:23

Seth Ladd wrote:

>> peter@bernardo:express=# explain select distinct region from region;
>>                                           QUERY PLAN
>> -----------------------------------------------------------------------
>> -----------------------
>>  Unique  (cost=0.00..4326.95 rows=9518 width=14)
>>    ->  Index Scan using regionview_region on region   (cost=0.00..4089.00
>> rows=95183 width=14)
>> (2 rows)
>
>
> Thanks for the tip, I'll give this a shot soon.  I am curious, your
> example above does not use GROUP BY yet you have an INDEX SCAN.  I am
> using a similar query, yet I get a full table scan.  I wonder how they
> are different?

Have you tuned your shared buffers and effective cache correctly?

  Shridhar

Re: way to speed up a SELECT DISTINCT?

From

Christopher Kings-Lynne

Date:

10 October 2003, 08:06:24

> Thanks for the tip, I'll give this a shot soon.  I am curious, your
> example above does not use GROUP BY yet you have an INDEX SCAN.  I am
> using a similar query, yet I get a full table scan.  I wonder how they
> are different?

Please send us the results of EXPLAIN ANALYZE the query.  The EXPLAIN
results usually aren't too interesting for degenerate queries.

Also, make sure you have run ANALYZE on your database.

Chris

Re: way to speed up a SELECT DISTINCT?

From

Peter Childs

Date:

10 October 2003, 08:17:41

On Fri, 10 Oct 2003, Seth Ladd wrote:

> >> Is there any way to speed this up, or is that DISTINCT going to keep
> >> hounding me?
> >>
> >> I checked the mailing list, and didn't see anything like this.
> >>
> >> Any tips or hints would be greatly appreciated.  Thanks for your help!
> >> Seth
> >>
> >>
> >     Try group by instead. I think this is an old bug its fixed in
> > 7.3.2 which I'm using.
> >
> > Peter Childs
> > `
> >
> >
> > peter@bernardo:express=# explain select distinct region from region;
> >                                           QUERY PLAN
> > -----------------------------------------------------------------------
> > -----------------------
> >  Unique  (cost=0.00..4326.95 rows=9518 width=14)
> >    ->  Index Scan using regionview_region on region
> > (cost=0.00..4089.00
> > rows=95183 width=14)
> > (2 rows)
>
> Thanks for the tip, I'll give this a shot soon.  I am curious, your
> example above does not use GROUP BY yet you have an INDEX SCAN.  I am
> using a similar query, yet I get a full table scan.  I wonder how they
> are different?
>
> I'll try the group by anyway.
>
    Its a guess but ANALYSE might help. `

Peter Childs