Thread: Slow SELECT with distinct, in a TIMESTAMP type column

Slow SELECT with distinct, in a TIMESTAMP type column

From

Marcos Garcia

Date:

24 June 2002, 12:59:42

Hi,


I've a table "send_total" like this:
                                     Table "send_total" Column   |           Type           |
Modifiers
-----------+--------------------------+-------------------------------------------------------id        | integer
          | not null default 
nextval('"send_total_id_seq"'::text)smsdate   | timestamp with time zone | idclient  | integer                  | not
nullrecv     | integer                  | default 0send      | integer                  | default 0recv_conf | integer
               | default 0send_conf | integer                  | default 0conf0     | integer                  |
default0conf1     | integer                  | default 0conf2     | integer                  | default 0conf3     |
integer                 | default 0conf4     | integer                  | default 0conf5     | integer
|default 0conf6     | integer                  | default 0conf7     | integer                  | default 0 
Indexes: send_total_idclient_idx,        send_total_smsdate_idx
Primary key: send_total_pkey
Unique keys: send_total_idclismsdate
(idclient and smsdate must be unique)


The problem is that my query is too slow when i use distinct:

pgsql> explain analyze SELECT distinct(smsdate) FROM send_total;

NOTICE:  QUERY PLAN:

Unique  (cost=15840.31..15870.81 rows=1220 width=8) (actual
time=56358.93..56452.78 rows=6670 loops=1) ->  Sort  (cost=15840.31..15840.31 rows=12203 width=8) (actual
time=56358.91..56436.95 rows=12062 loops=1)       ->  Seq Scan on send_total  (cost=0.00..15012.03 rows=12203
width=8) (actual time=0.55..56205.25 rows=12062 loops=1)
Total runtime: 56518.26 msec


This query is becoming slower and slower, day by day.
What i have to do to optimize my query or database??? I've this database
working since January 2002.

I've tried to make the following command, but without success:

$ /usr/bin/vacuumdb -z -d dbname -t send_total



Thanks in advance,


M.P.Garcia





--
M.P.Garcia
PT Inovação, SA
Serviços e Redes Móveis
Rua José Ferreira Pinto Basto - 3810 Aveiro
Tel: 234 403 253  -  Fax: 234 424 160
E-mail: marcos-p-garcia@ptinovacao.pt

Re: Slow SELECT with distinct, in a TIMESTAMP type column

From

Stephan Szabo

Date:

24 June 2002, 18:50:22

On 24 Jun 2002, Marcos Garcia wrote:

>
> When i make the set enable_seqscan = off the result is:
>
> dbname=# set enable_seqscan = off;
> SET VARIABLE
> dbname=# explain analyze SELECT distinct(smsdate) FROM send_total;
>
> NOTICE:  QUERY PLAN:
>
> Unique  (cost=0.00..31084.39 rows=1232 width=8) (actual
> time=108.60..195210.91 rows=6676 loops=1)
>   ->  Index Scan using send_total_smsdate_idx on send_total
> (cost=0.00..31053.60 rows=12315 width=8) (actual time=108.59..195145.43
> rows=12087 loops=1)
> Total runtime: 195226.24 msec

Well the plan choice makes sense (since it's about 3.5x faster as
a sequence scan over the index scan).
It looks like most of the time is taken up by the actual scan (is
"select smsdate from send_total" pretty much as slow?).

Re: Slow SELECT with distinct, in a TIMESTAMP type column

From

Stephan Szabo

Date:

24 June 2002, 18:56:55

On 24 Jun 2002, Marcos Garcia wrote:

> The problem is that my query is too slow when i use distinct:
>
> pgsql> explain analyze SELECT distinct(smsdate) FROM send_total;
>
> NOTICE:  QUERY PLAN:
>
> Unique  (cost=15840.31..15870.81 rows=1220 width=8) (actual
> time=56358.93..56452.78 rows=6670 loops=1)
>   ->  Sort  (cost=15840.31..15840.31 rows=12203 width=8) (actual
> time=56358.91..56436.95 rows=12062 loops=1)
>         ->  Seq Scan on send_total  (cost=0.00..15012.03 rows=12203
> width=8) (actual time=0.55..56205.25 rows=12062 loops=1)
> Total runtime: 56518.26 msec

As a question, what does explain analyze show if you
set enable_seqscan=off; first?

Re: Slow SELECT with distinct, in a TIMESTAMP type column

From

Marcos Garcia

Date:

24 June 2002, 19:28:03

When i make the set enable_seqscan = off the result is:

dbname=# set enable_seqscan = off;
SET VARIABLE
dbname=# explain analyze SELECT distinct(smsdate) FROM send_total;

NOTICE:  QUERY PLAN:

Unique  (cost=0.00..31084.39 rows=1232 width=8) (actual
time=108.60..195210.91 rows=6676 loops=1) ->  Index Scan using send_total_smsdate_idx on send_total
(cost=0.00..31053.60 rows=12315 width=8) (actual time=108.59..195145.43
rows=12087 loops=1)
Total runtime: 195226.24 msec




On Mon, 2002-06-24 at 19:36, Stephan Szabo wrote:
>
> On 24 Jun 2002, Marcos Garcia wrote:
>
> > The problem is that my query is too slow when i use distinct:
> >
> > pgsql> explain analyze SELECT distinct(smsdate) FROM send_total;
> >
> > NOTICE:  QUERY PLAN:
> >
> > Unique  (cost=15840.31..15870.81 rows=1220 width=8) (actual
> > time=56358.93..56452.78 rows=6670 loops=1)
> >   ->  Sort  (cost=15840.31..15840.31 rows=12203 width=8) (actual
> > time=56358.91..56436.95 rows=12062 loops=1)
> >         ->  Seq Scan on send_total  (cost=0.00..15012.03 rows=12203
> > width=8) (actual time=0.55..56205.25 rows=12062 loops=1)
> > Total runtime: 56518.26 msec
>
> As a question, what does explain analyze show if you
> set enable_seqscan=off; first?
>
--
M.P.Garcia
PT Inovação, SA
Serviços e Redes Móveis
Rua José Ferreira Pinto Basto - 3810 Aveiro
Tel: 234 403 253  -  Fax: 234 424 160
E-mail: marcos-p-garcia@ptinovacao.pt