Thread: Queries with timestamp II

Queries with timestamp II

From
"Arnau"
Date:
Hi all,

  First of all thanks to Josh and Richard for their replies. What I have
done to test
their indications is the following. I have created a new table identical to
STATISTICS,
and an index over the TIMESTAMP_IN field.

CREATE TABLE STATISTICS2
(
  STATISTIC_ID    NUMERIC(10) NOT NULL DEFAULT
                  NEXTVAL('STATISTIC_ID_SEQ')
                  CONSTRAINT pk_st_statistic2_id PRIMARY KEY,
  TIMESTAMP_IN    TIMESTAMP,
  VALUE           NUMERIC(10)
);

CREATE INDEX i_stats2_tin ON STATISTICS2(TIMESTAMP_IN);

After that I inserted the data from STATISTICS and vacuumed the DB:

  INSERT INTO STATISTICS2 ( SELECT * FROM STATISTICS );
  vacuumdb -f -z -d test

once the vacuum has finished I do the following query

explain analyze select * from statistics2 where timestamp_in <
to_timestamp( '20031201', 'YYYYMMDD' );
NOTICE:  QUERY PLAN:

Seq Scan on statistics2  (cost=0.00..638.00 rows=9289 width=35) (actual
time=0.41..688.34 rows=27867 loops=1)
Total runtime: 730.82 msec

That query is not using the index. Anybody knows what I'm doing wrong?

Thank you very much

--
Arnau



Re: Queries with timestamp II

From
Tomasz Myrta
Date:
Dnia 2004-01-26 15:12, Użytkownik Arnau napisał:
> Hi all,
>
>   First of all thanks to Josh and Richard for their replies. What I have
> done to test
> their indications is the following. I have created a new table identical to
> STATISTICS,
> and an index over the TIMESTAMP_IN field.
>
> CREATE TABLE STATISTICS2
> (
>   STATISTIC_ID    NUMERIC(10) NOT NULL DEFAULT
>                   NEXTVAL('STATISTIC_ID_SEQ')
>                   CONSTRAINT pk_st_statistic2_id PRIMARY KEY,
>   TIMESTAMP_IN    TIMESTAMP,
>   VALUE           NUMERIC(10)
> );

Do you really have to use numeric as primary key? Integer datatypes
(int4/int8) are much faster than numeric.

>
> CREATE INDEX i_stats2_tin ON STATISTICS2(TIMESTAMP_IN);
>
> After that I inserted the data from STATISTICS and vacuumed the DB:
>
>   INSERT INTO STATISTICS2 ( SELECT * FROM STATISTICS );
>   vacuumdb -f -z -d test
>
> once the vacuum has finished I do the following query
>
> explain analyze select * from statistics2 where timestamp_in <
> to_timestamp( '20031201', 'YYYYMMDD' );
> NOTICE:  QUERY PLAN:
>
> Seq Scan on statistics2  (cost=0.00..638.00 rows=9289 width=35) (actual
> time=0.41..688.34 rows=27867 loops=1)
> Total runtime: 730.82 msec
>
> That query is not using the index. Anybody knows what I'm doing wrong?

Over 25000 rows match your condition:
timestamp_in < to_timestamp( '20031201', 'YYYYMMDD' );

How many rows do you have in your table? It's possible, that seq scan is
just faster than using index when getting so many rows output.

Regards,
Tomasz Myrta

Re: Queries with timestamp II

From
Shridhar Daithankar
Date:
Arnau wrote:

> explain analyze select * from statistics2 where timestamp_in <
> to_timestamp( '20031201', 'YYYYMMDD' );
> NOTICE:  QUERY PLAN:
>
> Seq Scan on statistics2  (cost=0.00..638.00 rows=9289 width=35) (actual
> time=0.41..688.34 rows=27867 loops=1)
> Total runtime: 730.82 msec
>
> That query is not using the index. Anybody knows what I'm doing wrong?

Since it expects large number of rows will be returned, it is favouring
sequential scan.

Given how the estimates have differed, i.e. estimated 9289 v/s actual 27867, I
sugest you up the statistics for the table using alter table. Check
http://www.postgresql.org/docs/current/static/sql-altertable.html


HTH

  Shridhar