Thread: Queries with timestamp II
Hi all, First of all thanks to Josh and Richard for their replies. What I have done to test their indications is the following. I have created a new table identical to STATISTICS, and an index over the TIMESTAMP_IN field. CREATE TABLE STATISTICS2 ( STATISTIC_ID NUMERIC(10) NOT NULL DEFAULT NEXTVAL('STATISTIC_ID_SEQ') CONSTRAINT pk_st_statistic2_id PRIMARY KEY, TIMESTAMP_IN TIMESTAMP, VALUE NUMERIC(10) ); CREATE INDEX i_stats2_tin ON STATISTICS2(TIMESTAMP_IN); After that I inserted the data from STATISTICS and vacuumed the DB: INSERT INTO STATISTICS2 ( SELECT * FROM STATISTICS ); vacuumdb -f -z -d test once the vacuum has finished I do the following query explain analyze select * from statistics2 where timestamp_in < to_timestamp( '20031201', 'YYYYMMDD' ); NOTICE: QUERY PLAN: Seq Scan on statistics2 (cost=0.00..638.00 rows=9289 width=35) (actual time=0.41..688.34 rows=27867 loops=1) Total runtime: 730.82 msec That query is not using the index. Anybody knows what I'm doing wrong? Thank you very much -- Arnau
Dnia 2004-01-26 15:12, Użytkownik Arnau napisał: > Hi all, > > First of all thanks to Josh and Richard for their replies. What I have > done to test > their indications is the following. I have created a new table identical to > STATISTICS, > and an index over the TIMESTAMP_IN field. > > CREATE TABLE STATISTICS2 > ( > STATISTIC_ID NUMERIC(10) NOT NULL DEFAULT > NEXTVAL('STATISTIC_ID_SEQ') > CONSTRAINT pk_st_statistic2_id PRIMARY KEY, > TIMESTAMP_IN TIMESTAMP, > VALUE NUMERIC(10) > ); Do you really have to use numeric as primary key? Integer datatypes (int4/int8) are much faster than numeric. > > CREATE INDEX i_stats2_tin ON STATISTICS2(TIMESTAMP_IN); > > After that I inserted the data from STATISTICS and vacuumed the DB: > > INSERT INTO STATISTICS2 ( SELECT * FROM STATISTICS ); > vacuumdb -f -z -d test > > once the vacuum has finished I do the following query > > explain analyze select * from statistics2 where timestamp_in < > to_timestamp( '20031201', 'YYYYMMDD' ); > NOTICE: QUERY PLAN: > > Seq Scan on statistics2 (cost=0.00..638.00 rows=9289 width=35) (actual > time=0.41..688.34 rows=27867 loops=1) > Total runtime: 730.82 msec > > That query is not using the index. Anybody knows what I'm doing wrong? Over 25000 rows match your condition: timestamp_in < to_timestamp( '20031201', 'YYYYMMDD' ); How many rows do you have in your table? It's possible, that seq scan is just faster than using index when getting so many rows output. Regards, Tomasz Myrta
Arnau wrote: > explain analyze select * from statistics2 where timestamp_in < > to_timestamp( '20031201', 'YYYYMMDD' ); > NOTICE: QUERY PLAN: > > Seq Scan on statistics2 (cost=0.00..638.00 rows=9289 width=35) (actual > time=0.41..688.34 rows=27867 loops=1) > Total runtime: 730.82 msec > > That query is not using the index. Anybody knows what I'm doing wrong? Since it expects large number of rows will be returned, it is favouring sequential scan. Given how the estimates have differed, i.e. estimated 9289 v/s actual 27867, I sugest you up the statistics for the table using alter table. Check http://www.postgresql.org/docs/current/static/sql-altertable.html HTH Shridhar