Slow SELECT - Mailing list pgsql-general
From | psql-mail@freeuk.com |
---|---|
Subject | Slow SELECT |
Date | |
Msg-id | E1A50Io-000KGI-Lc@buckaroo.freeuk.net Whole thread Raw |
Responses |
Re: Slow SELECT
|
List | pgsql-general |
I am running a SELECT to get all tuples within a given date range. This query is much slwoer than i expected - am i missing something? I have a table 'meta' with a column 'in_date' of type timestamp(0), i am trying to select all records within a given date range. I have an index on 'in_date' and I also have an index on date(in_date). The queries I am doing are between dates rather than timestamps. Immeadiately prior to running the queries shown below a VACUUM ANALYZE was run. The query planner seems to be out by a factor of 10 for cost and number of rows. Is this this because of the slow performance? I have 6 million records. With dates spread fairly evenly between the end of 2001 and now. I was very suprised to see the query take over 20 minutes when using the date(in_date) index. And more suprised to see the seq_scan over in_ date using timestamps take only 10 minutes. Both are taking too long in my opinion! I was hoping for less than 10 seconds. Is this too optimistic? Any suggestions much appreciated. I am using RH_AS_3 on IBM x450 quad xeon ia64, 4GB mem (1GB shared buffers for postmaster) When running queries the processor its running on sits down at 15-20% usage and the iowait goes up to 80-99% (fiber attached raid(0) yes i know its not resiliant). testdb=# EXPLAIN ANALYZE SELECT item_id, in_date FROM meta WHERE date( in_date) >= '2002-03-01' AND date(in_date) < '2002-04-01' order by in_ date DESC; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------ Sort (cost=122755.65..122830.64 rows=29996 width=50) (actual time= 1248326.17..1248608.39 rows=261305 loops=1) Sort Key: in_date -> Index Scan using meta_in_date_date_index on meta (cost=0.00.. 120525.09 rows=29996 width=50) (actual time=0.00..1244835.94 rows= 261305 loops=1) Index Cond: ((date(in_date) >= '2002-03-01'::date) AND (date(in_date) < '2002-04-01'::date)) Total runtime: 1248887.70 msec (5 rows) Here are the stats on the in_date column if they're any use... testdb=# SELECT * FROM pg_stats WHERE tablename = 'meta' and attname = ' in_date'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ------------+-----------+----------+-----------+-----------+------------ +------------------+-------------------+-------------------------------- ------------------------------------------------------------------------ ------------------------------------------------------------------------ ---------------------------------------------------------------------+-- ------------ public | meta | in_date | 0 | 8 | -1 | | | {"2001-10-18 17:28:23","2001-12-28 19:31:06"," 2002-03-14 19:59:08","2002-05-27 08:28:04","2002-07-31 14:06:06","2002- 10-09 19:09:49","2002-12-21 03:58:46","2003-03-02 21:41:37","2003-05-09 16:12:39","2003-07-22 05:13:18","2003-09-30 13:48:04"} | -0.000184019 (1 row) Here is the same query as above but using timestamp(0)'s instead of dates. testdb=# EXPLAIN ANALYZE SELECT item_id, in_date FROM meta WHERE in_ date >= '2002-03-01' AND in_date < '2002-04-01' order by in_date DESC; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ -------- Sort (cost=797371.98..797995.09 rows=249246 width=50) (actual time= 616906.25..617183.58 rows=261305 loops=1) Sort Key: in_date -> Seq Scan on meta (cost=0.00..775030.55 rows=249246 width=50) ( actual time=19.53..611541.03 rows=261305 loops=1) Filter: ((in_date >= '2002-03-01 00:00:00'::timestamp without time zone) AND (in_date < '2002-04-01 00:00:00'::timestamp without time zone)) Total runtime: 617446.29 msec (5 rows) --
pgsql-general by date: