Slow SELECT - Mailing list pgsql-general

From psql-mail@freeuk.com
Subject Slow SELECT
Date
Msg-id E1A50Io-000KGI-Lc@buckaroo.freeuk.net
Whole thread Raw
Responses Re: Slow SELECT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I am running a SELECT to get all tuples within a given date range. This
query is much slwoer than i expected - am i missing something?

I have a table 'meta' with a column 'in_date' of type timestamp(0), i
am trying to select all
records within a given date range. I have an index on 'in_date' and I
also have an index on date(in_date). The queries I am doing are between
dates rather than timestamps.

Immeadiately prior to running the queries shown below a VACUUM ANALYZE
was run.

The query planner seems to be out by a factor of 10 for cost and number
of rows. Is this this because of the slow performance?

I have 6 million records. With dates spread fairly evenly between the
end of 2001 and now.

I was very suprised to see the query take over 20 minutes when using
the date(in_date) index. And more suprised to see the seq_scan over in_
date using timestamps take only 10 minutes.

Both are taking too long in my opinion! I was hoping for less than 10
seconds. Is this too optimistic?

Any suggestions much appreciated.

I am using RH_AS_3 on IBM x450 quad xeon ia64, 4GB mem (1GB shared
buffers for
postmaster)
When running queries the processor its running on sits down at 15-20%
usage and the iowait goes up to 80-99% (fiber attached raid(0) yes i
know its not resiliant).

testdb=# EXPLAIN ANALYZE  SELECT item_id, in_date FROM meta WHERE date(
in_date) >= '2002-03-01' AND date(in_date) < '2002-04-01' order by in_
date DESC;
    QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------
Sort  (cost=122755.65..122830.64 rows=29996 width=50) (actual time=
1248326.17..1248608.39 rows=261305 loops=1)
Sort Key: in_date
->  Index Scan using meta_in_date_date_index on meta  (cost=0.00..
120525.09 rows=29996 width=50) (actual time=0.00..1244835.94 rows=
261305 loops=1)
Index Cond: ((date(in_date) >= '2002-03-01'::date) AND (date(in_date) <
'2002-04-01'::date))
Total runtime: 1248887.70 msec
(5 rows)


Here are the stats on the in_date column if they're any use...

testdb=# SELECT * FROM pg_stats WHERE tablename = 'meta' and attname = '
in_date';
schemaname | tablename | attname  | null_frac | avg_width | n_distinct |
most_common_vals | most_common_freqs |
histogram_bounds                                              |
correlation
------------+-----------+----------+-----------+-----------+------------
+------------------+-------------------+--------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
---------------------------------------------------------------------+--
------------
public     | meta | in_date |         0 |         8 |         -1 |
|                   | {"2001-10-18 17:28:23","2001-12-28 19:31:06","
2002-03-14 19:59:08","2002-05-27 08:28:04","2002-07-31 14:06:06","2002-
10-09 19:09:49","2002-12-21 03:58:46","2003-03-02 21:41:37","2003-05-09
16:12:39","2003-07-22 05:13:18","2003-09-30 13:48:04"} | -0.000184019
(1 row)


Here is the same query as above but using timestamp(0)'s instead of
dates.

testdb=# EXPLAIN ANALYZE  SELECT item_id, in_date FROM meta WHERE in_
date >= '2002-03-01' AND in_date < '2002-04-01' order by in_date DESC;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------
Sort  (cost=797371.98..797995.09 rows=249246 width=50) (actual time=
616906.25..617183.58 rows=261305 loops=1)
Sort Key: in_date
->  Seq Scan on meta  (cost=0.00..775030.55 rows=249246 width=50) (
actual time=19.53..611541.03 rows=261305 loops=1)
Filter: ((in_date >= '2002-03-01 00:00:00'::timestamp without time zone)
AND (in_date < '2002-04-01 00:00:00'::timestamp without time zone))
Total runtime: 617446.29 msec
(5 rows)


--

pgsql-general by date:

Previous
From: "Fabio Benavides Murillo"
Date:
Subject: migrate from postgres to mysql
Next
From: Ian Harding
Date:
Subject: Query FKey Constraint Table and Column Names