Re: How can I speed up with query? - Mailing list pgsql-novice

From Greg Sabino Mullane
Subject Re: How can I speed up with query?
Date
Msg-id 8e2f64fc42bbf31e88e5fc5c796c62a1@biglumber.com
Whole thread Raw
In response to How can I speed up with query?  ("Scott Morrison" <smorrison@navtechinc.com>)
List pgsql-novice
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> I have a table (see below) with an ID, a date, and a value.  The date
> specifies when the entry was added.  I want to query the table to determine
> what the value was on a specific day.
> ...
> select a.* from sample a where (id,date) in (select a.id,max(date) from
> sample where date<='<<the date>>' and id=a.id) order by id;

Actually, your query as written wants to find the latest value added before or
on a certain date. A subtle difference, but it makes a difference. Some
quick notes before I jump into my shot at this:

* Avoid the max() function - it is not fully optimized yet. Instead use
SELECT foo FROM table ORDER BY foo DESC LIMIT 1
(and use ASC to replace the min() function)

* Try not to use keywords such as "date" for your column names.

* Always run VACUUM ANALYZE and create an index: in this case, on the "date" column

* If you can't match on a single column (as in the id,date from your original
query, use the oid)

(The explain analyze below is for a 200,000 row table with 20 distinct ids and a
time period of abot a month.)

VACUUM ANALYZE sample;
CREATE INDEX sample_date on sample(date);

EXPLAIN ANALYZE
SELECT id, date, value FROM sample a WHERE date <='2003-01-01' AND oid =
(SELECT oid FROM sample WHERE id = a.id AND DATE <='2003-01-01' ORDER BY date DESC LIMIT 1)
ORDER BY id;


                                                                    QUERY PLAN
- --------------------------------------------------------------------------------------------------
 Sort  (cost=1712.13..1712.14) (actual time=5292.35..5292.38 rows=20 loops=1)
   Sort Key: id
   ->  Index Scan using sample_date on sample a  (cost=0.00..1712.12) (actual time=5258.10..5292.22 rows=20 loops=1)
         Index Cond: (date <= '2003-01-01'::date)
         Filter: (oid = (subplan))
         SubPlan
           ->  Limit  (cost=0.00..81.53) (actual time=0.71..0.78 rows=1 loops=6532)
                 ->  Index Scan Backward using sample_date on sample  (cost=0.00..81.53) (actual time=0.70..0.78 rows=2
loops=6532)
                       Index Cond: (date <= '2003-01-01'::date)
                       Filter: (id = $0)
 Total runtime: 5292.50 msec



- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302051538
-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE+QXnMvJuQZxSWSsgRAhmDAKDwMmf0GvhnVFKeiDPnVolx3wOO1gCgsasJ
3t3LCOa6Q5uOCJpawodJO54=
=dGj1
-----END PGP SIGNATURE-----



pgsql-novice by date:

Previous
From: Mike Grommet
Date:
Subject: Re: how can I tell it's postgresql data?
Next
From: Andrew McMillan
Date:
Subject: Re: Postgres performace with large tables.