Re: - Mailing list pgsql-novice
From | Scott Morrison
---|---
Subject | Re:
Date |
Msg-id | DHEKKBMAPBCGPOFEBIFFCEKECBAA.smorrison@navtechinc.com
In response to | Re: (greg@turnstep.com)
Responses | Re:, Re:
List | pgsql-novice
Well, I ran the queries through some more tests and the original query is still faster than any of the suggested queries.

Query 1 (Original):

```sql
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
  SELECT * FROM sample a
  WHERE (id, date) IN (SELECT id, max(date) FROM sample
                       WHERE id = a.id AND date <= '2003-02-07'
                       GROUP BY id);
DROP INDEX sample_id;
DROP INDEX sample_date;
```

Query 2 (Greg):

```sql
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
  SELECT * FROM sample a
  WHERE date <= '2003-02-07'
    AND oid = (SELECT oid FROM sample
               WHERE id = a.id AND date <= '2003-02-07'
               ORDER BY date DESC LIMIT 1);
DROP INDEX sample_id;
DROP INDEX sample_date;
```

As for Greg's query, I tried it without an index on id and it took a very long time (I cancelled it after about 5 minutes, which is why the id index is added in there). I also took out the ORDER BY id because that slowed it down by another 3 seconds or so, and I don't need it for what I'm querying out.

Below is the EXPLAIN ANALYZE output for both queries.

Query 1 (average time 21779.98 ms):

```
Seq Scan on sample a  (cost=0.00..13702.91 rows=3126 width=12) (actual time=58.67..21770.83 rows=99999 loops=1)
  SubPlan
    ->  Aggregate  (cost=0.00..3.74 rows=1 width=8) (actual time=0.11..0.11 rows=1 loops=180566)
          ->  Group  (cost=0.00..3.74 rows=1 width=8) (actual time=0.06..0.10 rows=2 loops=180566)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.73 rows=1 width=8) (actual time=0.05..0.08 rows=2 loops=180566)
```

Query 2 (average time 28358.92 ms):

```
Index Scan using sample_date on sample a  (cost=0.00..24294.52 rows=1 width=12) (actual time=0.42..27831.20 rows=99999 loops=1)
  SubPlan
    ->  Limit  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.16 rows=1 loops=160446)
          ->  Sort  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.15 rows=2 loops=160446)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.83 rows=1 width=8) (actual time=0.05..0.07 rows=2 loops=160446)
```

So, in conclusion, it doesn't look like I'll be able to get the query any faster than my original. I was really hoping for something that would run in 5-10% of the time, but I guess that's not possible. Thanks for the suggestions.

-- Scott

-----Original Message-----
From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org] On Behalf Of greg@turnstep.com
Sent: Monday, February 10, 2003 2:50 PM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>> Wouldn't there need to be an index on OID for your query to be fast?
>>>
>> No, the single index on the "date" field is enough. See the EXPLAIN
>> output from my first post in the original thread:
>>
> However the original poster suggested there would ultimately be 200,000
> or more rows in the table.

Yes, and my EXPLAIN is on a table with 200,000 rows in it. Hence the fact that it took over 5 seconds - which is still very fast compared to the original query.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302101447

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE+SAHxvJuQZxSWSsgRAkxTAJ9TnEAjypMGbv8ZRn55yqe/8AZyQgCcCUPm
/dU6kHroxm1XpC0lUrvdY+Y=
=GwmT
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
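A note for readers who want to reproduce the timings above: the thread never shows the DDL for the `sample` table, so the sketch below is only an assumption reconstructed from the queries and plans (an integer `id`, a `date` column, and at least one extra payload column to account for the 12-byte row width in the plans), not the poster's actual schema.

```sql
-- Assumed layout of the "sample" table used in both EXPLAIN ANALYZE runs.
-- The thread never shows the real CREATE TABLE; the "reading" column name,
-- the types, and NOT NULL constraints are guesses for illustration only.
CREATE TABLE sample (
    id      integer NOT NULL,   -- many rows share the same id
    date    date    NOT NULL,   -- one row per id per date
    reading integer             -- hypothetical payload column
);

-- The two indexes both test runs create before timing the queries.
CREATE INDEX sample_id   ON sample (id);
CREATE INDEX sample_date ON sample (date);
```

With a table laid out this way, the two queries quoted in the post can be run as written; the actual timings will of course depend on the data volume and distribution (the plans above show a table of roughly 200,000 rows with about 100,000 rows returned).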