Re: - Mailing list pgsql-novice

From Scott Morrison
Subject Re:
Date
Msg-id DHEKKBMAPBCGPOFEBIFFCEKECBAA.smorrison@navtechinc.com
Whole thread Raw
In response to Re:  (greg@turnstep.com)
Responses Re:
Re:
List pgsql-novice
Well I ran the queries through some more tests and the original query is
still faster than any of the suggested queries.

Query 1 (Original):
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
SELECT * FROM sample a WHERE (id,date) IN
    (SELECT id,max(date) FROM sample
     WHERE id=a.id AND date<='2003-02-07'
     GROUP BY id);
DROP INDEX sample_id;
DROP INDEX sample_date;

Query 2 (Greg):
ANALYZE sample;
CREATE INDEX sample_id ON sample(id);
CREATE INDEX sample_date ON sample(date);
ANALYZE sample;
EXPLAIN ANALYZE
SELECT * FROM sample a WHERE date<='2003-02-07' AND oid=
    (SELECT oid FROM sample WHERE id=a.id AND
    date<='2003-02-07' ORDER BY date DESC LIMIT 1);
DROP INDEX sample_id;
DROP INDEX sample_date;

As for Greg's query, I tried it without an id index and it took a long long
time (I broke it after about 5 minutes, so that's why the id index is added
in there).  I also took out the ORDER BY id because that slowed it down by
another 3 seconds or so, and I don't need it for what I'm querying out.
Below are the analysis reports for both queries.

Query 1 Analysis (Average time 21779.98ms):
Seq Scan on sample a  (cost=0.00..13702.91 rows=3126 width=12) (actual
time=58.67..21770.83 rows=99999 loops=1)
  SubPlan
    ->  Aggregate  (cost=0.00..3.74 rows=1 width=8) (actual time=0.11..0.11
rows=1 loops=180566)
          ->  Group  (cost=0.00..3.74 rows=1 width=8) (actual
time=0.06..0.10 rows=2 loops=180566)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.73
rows=1 width=8) (actual time=0.05..0.08 rows=2 loops=180566)

Query 2 Analysis (Average time 28358.92ms):
Index Scan using sample_date on sample a  (cost=0.00..24294.52 rows=1
width=12) (actual time=0.42..27831.20 rows=99999 loops=1)
  SubPlan
    ->  Limit  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.16
rows=1 loops=160446)
          ->  Sort  (cost=3.84..3.84 rows=1 width=8) (actual time=0.15..0.15
rows=2 loops=160446)
                ->  Index Scan using sample_id on sample  (cost=0.00..3.83
rows=1 width=8) (actual time=0.05..0.07 rows=2 loops=160446)

So in conclusion, it doesn't look like I'll be able to get the query any
faster than my original query.  I was really hoping for something that would
run in 5-10% of the time, but I guess that's not possible.

Thanks for the suggestions.
-- Scott

-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org]On Behalf Of greg@turnstep.com
Sent: Monday, February 10, 2003 2:50 PM
To: pgsql-novice@postgresql.org
Subject: Re: [NOVICE]



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>>> Wouldn't there need to be an index on OID for your query to be fast?
>>>
>> No, the single index on the "date" field is enough. See the EXPLAIN
>> output from my first post in the original thread:
>>
> However the original poster suggested there would ultimately be 200,000
> or more rows in the table.

Yes, and my EXPLAIN is on a table with 200,000 rows in it. Hence the fact
that it took over 5 seconds - which is still very fast compared to the
original query.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302101447

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE+SAHxvJuQZxSWSsgRAkxTAJ9TnEAjypMGbv8ZRn55yqe/8AZyQgCcCUPm
/dU6kHroxm1XpC0lUrvdY+Y=
=GwmT
-----END PGP SIGNATURE-----



---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


pgsql-novice by date:

Previous
From: greg@turnstep.com
Date:
Subject: Re:
Next
From: Robert Mosher
Date:
Subject: Characters To be Escaped in Perl