ORDER BY Optimization - Mailing list pgsql-sql
From | Derek Buttineau|Compu-SOLVE |
---|---|
Subject | ORDER BY Optimization |
Date | |
Msg-id | 427BCA10.4090208@csolve.net Whole thread Raw |
Responses |
Re: ORDER BY Optimization
|
List | pgsql-sql |
Good Day, I'm hoping this is the right place to send this. I have a query that's causing me some grief trying to optimize it. The query cost is fine until I order the data set. Mind you after it's been ran and cached, subsequent calls to it are near instant. The Query in question is: select mr.*,m.* from maillog_received mr JOIN maillog m ON mr.maillog_id = m.id WHERE mr.subscription=89 and m.spam=1 ORDER BY m.msg_date desc limit 10 The strucutre of the tables involved in the query are as follows: Table "public.maillog_received" Column | Type | Modifiers --------------+------------------------+------------------------------------------------------------------id |integer | not null default nextval('public.maillog_received_id_seq'::text)maillog_id | bigint | not nullsubscription | integer | not nulllocal_part | character varying(255) | not nulldomain | character varying(255) | not null Indexes: "maillog_received_pkey" PRIMARY KEY, btree (id) "maillog_received_subscription_idx" btree (subscription) "maillog_received_subscription_maillog_id_idx"btree (subscription, maillog_id) Foreign-key constraints: "$1" FOREIGN KEY (subscription) REFERENCES subscriptions(id) ON DELETE CASCADE "$2" FOREIGN KEY (maillog_id) REFERENCES maillog(id) ON DELETE CASCADE Triggers: checkit BEFORE INSERT OR UPDATE ON maillog_received FOR EACH ROW EXECUTE PROCEDURE check_sub() Table "public.maillog" Column | Type | Modifiers -----------------+-----------------------------+---------------------------------------------------------id | bigint | not null default nextval('public.maillog_id_seq'::text)message_id | character varying(16) |msg_date | timestamp withouttime zone | not nullfrom_local_part | character varying(255) | not nullfrom_domain | character varying(255) | not nullfrom_ip | character varying(128) |from_host | character varying(255) |subject | character varying(255) |virus | integer | not null default 0spam | integer | not null default 0list | integer | not null default0bad_recipient | integer | not null default 0bad_sender | integer |not null default 0bad_relay | integer | not null default 0bad_file | integer | not null default 0bad_mime | integer | not null default 0sascore | numeric(7,2) |sareport | text |vscanreport | text |contentreport | text |bypassed | integer | not null default 0delivered | integer | not null default 1complete | integer | not nulldefault 1 Indexes: "maillog_pkey" PRIMARY KEY, btree (id) "maillog_msg_date_idx" btree (msg_date) EXPLAIN ANALYZE gives me the following: Limit (cost=31402.85..31402.87 rows=10 width=306) (actual time=87454.203..87454.334 rows=10 loops=1) -> Sort (cost=31402.85..31405.06 rows=886 width=306) (actual time=87454.187..87454.240 rows=10 loops=1) Sort Key: m.msg_date -> Nested Loop (cost=0.00..31359.47 rows=886width=306) (actual time=4.740..86430.468 rows=26308 loops=1) -> Index Scan using maillog_received_subscription_idx on maillog_received mr (cost=0.00..17789.73 rows=4479 width=43) (actual time=0.030..33554.061 rows=65508 loops=1) Index Cond: (subscription = 89) -> Index Scanusing maillog_pkey on maillog m (cost=0.00..3.02 rows=1 width=263) (actual time=0.776..0.780 rows=0 loops=65508) Index Cond: ("outer".maillog_id = m.id) Filter: (spam = 1)Total runtime:87478.068 ms Now there is a lot of data in these tables, at least a few million records, but I'd hoped to get a bit better performance :) Now another odd thing I will mention, if I take the database schema to a second server (both running postgres 8.0.2 on FreeBSD 5.3), I get a much different (and to me, it looks much more effecient) query plan (though it's with substantially less data (about 500,000 records in the table)): Limit (cost=0.00..6482.60 rows=10 width=311) (actual time=25.340..26.885 rows=10 loops=1) -> Nested Loop (cost=0.00..1175943.99 rows=1814 width=311) (actual time=25.337..26.867 rows=10 loops=1) -> Index Scan Backward using maillog_msg_date_idx on maillog m (cost=0.00..869203.93 rows=51395 width=270) (actual time=25.156..26.050 rows=48 loops=1) Filter: (spam = 1) -> Index Scan using maillog_received_subscription_maillog_id_idx on maillog_received mr (cost=0.00..5.96 rows=1 width=41) (actual time=0.011..0.012 rows=0 loops=48) Index Cond: ((mr.subscription =89) AND (mr.maillog_id = "outer".id))Total runtime: 27.016 ms Any suggestions? -- Regards, Derek Buttineau Internet Systems Developer Compu-SOLVE Internet Services Compu-SOLVE Technologies Inc. 705.725.1212 x255