Query Optimization - Mailing list pgsql-performance

From Reinhard Vicinus
Subject Query Optimization
Date
Msg-id 45D9D85A.3090000@rjm.de
Whole thread Raw
Responses Re: Query Optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Query Optimization  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-performance
PostgreSQL version: 8.1.6
OS: Debian etch

The following query needs a lot of time because the query planner
reorders the joins:

select m.message_idnr, v.headervalue, n.headername from dbmail_messages m
  join dbmail_headervalue v ON v.physmessage_id=m.physmessage_id
  join dbmail_headername n ON v.headername_id=n.id
  where m.mailbox_idnr = 1022 AND message_idnr BETWEEN 698928 AND 1496874
    and lower(n.headername) IN
('from','to','cc','subject','date','message-id',
        'priority','x-priority','references','newsgroups','in-reply-to',
        'content-type','x-spam-status','x-spam-flag');

If I prevent the query planner from reordering the joins with 'set
join_collapse_limit=1;' then the same query is faster. At the end of the
Mail is the output of a explain analyze for both cases.

The statistics of the database are updated each night. Is there an error
(in the statistical data) which is responsible for the reordering of the
joins? And if not are there other alternatives for preventing join
reordering?

Thanks
Reinhard



Explain analyze with set join_collapse_limit=8:

 Merge Join  (cost=388657.62..391332.20 rows=821 width=127) (actual
time=82677.950..89103.192 rows=2699 loops=1)
   Merge Cond: ("outer".physmessage_id = "inner".physmessage_id)
   ->  Sort  (cost=2901.03..2902.61 rows=632 width=16) (actual
time=247.238..247.578 rows=373 loops=1)
         Sort Key: m.physmessage_id
         ->  Bitmap Heap Scan on dbmail_messages m  (cost=9.16..2871.63
rows=632 width=16) (actual time=38.072..246.509 rows=373 loops=1)
               Recheck Cond: (mailbox_idnr = 1022)
               Filter: ((message_idnr >= 698928) AND (message_idnr <=
1496874))
               ->  Bitmap Index Scan on dbmail_messages_8
(cost=0.00..9.16 rows=902 width=0) (actual time=25.561..25.561 rows=615
loops=1)
                     Index Cond: (mailbox_idnr = 1022)
   ->  Sort  (cost=385756.58..387089.35 rows=533108 width=127) (actual
time=80156.731..85760.186 rows=3278076 loops=1)
         Sort Key: v.physmessage_id
         ->  Hash Join  (cost=51.00..285787.17 rows=533108 width=127)
(actual time=34.519..28260.855 rows=3370242 loops=1)
               Hash Cond: ("outer".headername_id = "inner".id)
               ->  Seq Scan on dbmail_headervalue v
(cost=0.00..241200.39 rows=7840939 width=115) (actual
time=0.006..16844.479 rows=7854485 loops=1)
               ->  Hash  (cost=50.72..50.72 rows=113 width=28) (actual
time=34.493..34.493 rows=35 loops=1)
                     ->  Bitmap Heap Scan on dbmail_headername n
(cost=28.44..50.72 rows=113 width=28) (actual time=11.796..34.437
rows=35 loops=1)
                           Recheck Cond: ((lower((headername)::text) =
'from'::text) OR (lower((headername)::text) = 'to'::text) OR
(lower((headername)::text) = 'cc'::text) OR (lower((headername)::text) =
'subject'::text) OR (lower((headername)::text) = 'date'::text) OR
(lower((headername)::text) = 'message-id'::text) OR
(lower((headername)::text) = 'priority'::text) OR
(lower((headername)::text) = 'x-priority'::text) OR
(lower((headername)::text) = 'references'::text) OR
(lower((headername)::text) = 'newsgroups'::text) OR
(lower((headername)::text) = 'in-reply-to'::text) OR
(lower((headername)::text) = 'content-type'::text) OR
(lower((headername)::text) = 'x-spam-status'::text) OR (lower((hea
dername)::text) = 'x-spam-flag'::text))
                           ->  BitmapOr  (cost=28.44..28.44 rows=116
width=0) (actual time=11.786..11.786 rows=0 loops=1)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.037..0.037 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'from'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.013..0.013 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'to'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.013..0.013 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'cc'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.014..0.014 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'subject'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.014..0.014 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'date'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.019..0.019 rows=4 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'message-id'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.012..0.012 rows=2 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'priority'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.032..0.032 rows=4 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'x-priority'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.015..0.015 rows=1 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'references'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.009..0.009 rows=0 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'newsgroups'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.014..0.014 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'in-reply-to'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.013..0.013 rows=1 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'content-type'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=11.549..11.549 rows=2 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'x-spam-status'::text)
                                 ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.019..0.019 rows=3 loops=1)
                                       Index Cond:
(lower((headername)::text) = 'x-spam-flag'::text)
 Total runtime: 89277.937 ms
(47 rows)



Explain analyze with set join_collapse_limit=1:

 Hash Join  (cost=51.00..1607155.00 rows=821 width=127) (actual
time=14.640..47.851 rows=2699 loops=1)
   Hash Cond: ("outer".headername_id = "inner".id)
   ->  Nested Loop  (cost=0.00..1607035.43 rows=12071 width=115) (actual
time=0.085..25.057 rows=7025 loops=1)
         ->  Index Scan using dbmail_messages_mailbox_idx on
dbmail_messages m  (cost=0.00..3515.08 rows=632 width=16) (actual
time=0.064..1.070 rows=373 loops=1)
               Index Cond: (mailbox_idnr = 1022)
               Filter: ((message_idnr >= 698928) AND (message_idnr <=
1496874))
         ->  Index Scan using dbmail_headervalue_physmsg_id on
dbmail_headervalue v  (cost=0.00..2526.34 rows=870 width=115) (actual
time=0.010..0.035 rows=19 loops=373)
               Index Cond: (v.physmessage_id = "outer".physmessage_id)
   ->  Hash  (cost=50.72..50.72 rows=113 width=28) (actual
time=14.540..14.540 rows=35 loops=1)
         ->  Bitmap Heap Scan on dbmail_headername n  (cost=28.44..50.72
rows=113 width=28) (actual time=14.429..14.492 rows=35 loops=1)
               Recheck Cond: ((lower((headername)::text) = 'from'::text)
OR (lower((headername)::text) = 'to'::text) OR
(lower((headername)::text) = 'cc'::text) OR (lower((headername)::text) =
'subject'::text) OR (lower((headername)::text) = 'date'::text) OR
(lower((headername)::text) = 'message-id'::text) OR
(lower((headername)::text) = 'priority'::text) OR
(lower((headername)::text) = 'x-priority'::text) OR
(lower((headername)::text) = 'references'::text) OR
(lower((headername)::text) = 'newsgroups'::text) OR
(lower((headername)::text) = 'in-reply-to'::text) OR
(lower((headername)::text) = 'content-type'::text) OR
(lower((headername)::text) = 'x-spam-status'::text) OR
(lower((headername)::text) = 'x-spam-flag'::text))
               ->  BitmapOr  (cost=28.44..28.44 rows=116 width=0)
(actual time=14.418..14.418 rows=0 loops=1)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=14.197..14.197 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'from'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.015..0.015 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'to'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.012..0.012 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'cc'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.013..0.013 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'subject'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.014..0.014 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'date'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.019..0.019 rows=4 loops=1)
                           Index Cond: (lower((headername)::text) =
'message-id'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.011..0.011 rows=2 loops=1)
                           Index Cond: (lower((headername)::text) =
'priority'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.031..0.031 rows=4 loops=1)
                           Index Cond: (lower((headername)::text) =
'x-priority'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.013..0.013 rows=1 loops=1)
                           Index Cond: (lower((headername)::text) =
'references'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.008..0.008 rows=0 loops=1)
                           Index Cond: (lower((headername)::text) =
'newsgroups'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.014..0.014 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'in-reply-to'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.012..0.012 rows=1 loops=1)
                           Index Cond: (lower((headername)::text) =
'content-type'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.028..0.028 rows=2 loops=1)
                           Index Cond: (lower((headername)::text) =
'x-spam-status'::text)
                     ->  Bitmap Index Scan on
dbmail_headername_lower_headername  (cost=0.00..2.03 rows=8 width=0)
(actual time=0.018..0.018 rows=3 loops=1)
                           Index Cond: (lower((headername)::text) =
'x-spam-flag'::text)
 Total runtime: 49.634 ms
(41 rows)

--
Reinhard Vicinus
rjm business solutions GmbH
Sperlingweg 3,
68623 Lampertheim
Tel. 06206 9513084
Fax  06206 910315
--


pgsql-performance by date:

Previous
From: Andreas Tille
Date:
Subject: How to debug performance problems
Next
From: "Craig A. James"
Date:
Subject: Re: How to debug performance problems