Re: Lot'sa joins - performance tip-up, please? - Mailing list pgsql-performance
From: Mario Splivalo
Subject: Re: Lot'sa joins - performance tip-up, please?
Date:
Msg-id: 1146753957.8538.30.camel@localhost.localdomain
In response to: Re: Lot'sa joins - performance tip-up, please? (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Lot'sa joins - performance tip-up, please?
List: pgsql-performance
On Wed, 2006-05-03 at 13:58 -0400, Tom Lane wrote:
> Mario Splivalo <msplival@jagor.srce.hr> writes:
> > I have a quite large query that takes over a minute to run on my laptop.
>
> The EXPLAIN output you provided doesn't seem to agree with the stated
> query.  Where'd the "service_id = 1102" condition come from?

I guess I copy-pasted the additional WHERE into the EXPLAIN ANALYZE
query. This is the correct one, without the WHERE:

 Hash Left Join  (cost=198628.35..202770.61 rows=121 width=264) (actual time=998008.264..999645.322 rows=5706 loops=1)
   Hash Cond: ("outer".message_id = "inner".message_id)
   ->  Merge Left Join  (cost=21943.23..21950.96 rows=121 width=238) (actual time=4375.510..4540.772 rows=5706 loops=1)
         Merge Cond: ("outer".message_id = "inner".message_id)
         ->  Sort  (cost=21847.62..21847.92 rows=121 width=230) (actual time=3304.787..3378.515 rows=5706 loops=1)
               Sort Key: messages.id
               ->  Hash Join  (cost=20250.16..21843.43 rows=121 width=230) (actual time=1617.370..3102.470 rows=5706 loops=1)
                     Hash Cond: ("outer".message_id = "inner".id)
                     ->  Seq Scan on ticketing_messages  (cost=0.00..1212.37 rows=75937 width=14) (actual time=10.554..609.967 rows=75937 loops=1)
                     ->  Hash  (cost=20244.19..20244.19 rows=2391 width=216) (actual time=1572.889..1572.889 rows=5706 loops=1)
                           ->  Nested Loop  (cost=1519.21..20244.19 rows=2391 width=216) (actual time=385.582..1449.207 rows=5706 loops=1)
                                 ->  Seq Scan on services  (cost=0.00..4.20 rows=3 width=54) (actual time=20.829..20.859 rows=2 loops=1)
                                       Filter: (type_id = 10)
                                 ->  Bitmap Heap Scan on messages  (cost=1519.21..6726.74 rows=1594 width=162) (actual time=182.346..678.800 rows=2853 loops=2)
                                       Recheck Cond: (("outer".id = messages.service_id) AND (messages.receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (messages.receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
                                       ->  BitmapAnd  (cost=1519.21..1519.21 rows=1594 width=0) (actual time=164.311..164.311 rows=0 loops=2)
                                             ->  Bitmap Index Scan on idx_service_id  (cost=0.00..84.10 rows=14599 width=0) (actual time=66.809..66.809 rows=37968 loops=2)
                                                   Index Cond: ("outer".id = messages.service_id)
                                             ->  Bitmap Index Scan on idx_messages_receiving_time  (cost=0.00..1434.87 rows=164144 width=0) (actual time=192.633..192.633 rows=184741 loops=1)
                                                   Index Cond: ((receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
         ->  Sort  (cost=95.62..99.17 rows=1421 width=8) (actual time=1070.678..1072.999 rows=482 loops=1)
               Sort Key: ticketing_winners.message_id
               ->  Seq Scan on ticketing_winners  (cost=0.00..21.21 rows=1421 width=8) (actual time=424.836..1061.834 rows=1421 loops=1)
   ->  Hash  (cost=176144.30..176144.30 rows=57925 width=26) (actual time=993592.980..993592.980 rows=57925 loops=1)
         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26) (actual time=1074.984..992536.243 rows=57925 loops=1)
               ->  Seq Scan on ticketing_codes_played  (cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993 rows=57925 loops=1)
               ->  Index Scan using ticketing_codes_pk on ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual time=17.044..17.052 rows=1 loops=57925)
                     Index Cond: (ticketing_codes.code_id = "outer".code_id)
 Total runtime: 999778.981 ms

> In general, I'd suggest playing around with the join order.  Existing
> releases of PG tend to throw up their hands when faced with a mixture of
> outer joins and regular joins, and just join the tables in the order
> listed.  8.2 will be smarter about this, but for now you have to do it
> by hand ...

No luck for me there. But I found out that if I first join
ticketing_codes and ticketing_codes_played, put the result into a
temporary table, and then join that temporary table with the rest of
the query (the SELECT that is in parentheses is transferred to a
temporary table), the query is almost twice as fast.

As mentioned before, ticketing_codes has 11,000,000 records.

	Mario

P.S. Is it just me, or is posting to pgsql-performance lagged quite a bit?
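The temporary-table rewrite described above might look roughly like this. This is only a sketch: the full original query never appears in the thread, so the selected column list and the name tmp_played_codes are assumptions; only the table and join-key names are taken from the EXPLAIN output.

```sql
-- Materialize the expensive ticketing_codes / ticketing_codes_played
-- join first, instead of leaving it as a sub-SELECT in the big query.
-- Column list beyond the visible join keys is an assumption.
CREATE TEMPORARY TABLE tmp_played_codes AS
    SELECT tcp.message_id, tc.code_id
    FROM ticketing_codes_played tcp
    JOIN ticketing_codes tc ON tc.code_id = tcp.code_id;

-- Give the planner real statistics on the temp table before using it,
-- so the outer join against it is costed sensibly.
ANALYZE tmp_played_codes;

-- The main query then left-joins the temp table on message_id in place
-- of the parenthesized sub-SELECT, e.g.:
--   SELECT ... FROM messages ...
--   LEFT JOIN tmp_played_codes USING (message_id);
```

The ANALYZE step matters: without it the planner has no row-count or distribution estimates for the freshly created temp table and may pick a poor plan for the final join.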