Thread: Lots of joins - performance tips, please?
I have a quite large query that takes over a minute to run on my laptop. On the db server it takes almost 20 seconds, but I have 200+ concurrent users who will be running similar queries, and while the query runs the I/O goes berserk: I see 30 MB/s of reads (according to iostat). So, before resorting to denormalization, I wonder if I could do something to speed things up.

The query is like this:

select *
from messages
join services on services.id = messages.service_id
join ticketing_messages on messages.id = ticketing_messages.message_id
left join ticketing_winners
       on ticketing_winners.message_id = ticketing_messages.message_id
left join (
    select *
    from ticketing_codes_played
    join ticketing_codes
      on ticketing_codes.code_id = ticketing_codes_played.code_id
) as codes on codes.message_id = ticketing_messages.message_id
where services.type_id = 10
  and messages.receiving_time between '2006-02-12' and '2006-03-18 23:00:00';

EXPLAIN ANALYZE of the above produces this:

                                  QUERY PLAN
-------------------------------------------------------------------------------
Merge Left Join  (cost=221692.04..222029.29 rows=3772 width=264) (actual time=539169.163..541579.504 rows=75937 loops=1)
  Merge Cond: ("outer".message_id = "inner".message_id)
  ->  Sort  (cost=40080.17..40089.60 rows=3772 width=238) (actual time=8839.072..9723.371 rows=75937 loops=1)
        Sort Key: messages.id
        ->  Hash Left Join  (cost=2259.40..39856.10 rows=3772 width=238) (actual time=1457.451..7870.830 rows=75937 loops=1)
              Hash Cond: ("outer".message_id = "inner".message_id)
              ->  Nested Loop  (cost=2234.64..39811.76 rows=3772 width=230) (actual time=1418.911..7063.299 rows=75937 loops=1)
                    ->  Index Scan using pk_services on services  (cost=0.00..4.46 rows=1 width=54) (actual time=28.261..28.271 rows=1 loops=1)
                          Index Cond: (1102 = id)
                          Filter: (type_id = 10)
                    ->  Hash Join  (cost=2234.64..39769.58 rows=3772 width=176) (actual time=1390.621..6297.501 rows=75937 loops=1)
                          Hash Cond: ("outer".id = "inner".message_id)
                          ->  Bitmap Heap Scan on messages  (cost=424.43..32909.53 rows=74408 width=162) (actual time=159.796..4329.125 rows=75937 loops=1)
                                Recheck Cond: (service_id = 1102)
                                ->  Bitmap Index Scan on idx_service_id  (cost=0.00..424.43 rows=74408 width=0) (actual time=95.197..95.197 rows=75937 loops=1)
                                      Index Cond: (service_id = 1102)
                          ->  Hash  (cost=1212.37..1212.37 rows=75937 width=14) (actual time=940.372..940.372 rows=75937 loops=1)
                                ->  Seq Scan on ticketing_messages  (cost=0.00..1212.37 rows=75937 width=14) (actual time=12.122..461.960 rows=75937 loops=1)
              ->  Hash  (cost=21.21..21.21 rows=1421 width=8) (actual time=38.496..38.496 rows=1421 loops=1)
                    ->  Seq Scan on ticketing_winners  (cost=0.00..21.21 rows=1421 width=8) (actual time=24.534..31.347 rows=1421 loops=1)
  ->  Sort  (cost=181611.87..181756.68 rows=57925 width=26) (actual time=530330.060..530647.055 rows=57925 loops=1)
        Sort Key: ticketing_codes_played.message_id
        ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26) (actual time=68.322..529472.026 rows=57925 loops=1)
              ->  Seq Scan on ticketing_codes_played  (cost=0.00..863.25 rows=57925 width=8) (actual time=0.042..473.881 rows=57925 loops=1)
              ->  Index Scan using ticketing_codes_pk on ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual time=9.102..9.108 rows=1 loops=57925)
                    Index Cond: (ticketing_codes.code_id = "outer".code_id)
Total runtime: 542000.093 ms
(27 rows)

I'll be more than happy to provide any additional information that I may
be able to gather. I'd be most happy if someone would scream something
like "four joins, smells like a poor design", because the design is poor,
but the system is in production, and I have to bear with it.

Mario
--
"I can do it quick, I can do it cheap, I can do it well. Pick any two."

Mario Splivalo
msplival@jagor.srce.hr
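[Editor's note: one rewrite worth trying, not from the original thread. The derived table can be flattened into two chained left joins, which gives the planner more join orders to consider. This is equivalent only on the assumption that every code_id in ticketing_codes_played has a matching row in ticketing_codes (e.g. a foreign key); otherwise unmatched played-codes rows behave differently.]

-- Sketch: flattened form of the query above, assuming the FK noted above.
select *
from messages
join services on services.id = messages.service_id
join ticketing_messages on messages.id = ticketing_messages.message_id
left join ticketing_winners
       on ticketing_winners.message_id = ticketing_messages.message_id
left join ticketing_codes_played
       on ticketing_codes_played.message_id = ticketing_messages.message_id
left join ticketing_codes
       on ticketing_codes.code_id = ticketing_codes_played.code_id
where services.type_id = 10
  and messages.receiving_time between '2006-02-12' and '2006-03-18 23:00:00';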
>         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
>             (actual time=68.322..529472.026 rows=57925 loops=1)
>               ->  Seq Scan on ticketing_codes_played
>                   (cost=0.00..863.25 rows=57925 width=8)
>                   (actual time=0.042..473.881 rows=57925 loops=1)
>               ->  Index Scan using ticketing_codes_pk on ticketing_codes
>                   (cost=0.00..3.01 rows=1 width=18)
>                   (actual time=9.102..9.108 rows=1 loops=57925)
>                     Index Cond: (ticketing_codes.code_id = "outer".code_id)
> Total runtime: 542000.093 ms
> (27 rows)
>
> I'll be more than happy to provide any additional information that I may
> be able to gather. I'd be most happy if someone would scream something
> like "four joins, smells like a poor design", because the design is poor,
> but the system is in production, and I have to bear with it.

It looks like the nested loop that joins ticketing_codes_played to
ticketing_codes is the slow part. I'm curious how many rows are in the
ticketing_codes table?

Four or five joins does not seem like a lot to me, but it can be slow if
you are joining big tables with other big tables.
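[Editor's note: to answer the row-count question without a full count(*), the planner's own estimates can be read from pg_class (a sketch; reltuples is only as fresh as the last ANALYZE or VACUUM on those tables):]

-- Planner's row estimates for the two tables in the slow nested loop.
-- Run ANALYZE ticketing_codes, ticketing_codes_played; first if stale.
SELECT relname, reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname IN ('ticketing_codes', 'ticketing_codes_played');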
Mario Splivalo <msplival@jagor.srce.hr> writes:
> I have a quite large query that takes over a minute to run on my laptop.

The EXPLAIN output you provided doesn't seem to agree with the stated
query. Where'd the "service_id = 1102" condition come from?

In general, I'd suggest playing around with the join order. Existing
releases of PG tend to throw up their hands when faced with a mixture of
outer joins and regular joins, and just join the tables in the order
listed. 8.2 will be smarter about this, but for now you have to do it
by hand ...

			regards, tom lane
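[Editor's note: one way to experiment with join order by hand, as Tom suggests (a sketch, not from the thread): write the joins explicitly in the order you want and lower join_collapse_limit so the planner keeps that textual order for the session.]

-- Make the planner honor the textual join order while experimenting.
SET join_collapse_limit = 1;
SET from_collapse_limit = 1;   -- also keeps subqueries from being reordered

EXPLAIN ANALYZE
SELECT ... ;                   -- the query, joins listed in the desired order

RESET join_collapse_limit;
RESET from_collapse_limit;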
On Wed, 2006-05-03 at 10:20 -0500, Dave Dutcher wrote:
> >         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> >             (actual time=68.322..529472.026 rows=57925 loops=1)
> >               ->  Seq Scan on ticketing_codes_played
> >                   (cost=0.00..863.25 rows=57925 width=8)
> >                   (actual time=0.042..473.881 rows=57925 loops=1)
> >               ->  Index Scan using ticketing_codes_pk on ticketing_codes
> >                   (cost=0.00..3.01 rows=1 width=18)
> >                   (actual time=9.102..9.108 rows=1 loops=57925)
> >                     Index Cond: (ticketing_codes.code_id = "outer".code_id)
> > Total runtime: 542000.093 ms
> > (27 rows)
> >
> > I'll be more than happy to provide any additional information that I may
> > be able to gather. I'd be most happy if someone would scream something
> > like "four joins, smells like a poor design", because the design is poor,
> > but the system is in production, and I have to bear with it.
>
> It looks like the nested loop that joins ticketing_codes_played to
> ticketing_codes is the slow part. I'm curious how many rows are in the
> ticketing_codes table?
>
> Four or five joins does not seem like a lot to me, but it can be slow if
> you are joining big tables with other big tables.

The ticketing_codes table has 11000000 records, and it's expected to grow.
I tried playing with the JOIN order as Tom suggested, but the performance
is the same.

Mario
On Wed, 2006-05-03 at 13:58 -0400, Tom Lane wrote:
> Mario Splivalo <msplival@jagor.srce.hr> writes:
> > I have a quite large query that takes over a minute to run on my laptop.
>
> The EXPLAIN output you provided doesn't seem to agree with the stated
> query. Where'd the "service_id = 1102" condition come from?

I guess I copy-pasted an extra WHERE condition into the EXPLAIN ANALYZE
query. This is the correct one, without it:

Hash Left Join  (cost=198628.35..202770.61 rows=121 width=264) (actual time=998008.264..999645.322 rows=5706 loops=1)
  Hash Cond: ("outer".message_id = "inner".message_id)
  ->  Merge Left Join  (cost=21943.23..21950.96 rows=121 width=238) (actual time=4375.510..4540.772 rows=5706 loops=1)
        Merge Cond: ("outer".message_id = "inner".message_id)
        ->  Sort  (cost=21847.62..21847.92 rows=121 width=230) (actual time=3304.787..3378.515 rows=5706 loops=1)
              Sort Key: messages.id
              ->  Hash Join  (cost=20250.16..21843.43 rows=121 width=230) (actual time=1617.370..3102.470 rows=5706 loops=1)
                    Hash Cond: ("outer".message_id = "inner".id)
                    ->  Seq Scan on ticketing_messages  (cost=0.00..1212.37 rows=75937 width=14) (actual time=10.554..609.967 rows=75937 loops=1)
                    ->  Hash  (cost=20244.19..20244.19 rows=2391 width=216) (actual time=1572.889..1572.889 rows=5706 loops=1)
                          ->  Nested Loop  (cost=1519.21..20244.19 rows=2391 width=216) (actual time=385.582..1449.207 rows=5706 loops=1)
                                ->  Seq Scan on services  (cost=0.00..4.20 rows=3 width=54) (actual time=20.829..20.859 rows=2 loops=1)
                                      Filter: (type_id = 10)
                                ->  Bitmap Heap Scan on messages  (cost=1519.21..6726.74 rows=1594 width=162) (actual time=182.346..678.800 rows=2853 loops=2)
                                      Recheck Cond: (("outer".id = messages.service_id) AND (messages.receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (messages.receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
                                      ->  BitmapAnd  (cost=1519.21..1519.21 rows=1594 width=0) (actual time=164.311..164.311 rows=0 loops=2)
                                            ->  Bitmap Index Scan on idx_service_id  (cost=0.00..84.10 rows=14599 width=0) (actual time=66.809..66.809 rows=37968 loops=2)
                                                  Index Cond: ("outer".id = messages.service_id)
                                            ->  Bitmap Index Scan on idx_messages_receiving_time  (cost=0.00..1434.87 rows=164144 width=0) (actual time=192.633..192.633 rows=184741 loops=1)
                                                  Index Cond: ((receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
        ->  Sort  (cost=95.62..99.17 rows=1421 width=8) (actual time=1070.678..1072.999 rows=482 loops=1)
              Sort Key: ticketing_winners.message_id
              ->  Seq Scan on ticketing_winners  (cost=0.00..21.21 rows=1421 width=8) (actual time=424.836..1061.834 rows=1421 loops=1)
  ->  Hash  (cost=176144.30..176144.30 rows=57925 width=26) (actual time=993592.980..993592.980 rows=57925 loops=1)
        ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26) (actual time=1074.984..992536.243 rows=57925 loops=1)
              ->  Seq Scan on ticketing_codes_played  (cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993 rows=57925 loops=1)
              ->  Index Scan using ticketing_codes_pk on ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual time=17.044..17.052 rows=1 loops=57925)
                    Index Cond: (ticketing_codes.code_id = "outer".code_id)
Total runtime: 999778.981 ms

> In general, I'd suggest playing around with the join order. Existing
> releases of PG tend to throw up their hands when faced with a mixture of
> outer joins and regular joins, and just join the tables in the order
> listed. 8.2 will be smarter about this, but for now you have to do it
> by hand ...

No luck for me there. But I found out that if I first join ticketing_codes
and ticketing_codes_played, put the result into a temporary table, and then
join that temporary table with the rest of the query (the SELECT that is in
parentheses is transferred to a temporary table), the query is almost twice
as fast. As mentioned before, ticketing_codes has 11000000 records.

Mario

P.S.
Is it just me, or is posting to pgsql-performance lagged quite a bit?
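[Editor's note: the temporary-table workaround Mario describes might look roughly like this (a sketch; the temp table name and the ANALYZE step are assumptions, table and column names are taken from the query):]

-- Materialize the codes join once, then join the result to the rest.
CREATE TEMPORARY TABLE codes_tmp AS
SELECT ticketing_codes_played.message_id, ticketing_codes.*
FROM ticketing_codes_played
JOIN ticketing_codes
  ON ticketing_codes.code_id = ticketing_codes_played.code_id;

ANALYZE codes_tmp;  -- give the planner statistics on the temp table

SELECT *
FROM messages
JOIN services ON services.id = messages.service_id
JOIN ticketing_messages ON messages.id = ticketing_messages.message_id
LEFT JOIN ticketing_winners
       ON ticketing_winners.message_id = ticketing_messages.message_id
LEFT JOIN codes_tmp ON codes_tmp.message_id = ticketing_messages.message_id
WHERE services.type_id = 10
  AND messages.receiving_time BETWEEN '2006-02-12' AND '2006-03-18 23:00:00';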
On Thu, May 04, 2006 at 04:45:57PM +0200, Mario Splivalo wrote:

Well, here's the problem...

>         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
>             (actual time=1074.984..992536.243 rows=57925 loops=1)
>               ->  Seq Scan on ticketing_codes_played
>                   (cost=0.00..863.25 rows=57925 width=8)
>                   (actual time=74.479..2047.993 rows=57925 loops=1)
>               ->  Index Scan using ticketing_codes_pk on ticketing_codes
>                   (cost=0.00..3.01 rows=1 width=18)
>                   (actual time=17.044..17.052 rows=1 loops=57925)
>                     Index Cond: (ticketing_codes.code_id = "outer".code_id)

Anyone have any idea why on earth it's doing that instead of a hash or
merge join?

In any case, try swapping the order of ticketing_codes_played and
ticketing_codes. Actually, that'd probably make it worse.

Try SET enable_nestloop = off;
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
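[Editor's note: if disabling nested loops globally is too blunt, the setting can be scoped to a single transaction with SET LOCAL, which reverts automatically at commit or rollback (a sketch):]

BEGIN;
SET LOCAL enable_nestloop = off;  -- applies only inside this transaction
-- run the problem query here
COMMIT;                           -- enable_nestloop is restored automatically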
On Wed, 2006-05-10 at 17:10 -0500, Jim C. Nasby wrote:
> On Thu, May 04, 2006 at 04:45:57PM +0200, Mario Splivalo wrote:
>
> Well, here's the problem...
>
> >         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> >             (actual time=1074.984..992536.243 rows=57925 loops=1)
> >               ->  Seq Scan on ticketing_codes_played
> >                   (cost=0.00..863.25 rows=57925 width=8)
> >                   (actual time=74.479..2047.993 rows=57925 loops=1)
> >               ->  Index Scan using ticketing_codes_pk on ticketing_codes
> >                   (cost=0.00..3.01 rows=1 width=18)
> >                   (actual time=17.044..17.052 rows=1 loops=57925)
> >                     Index Cond: (ticketing_codes.code_id = "outer".code_id)
>
> Anyone have any idea why on earth it's doing that instead of a hash or
> merge join?
>
> In any case, try swapping the order of ticketing_codes_played and
> ticketing_codes. Actually, that'd probably make it worse.

I tried that, no luck. The best performance I've achieved is with the
temporary table. And...

> Try SET enable_nestloop = off;

This helps also. I don't get sequential scans any more. I'd like a tip
on how to set 'enable_nestloop = off' through JDBC?

Mario
--
"I can do it quick, I can do it cheap, I can do it well. Pick any two."

Mario Splivalo
msplival@jagor.srce.hr
Hi, Mario,

Mario Splivalo wrote:
> This helps also. I don't get sequential scans any more. I'd like a tip
> on how to set 'enable_nestloop = off' through JDBC?

statement.execute("SET enable_nestloop TO off");

should do.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org