Thread: Performance issues migrating from 7.4.3 to 8.2.6
Hi,

I am investigating migrating from Postgres 7.4.3 to Postgres 8.2.6. Although the performance in 8.2.6 seems to be generally better, there are some instances where it is markedly worse, by a factor of up to 10. The problem seems to occur when I join more than 4 tables. Has anyone else experienced anything similar, or got any suggestions as to what I might do? I am running on an Intel box with two hyper-threaded cores and 4GB of RAM. I have tweaked the postgresql.conf files with the values below; the query and EXPLAIN ANALYZE output follow. In this case the query takes 6.037 ms to run on 8.2.6 and 2.332 ms to run on 7.4.3.

Thanks in advance for any help.

Regards
Matthew

8.2.6:
shared_buffers = 500MB
work_mem = 10MB
maintenance_work_mem = 100MB
effective_cache_size = 2048MB
default_statistics_target = 1000

7.4.3:
shared_buffers = 51200
sort_mem = 10240
vacuum_mem = 81920
effective_cache_size = 102400

explain analyze
SELECT *
  FROM market mrkt
  JOIN market_group_relation mgr USING (market_id)
  JOIN market_group mg USING (market_group_id)
  JOIN market_group_price_relation mgpr USING (market_group_id)
  JOIN accommodation_price_panel app
    ON app.accommodation_price_panel_id = mgpr.price_panel_id
  JOIN daily_rates dr USING (accommodation_price_panel_id)
 WHERE mrkt.live <> 'X'::bpchar
   AND mg.live <> 'X'::bpchar
   AND app.live <> 'X'::bpchar
   AND dr.min_group_size = 0
   AND MARKET_ID = 10039
   AND CODE = 'LONHRL'
   AND CODE_TYPE = 'IS'
   AND ROOM_TYPE = 'Zk'
   AND BOARD_TYPE = 'BB'
   AND CONTRACT_ID = '16077'
   AND ( START_DATE BETWEEN '2008-05-22' AND '2008-05-31'
         OR '2008-05-22' BETWEEN START_DATE AND END_DATE )

Plan on 7.4.3:

"Nested Loop (cost=37.27..48.34 rows=1 width=458) (actual time=1.474..2.138 rows=14 loops=1)"
" -> Nested Loop (cost=37.27..42.34 rows=1 width=282) (actual time=1.428..1.640 rows=2 loops=1)"
" -> Hash Join (cost=37.27..40.68 rows=1 width=199) (actual time=1.367..1.516 rows=2 loops=1)"
" Hash Cond: ("outer".market_group_id = "inner".market_group_id)"
" -> Seq Scan on market_group mg (cost=0.00..3.01 rows=78 width=81) (actual time=0.004..0.105 rows=80 loops=1)"
" Filter: (live <> 'X'::bpchar)"
" -> Hash (cost=37.27..37.27 rows=1 width=126) (actual time=1.325..1.325 rows=0 loops=1)"
" -> Hash Join (cost=12.66..37.27 rows=1 width=126) (actual time=1.051..1.321 rows=2 loops=1)"
" Hash Cond: ("outer".market_group_id = "inner".market_group_id)"
" -> Seq Scan on market_group_relation mgr (cost=0.00..24.46 rows=27 width=31) (actual time=0.165..0.641 rows=30 loops=1)"
" Filter: (10039 = market_id)"
" -> Hash (cost=12.66..12.66 rows=2 width=95) (actual time=0.641..0.641 rows=0 loops=1)"
" -> Nested Loop (cost=0.00..12.66 rows=2 width=95) (actual time=0.056..0.593 rows=27 loops=1)"
" -> Index Scan using accommodation_price_panel_idx1 on accommodation_price_panel app (cost=0.00..6.02 rows=1 width=60) (actual time=0.037..0.200 rows=27 loops=1)"
" Index Cond: ((contract_id = 16077) AND ((code)::text = 'LONHRL'::text) AND (code_type = 'IS'::bpchar))"
" Filter: (live <> 'X'::bpchar)"
" -> Index Scan using market_group_price_relation_pkey on market_group_price_relation mgpr (cost=0.00..6.62 rows=1 width=35) (actual time=0.007..0.008 rows=1 loops=27)"
" Index Cond: ("outer".accommodation_price_panel_id = mgpr.price_panel_id)"
" -> Seq Scan on market mrkt (cost=0.00..1.65 rows=1 width=87) (actual time=0.045..0.046 rows=1 loops=2)"
" Filter: ((live <> 'X'::bpchar) AND (market_id = 10039))"
" -> Index Scan using daily_rates_pkey on daily_rates dr (cost=0.00..5.99 rows=1 width=180) (actual time=0.022..0.113 rows=7 loops=2)"
" Index Cond: ((dr.accommodation_price_panel_id = "outer".price_panel_id) AND (dr.room_type = 'Zk'::bpchar))"
" Filter: ((min_group_size = 0) AND (board_type = 'BB'::bpchar) AND (('2008-05-22'::date >= start_date) OR (start_date >= '2008-05-22'::date)) AND (('2008-05-22'::date <= end_date) OR (start_date >= '2008-05-22'::date)) AND (('2008-05-22'::date >= st (..)"
"Total runtime: 2.332 ms"

Plan on 8.2.6:

"Nested Loop (cost=0.00..30.39 rows=1 width=458) (actual time=0.123..5.841 rows=14 loops=1)"
" -> Nested Loop (cost=0.00..29.70 rows=1 width=439) (actual time=0.099..4.590 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..29.40 rows=1 width=358) (actual time=0.091..3.243 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..21.07 rows=1 width=327) (actual time=0.081..1.571 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..10.40 rows=1 width=147) (actual time=0.053..0.134 rows=27 loops=1)"
" -> Seq Scan on market mrkt (cost=0.00..2.08 rows=1 width=87) (actual time=0.022..0.023 rows=1 loops=1)"
" Filter: ((live <> 'X'::bpchar) AND (market_id = 10039))"
" -> Index Scan using accommodation_price_panel_idx1 on accommodation_price_panel app (cost=0.00..8.31 rows=1 width=60) (actual time=0.027..0.071 rows=27 loops=1)"
" Index Cond: ((contract_id = 16077) AND ((code)::text = 'LONHRL'::text) AND (code_type = 'IS'::bpchar))"
" Filter: (live <> 'X'::bpchar)"
" -> Index Scan using daily_rates_pkey on daily_rates dr (cost=0.00..10.64 rows=1 width=180) (actual time=0.019..0.038 rows=7 loops=27)"
" Index Cond: ((app.accommodation_price_panel_id = dr.accommodation_price_panel_id) AND (dr.room_type = 'Zk'::bpchar) AND (dr.board_type = 'BB'::bpchar) AND (dr.min_group_size = 0))"
" Filter: (((start_date >= '2008-05-22'::date) AND (start_date <= '2008-05-31'::date)) OR (('2008-05-22'::date >= start_date) AND ('2008-05-22'::date <= end_date)))"
" -> Index Scan using market_group_price_relation_pkey on market_group_price_relation mgpr (cost=0.00..8.31 rows=1 width=35) (actual time=0.005..0.006 rows=1 loops=189)"
" Index Cond: (app.accommodation_price_panel_id = mgpr.price_panel_id)"
" -> Index Scan using market_group_pkey on market_group mg (cost=0.00..0.28 rows=1 width=81) (actual time=0.003..0.004 rows=1 loops=189)"
" Index Cond: (mgpr.market_group_id = mg.market_group_id)"
" Filter: (live <> 'X'::bpchar)"
" -> Index Scan using market_group_relation_idx2 on market_group_relation mgr (cost=0.00..0.67 rows=1 width=31) (actual time=0.005..0.005 rows=0 loops=189)"
" Index Cond: (mgr.market_group_id = mg.market_group_id)"
" Filter: (10039 = market_id)"
"Total runtime: 6.037 ms"
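For anyone comparing the two configurations side by side: 7.4 expresses shared_buffers and effective_cache_size in 8 kB buffer pages and the *_mem settings in kB, while 8.2 accepts memory units directly. Assuming the default 8 kB block size, the 7.4.3 values translate roughly as follows (a back-of-envelope conversion, not something posted in the thread):

-- 7.4.3 setting                     approx. equivalent         8.2.6 setting in use
-- shared_buffers       = 51200      51200 * 8 kB  = 400 MB     shared_buffers       = 500MB
-- sort_mem             = 10240      10240 kB      = 10 MB      work_mem             = 10MB
-- vacuum_mem           = 81920      81920 kB      = 80 MB      maintenance_work_mem = 100MB
-- effective_cache_size = 102400     102400 * 8 kB = 800 MB     effective_cache_size = 2048MB

So the memory settings are broadly comparable, except that effective_cache_size is much larger on 8.2.6, which is one of the points raised in the replies below.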
"Matthew Lunnon" <mlunnon@rwa-net.co.uk> writes: > In this case the query takes 6.037 ms to run on 862 and 2.332 to run on 743. The difference between 2ms and 6ms is pretty negligable. A single context switch or disk cache miss could throw the results off by that margin in either direction. But what plan does 7.4.3 come up with if you set enable_hashjoins = off? I'm curious whether it comes up with the same nested loops plan as 8.2 and what cost it says it has. I think you need to find queries which take longer to have any reliable performance comparisons. Note that the configuration parameters here aren't the same at all, it's possible the change of effective_cache_size from 800k to 2GB is what's changing the cost estimation. I seem to recall a change in the arithmetic for calculatin Nested loop costs too which made it more aggressive in estimating cache effectiveness. Incidentally, default_statistics_target=1000 is awfully aggressive. I found in the past that that caused the statistics table to become much larger and much slower to access. It may have caused some statistics to be toasted or it may have just been the sheer volume of data present. It will also make your ANALYZEs take a lot longer. I would suggest trying 100 first and incrementally raising it rather than jumping straight to 1000. And preferably only on the columns which really matter. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
On Jan 28, 2008 5:41 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
> Hi
>
> I am investigating migrating from postgres 7.4.3 to postgres 8.2.6 but
> although the performance in postgres 8.2.6 seems to be generally better
> there are some instances where it seems to be markedly worse, a factor
> of up to 10. The problem seems to occur when I join to more than 4
> tables. Has anyone else experienced anything similar or got any
> suggestions as to what I might do? I am running on an intel box with two
> hyper threaded cores and 4GB of RAM. I have tweaked the postgresql.conf
> files with these values and the query and explain output are below. In
> this case the query takes 6.037 ms to run on 8.2.6 and 2.332 to run on 7.4.3.

It looks like the data are not the same in these two environments.

> 8.2.6
> shared_buffers = 500MB
> work_mem = 10MB
> maintenance_work_mem = 100MB
> effective_cache_size = 2048MB
> default_statistics_target = 1000

That's very high for the default. Planning times will be increased noticeably.

Plan for 7.4:

> "Nested Loop (cost=37.27..48.34 rows=1 width=458) (actual time=1.474..2.138 rows=14 loops=1)"
> " -> Nested Loop (cost=37.27..42.34 rows=1 width=282) (actual time=1.428..1.640 rows=2 loops=1)"

This is processing 2 rows...

> "Total runtime: 2.332 ms"

While this is processing 189 rows:

> "Nested Loop (cost=0.00..30.39 rows=1 width=458) (actual time=0.123..5.841 rows=14 loops=1)"
> " -> Nested Loop (cost=0.00..29.70 rows=1 width=439) (actual time=0.099..4.590 rows=189 loops=1)"

Hardly seems a fair comparison.
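One quick way to check whether the two servers really hold the same data along this join path is to run the same counts on both sides; a sketch using the tables and constants from the query (these statements were not actually posted in the thread):

-- Rows feeding the 189-vs-2 discrepancy: how many relation rows exist
-- for this market, and how many price panels match the other filters?
SELECT count(*) FROM market_group_relation WHERE market_id = 10039;

SELECT count(*)
  FROM accommodation_price_panel
 WHERE contract_id = 16077
   AND code = 'LONHRL'
   AND code_type = 'IS'
   AND live <> 'X';

If the counts match on 7.4.3 and 8.2.6, the difference is down to planning, not data.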
Hi Scott,
Thanks for your time
Regards
Matthew
Scott Marlowe wrote:
> On Jan 28, 2008 5:41 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
>> Hi
>> I am investigating migrating from postgres 7.4.3 to postgres 8.2.6 but
>> although the performance in postgres 8.2.6 seems to be generally better
>> there are some instances where it seems to be markedly worse, a factor
>> of up to 10. The problem seems to occur when I join to more than 4
>> tables. Has anyone else experienced anything similar or got any
>> suggestions as to what I might do? I am running on an intel box with two
>> hyper threaded cores and 4GB of RAM. I have tweaked the postgresql.conf
>> files with these values and the query and explain output are below. In
>> this case the query takes 6.037 ms to run on 8.2.6 and 2.332 to run on 7.4.3.
>
> It looks like the data are not the same in these two environments.
>
>> 8.2.6
>> shared_buffers = 500MB
>> work_mem = 10MB
>> maintenance_work_mem = 100MB
>> effective_cache_size = 2048MB
>> default_statistics_target = 1000
>
> That's very high for the default. Planning times will be increased noticeably.

I had originally left the default_statistics_target at its default and then increased it to 100, but this did not seem to make much difference. I will reduce this down to something more normal again.

> Plan for 7.4:
>> "Nested Loop (cost=37.27..48.34 rows=1 width=458) (actual time=1.474..2.138 rows=14 loops=1)"
>> " -> Nested Loop (cost=37.27..42.34 rows=1 width=282) (actual time=1.428..1.640 rows=2 loops=1)"
>
> This is processing 2 rows...
>
>> "Total runtime: 2.332 ms"
>
> While this is processing 189 rows:
>
>> "Nested Loop (cost=0.00..30.39 rows=1 width=458) (actual time=0.123..5.841 rows=14 loops=1)"
>> " -> Nested Loop (cost=0.00..29.70 rows=1 width=439) (actual time=0.099..4.590 rows=189 loops=1)"
>
> Hardly seems a fair comparison.

The queries were on exactly the same data. My interpretation of what is going on here is that 8.2.6 seems to be leaving the filtering of market_id to the very last point, which is why it ends up with 189 rows at this point instead of the 2 that 7.4.3 has. 7.4.3 seems to do that filtering much earlier and so reduces the number of rows at a much earlier point in the execution of the query. I guess that this is something to do with the planner, which is why I tried increasing the default_statistics_target.
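One thing worth checking here is what the 8.2.6 statistics actually say about market_id on market_group_relation, since the planner estimates 1 row where 30 come back. A sketch of the kind of query that shows this (pg_stats is the standard statistics view; the table and column names are taken from the plans in this thread):

SELECT tablename, attname, n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'market_group_relation'
   AND attname  = 'market_id';

If market_id = 10039 does not appear in most_common_vals, the planner falls back to an n_distinct-based estimate, which can easily be off by the kind of margin seen in these plans.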
Whatever email agent you're using seems to be quoting in a way that doesn't get along well with gmail, so I'm just gonna chop most of it rather than have it quoted confusingly... Heck, I woulda chopped a lot anyway to keep it small. :)

On Jan 28, 2008 9:27 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
> Scott Marlowe wrote:
> On Jan 28, 2008 5:41 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
> default_statistics_target = 1000
>
> That's very high for the default. Planning times will be increased
> noticeably
>
> I had originally left the default_statistics_target at its default and then
> increased it to 100, but this did not seem to make much difference. I will
> reduce this down to something more normal again.

You do know that if you create a column when the default is 10, then increase the default, it won't change the column's stats target, right? So, assuming the table was first created, then you changed the default, you'll now need to do:

alter table xyz alter column abc set statistics 100;
analyze xyz;

for it to make any difference.

> The queries were on exactly the same data. My interpretation of what is
> going on here is that 8.2.6 seems to be leaving the filtering of market_id
> to the very last point, which is why it ends up with 189 rows at this point
> instead of the 2 that 7.4.3 has. 7.4.3 seems to do that filtering much earlier
> and so reduce the number of rows at a much earlier point in the execution of
> the query. I guess that this is something to do with the planner which is
> why I tried increasing the default_statistics_target.

Ahh, I'm guessing it's something that your 7.4 database CAN use an index on and your 8.2 database can't use an index on. Like text in a non-C locale. Or something... Table def?
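A sketch of how to see which per-column targets are actually in effect, using one of the tables from this thread purely as an illustration (attstattarget is -1 for columns still using the cluster default):

-- Show the per-column statistics targets for one of the tables involved.
SELECT attname, attstattarget
  FROM pg_attribute
 WHERE attrelid = 'market_group_relation'::regclass
   AND attnum > 0
   AND NOT attisdropped;

-- Raise the target only where it matters, then refresh the statistics,
-- along the lines Scott suggests above:
ALTER TABLE market_group_relation ALTER COLUMN market_id SET STATISTICS 100;
ANALYZE market_group_relation;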
Hi Gregory/All,
Thanks for your time.
Yes, the difference is pretty small but it does seem to be consistent. The problem I have is that this is just part of the query; I have tried to break things down so that I can see where the time is being spent. I set the default_statistics_target to 1000 after going via 100, but it seemed to make no difference.
I have a confession to make though: this is not like for like. I did in fact have to add a couple of indexes as the performance was so bad with 8.2.6. Very sorry for that, it doesn't help. The actual difference is from 2ms to 57ms when those indexes are removed, which is much more significant. Here is the like-for-like comparison with 8.2.6; the indexes were added to the market_group_relation table since it is doing a seq scan at the very end.
"Nested Loop (cost=0.00..54.03 rows=1 width=458) (actual time=0.279..57.457 rows=14 loops=1)"
" Join Filter: (mgr.market_group_id = mgpr.market_group_id)"
" -> Nested Loop (cost=0.00..29.19 rows=1 width=439) (actual time=0.102..4.867 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..28.91 rows=1 width=358) (actual time=0.095..3.441 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..20.60 rows=1 width=327) (actual time=0.082..1.639 rows=189 loops=1)"
" -> Nested Loop (cost=0.00..9.95 rows=1 width=147) (actual time=0.054..0.138 rows=27 loops=1)"
" -> Seq Scan on market mrkt (cost=0.00..1.65 rows=1 width=87) (actual time=0.020..0.020 rows=1 loops=1)"
" Filter: ((live <> 'X'::bpchar) AND (market_id = 10039))"
" -> Index Scan using accommodation_price_panel_idx1 on accommodation_price_panel app (cost=0.00..8.30 rows=1 width=60) (actual time=0.029..0.079 rows=27 loops=1)"
" Index Cond: ((contract_id = 16077) AND ((code)::text = 'LONHRL'::text) AND (code_type = 'IS'::bpchar))"
" Filter: (live <> 'X'::bpchar)"
" -> Index Scan using daily_rates_pkey on daily_rates dr (cost=0.00..10.63 rows=1 width=180) (actual time=0.021..0.041 rows=7 loops=27)"
" Index Cond: ((app.accommodation_price_panel_id = dr.accommodation_price_panel_id) AND (dr.room_type = 'Zk'::bpchar) AND (dr.board_type = 'BB'::bpchar) AND (dr.min_group_size = 0))"
" Filter: (((start_date >= '2008-05-22'::date) AND (start_date <= '2008-05-31'::date)) OR (('2008-05-22'::date >= start_date) AND ('2008-05-22'::date <= end_date)))"
" -> Index Scan using market_group_price_relation_pkey on market_group_price_relation mgpr (cost=0.00..8.30 rows=1 width=35) (actual time=0.005..0.006 rows=1 loops=189)"
" Index Cond: (app.accommodation_price_panel_id = mgpr.price_panel_id)"
" -> Index Scan using market_group_pkey on market_group mg (cost=0.00..0.27 rows=1 width=81) (actual time=0.003..0.004 rows=1 loops=189)"
" Index Cond: (mgpr.market_group_id = mg.market_group_id)"
" Filter: (live <> 'X'::bpchar)"
" -> Seq Scan on market_group_relation mgr (cost=0.00..24.46 rows=30 width=31) (actual time=0.068..0.259 rows=30 loops=189)"
" Filter: (10039 = market_id)"
"Total runtime: 57.648 ms"
Gregory Stark wrote:
> "Matthew Lunnon" <mlunnon@rwa-net.co.uk> writes:
>> In this case the query takes 6.037 ms to run on 8.2.6 and 2.332 to run on 7.4.3.
>
> The difference between 2ms and 6ms is pretty negligible. A single context
> switch or disk cache miss could throw the results off by that margin in
> either direction.
>
> But what plan does 7.4.3 come up with if you set enable_hashjoin = off? I'm
> curious whether it comes up with the same nested-loop plan as 8.2 and what
> cost it says it has.

I'll investigate and let you know.

> I think you need to find queries which take longer to have any reliable
> performance comparison. Note that the configuration parameters here aren't
> the same at all; it's possible the change of effective_cache_size from
> 800MB to 2GB is what's changing the cost estimation. I seem to recall a
> change in the arithmetic for calculating nested-loop costs too, which made
> it more aggressive in estimating cache effectiveness.
>
> Incidentally, default_statistics_target = 1000 is awfully aggressive. I
> found in the past that it caused the statistics table to become much larger
> and much slower to access. It may have caused some statistics to be toasted,
> or it may have just been the sheer volume of data present. It will also make
> your ANALYZEs take a lot longer. I would suggest trying 100 first and
> incrementally raising it rather than jumping straight to 1000, and
> preferably only on the columns which really matter.
--
Matthew Lunnon
Technical Consultant
RWA Ltd.
mlunnon@rwa-net.co.uk
Tel: +44 (0)29 2081 5056
www.rwa-net.co.uk
--
Scott Marlowe wrote:
> Whatever email agent you're using seems to be quoting in a way that doesn't
> get along well with gmail, so I'm just gonna chop most of it rather than
> have it quoted confusingly... Heck, I woulda chopped a lot anyway to keep
> it small. :)

Thanks again for your time. I'm using Thunderbird; maybe I need to upgrade.

> On Jan 28, 2008 9:27 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
>> Scott Marlowe wrote:
>> On Jan 28, 2008 5:41 AM, Matthew Lunnon <mlunnon@rwa-net.co.uk> wrote:
>> default_statistics_target = 1000
>> That's very high for the default. Planning times will be increased
>> noticeably
>> I had originally left the default_statistics_target at its default and then
>> increased it to 100, but this did not seem to make much difference. I will
>> reduce this down to something more normal again.
>
> You do know that if you create a column when the default is 10, then
> increase the default, it won't change the column's stats target, right?
> So, assuming the table was first created, then you changed the default,
> you'll now need to do:
>
> alter table xyz alter column abc set statistics 100;
> analyze xyz;
>
> for it to make any difference.

Thanks, I haven't looked into this yet; I'll look. When I changed the default_stats_target it did take a very long time to do its ANALYZE, so I assumed it was doing something.

>> The queries were on exactly the same data. My interpretation of what is
>> going on here is that 8.2.6 seems to be leaving the filtering of market_id
>> to the very last point, which is why it ends up with 189 rows at this point
>> instead of the 2 that 7.4.3 has. 7.4.3 seems to do that filtering much earlier
>> and so reduce the number of rows at a much earlier point in the execution of
>> the query. I guess that this is something to do with the planner which is
>> why I tried increasing the default_statistics_target.
>
> Ahh, I'm guessing it's something that your 7.4 database CAN use an index
> on and your 8.2 database can't use an index on. Like text in a non-C
> locale. Or something... Table def?

Thanks, I'll take a look at that. Is there any documentation on what 8.2.6 can't use in an index? It didn't seem to complain about any of my indexes when I generated the database.
--
Matthew Lunnon
Technical Consultant
RWA Ltd.
mlunnon@rwa-net.co.uk
Tel: +44 (0)29 2081 5056
www.rwa-net.co.uk
--
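If Scott's non-C-locale theory is worth chasing, a quick way to compare what the two clusters were initialized with is simply to ask each server; a sketch follows (note that locale only affects whether a plain btree can serve pattern matches such as LIKE, not the straight equality comparisons in this query):

-- Run on both the 7.4.3 and 8.2.6 servers and compare:
SHOW lc_collate;
SHOW lc_ctype;

-- If LIKE/prefix searches ever need an index under a non-C locale, an
-- operator-class index works; hypothetical example on a column taken from
-- the thread's query:
CREATE INDEX accommodation_price_panel_code_pattern_idx
    ON accommodation_price_panel (code varchar_pattern_ops);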