Thread: Slow query after upgrade from 8.2 to 8.4
Hello All. We recently upgrade our server from PG8.2 to 8.4. Our current version is: database=> SELECT version(); version -------------------------------------------------------------------------------------------------------------------- PostgreSQL 8.4.8 on amd64-portbld-freebsd8.2, compiled by GCC cc (GCC) 4.2.2 20070831 prerelease [FreeBSD], 64-bit (1 row) I read and setup most of the tuning advices. Since the upgrade we have some very complicated reports that start working too slow. Form tens of seconds to 1 hours. I execute vacuum analyze before start the query. Here I will post explain analyze. If you think it is necessary I will post the exact query: http://explain.depesz.com/s/J0O I think the planner didn't choose the best plan. I will try to I rewrite the query and set join_collapse_limit to 1 and see what will happen. Meanwhile any suggestions are welcome. Best regards and thanks in advance for the help. Kaloyan Iliev
On 12/08/2011 11:29 AM, Kaloyan Iliev Iliev wrote: > I think the planner didn't choose the best plan. I will try to I rewrite > the query and set join_collapse_limit to 1 and see what will happen. > Meanwhile any suggestions are welcome. Try rewriting the query, definitely. But make sure your statistical targets are high enough for an analyze to make a difference. I see way too many nested loops with wrong row estimates. Like these: Nested Loop (cost=0.00..8675.62 rows=2263 width=4) (actual time=0.456..5991.749 rows=68752 loops=167) Join Filter: (dd.debtid = ad.debtid) Nested Loop (cost=0.00..7864.54 rows=1160 width=4) (actual time=0.374..2781.762 rows=34384 loops=167) Index Scan using config_office_idx on config cf (cost=0.00..7762.56 rows=50 width=8) (actual time=0.199..1623.366 rows=2460 loops=167) Index Cond: (office = 6) Filter: (id = (SubPlan 6)) There are several spots where the row estimates are off by one or two orders of magnitude. Instead of doing a sequence scan for such large tables, it's nest looping over an index scan, sometimes millions of times. And then you have these: Index Scan using config_confid_idx on config (cost=0.00..0.66 rows=6 width=12) (actual time=0.023..0.094 rows=10 loops=1655853) Index Cond: (confid = $3) Index Scan using debts_desc_refid_idx on debts_desc dd (cost=0.00..1.66 rows=30 width=8) (actual time=0.061..0.381 rows=14 loops=410867) Index Cond: (dd.refid = cf.confid) Index Scan using acc_debts_debtid_idx on acc_debts ad (cost=0.00..0.39 rows=2 width=8) (actual time=0.034..0.053 rows=2 loops=5742191) Index Cond: (ad.debtid = dd.debtid) Index Scan using acc_debtscl_debtid_idx on acc_debts_cleared ad (cost=0.00..0.27 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=5742183) Index Cond: (ad.debtid = dd.debtid) Having index scans that big embedded in nested loops is going to murder your CPU even if every table involved is cached in memory. I'm not surprised this takes an hour or more to run. Increase the statistics on these tables, and pay special attention to the debtid and refid columns, and then analyze them again. What's your default_statistics_target? -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 312-676-8870 sthomas@peak6.com ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes: > We recently upgrade our server from PG8.2 to 8.4. > ... > Here I will post explain analyze. If you think it is necessary I will > post the exact query: > http://explain.depesz.com/s/J0O Yeah, you need to show the query. It looks like the performance problem is stemming from a lot of subselects, but it's not clear why 8.4 would be handling those worse than 8.2. regards, tom lane
<tt>Hi,<br /> Actually I think the problem is with this sub query:<br /> explain analyze select 1<br /> from acc_clients AC,<br /> acc_debts AD,<br/> debts_desc DD,<br /> configCF<br /> where AC.ino = 1200000 AND<br /><br /> CF.id = (select id<br /> from config<br /> where confid=CF.confid ORDER BY archived_at DESC LIMIT 1) AND<br /> AD.transact_no = AC.transact_no AND<br /> AD.debtid = DD.debtid AND<br /> CF.office = 18 AND<br /> DD.refid= CF.confid LIMIT 1;</tt><br /><br /> Instead of starting from '<tt>AC.ino = 1200000' and limit the rows IT startwith '</tt><tt>CF.office = 18' which returns much more rows:<br /> SO: This is the query plan of the upper query.<br/><br /><a class="moz-txt-link-freetext" href="http://explain.depesz.com/s/ATN">http://explain.depesz.com/s/ATN</a><br/><br /><br /> If I remove the condition </tt><tt>'</tt><tt>CF.office= 18' the planner chose the correct plan and result is fast.</tt><br /><tt>explain analyze select1<br /> from acc_clients AC,<br /> acc_debts AD,<br /> debts_desc DD,<br/> config CF<br /> where AC.ino= 1200000 AND<br /><br /> CF.id = (select id<br /> from config<br /> where confid=CF.confid ORDER BY archived_at DESC LIMIT 1) AND<br /> AD.transact_no = AC.transact_no AND<br /> AD.debtid = DD.debtid AND<br /> DD.refid = CF.confid LIMIT 1;<br /></tt><tt><br /><a class="moz-txt-link-freetext"href="http://explain.depesz.com/s/4zb">http://explain.depesz.com/s/4zb</a></tt><br /><br />I want this plan and this query but with the additional condition<tt> '</tt><tt>CF.office = 18'.<br /> How could I forcethe planner to use this plan and just filter the result.<br /><br /> Best regards,<br /> Kaloyan Iliev<br /></tt><br/><br /> Tom Lane wrote: <blockquote cite="mid:14557.1323401295@sss.pgh.pa.us" type="cite"><pre wrap="">KaloyanIliev Iliev <a class="moz-txt-link-rfc2396E" href="mailto:kaloyan@digsys.bg"><kaloyan@digsys.bg></a>writes: </pre><blockquote type="cite"><pre wrap="">We recentlyupgrade our server from PG8.2 to 8.4. ... Here I will post explain analyze. If you think it is necessary I will post the exact query: <a class="moz-txt-link-freetext" href="http://explain.depesz.com/s/J0O">http://explain.depesz.com/s/J0O</a> </pre></blockquote><prewrap=""> Yeah, you need to show the query. It looks like the performance problem is stemming from a lot of subselects, but it's not clear why 8.4 would be handling those worse than 8.2. regards, tom lane </pre></blockquote>
Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes: > <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> > <html> > <head> > <meta content="text/html;charset=windows-1251" > http-equiv="Content-Type"> > </head> > <body bgcolor="#ffffff" text="#000000"> > <tt>Hi,<br> > Actually I think the problem is with this sub query:<br> > explain analyze select 1<br> > from acc_clients AC,<br> > acc_debts AD,<br> > debts_desc DD,<br> > config CF<br> > where AC.ino = 1200000 AND<br> > <br> > CF.id = (select id<br> > from config<br> > where > confid=CF.confid ORDER BY archived_at DESC LIMIT 1) AND<br> > AD.transact_no = > AC.transact_no AND<br> > AD.debtid = DD.debtid AND<br> > CF.office = 18 AND<br> > DD.refid = CF.confid LIMIT > 1;</tt><br> > <br> > Instead of starting from '<tt>AC.ino = 1200000' and limit the rows IT > start with '</tt><tt>CF.office = 18' which returns much more rows:<br> Please don't post HTML mail. I think the real issue is that you've got an astonishingly expensive approach to keeping obsolete "config" rows around. You should get rid of that "ORDER BY archived_at" sub-select, either by not storing obsolete rows at all (you could move them to a history table instead), or by marking valid rows with a boolean flag. However, it's not apparent to me why you would see any difference between 8.2 and 8.4 on this type of query. I tried a query analogous to this one on both, and got identical plans. I'm guessing that your slowdown is due to not having updated statistics on the new installation, or perhaps failing to duplicate some relevant settings. regards, tom lane
On 10/12/11 04:30, Tom Lane wrote: > However, it's not apparent to me why you would see any difference > between 8.2 and 8.4 on this type of query. I tried a query analogous > to this one on both, and got identical plans. I'm guessing that your > slowdown is due to not having updated statistics on the new > installation, or perhaps failing to duplicate some relevant settings. I notice he has 8.4.*8*... I wonder if he's running into the poor estimation bug for sub-selects/semi joins that was fixed in 8.4.9. Kaloyan, can you try the query in 8.4.9? regards Mark
Hi, Thanks for Replay. Actually I finally find a solution. If I rewrite the query in this way: explain analyze select 1 from acc_clients AC, acc_debts AD, debts_desc DD, config CF where AC.ino = 204627 AND CF.id = (select id from config where confid=CF.confid AND office = 18 ORDER BY archived_at DESC LIMIT 1) AND AD.transact_no = AC.transact_no AND AD.debtid = DD.debtid AND DD.refid = CF.confid LIMIT 1; the plan and execution time really approves. http://explain.depesz.com/s/Nkj And for comparison I will repost the old way the query was written. explain analyze select 1 from acc_clients AC, acc_debts AD, debts_desc DD, config CF where AC.ino = 1200000 AND CF.id = (select id from config where confid=CF.confid ORDER BY archived_at DESC LIMIT 1) AND AD.transact_no = AC.transact_no AND AD.debtid = DD.debtid AND CF.office = 18 AND DD.refid = CF.confid LIMIT 1; This is the query plan of the upper query. http://explain.depesz.com/s/ATN When we have 8.4.9 installed I will try the query and post the result. Best regards, Kaloyan Iliev Mark Kirkwood wrote: > On 10/12/11 04:30, Tom Lane wrote: >> However, it's not apparent to me why you would see any difference >> between 8.2 and 8.4 on this type of query. I tried a query analogous >> to this one on both, and got identical plans. I'm guessing that your >> slowdown is due to not having updated statistics on the new >> installation, or perhaps failing to duplicate some relevant settings. > > I notice he has 8.4.*8*... I wonder if he's running into the poor > estimation bug for sub-selects/semi joins that was fixed in 8.4.9. > > Kaloyan, can you try the query in 8.4.9? > > regards > > Mark >