Re: PGSQL 7.4 -> 8.1 migration & performance problem - Mailing list pgsql-general
From | Peter Kovacs |
---|---|
Subject | Re: PGSQL 7.4 -> 8.1 migration & performance problem |
Date | |
Msg-id | 447E98F0.50105@chemaxon.hu Whole thread Raw |
In response to | Re: PGSQL 7.4 -> 8.1 migration & performance problem ("Philippe Lang" <philippe.lang@attiksystem.ch>) |
List | pgsql-general |
About three years ago I did a test to compare the performance of an Apache-PHP-PostgreSQL web application on Linux vs. FreeBSD. I used the same machine with the then current versions of the said software. The results were better by 30-35% on FreeBSD. Since then I have been having the sneaking feeling that FreeBSD generally performs better than Linux. (Actually, my personal experience had always been that FreeBSD was much more responsive on very slow machines (Intel 486) when used interactively. But this could have been a result of the rougher scheduler in earlier Linuxes and despite better responsiveness, overall performance could still have been better with Linux.) Peter Philippe Lang wrote: > Hi Tom, hi Ludwig, > > Thanks for your support. Yes, this query has grown very big with time, and I was always asked to add exceptions in it,so the result is quite frightening! > > > >>>> TOM: If you try setting enable_bitmapscan off, >>>> you'll probably find 8.1 beating 7.4 handily for this query. >>>> > > Correct. I had 239 seconds on the old 7.4 server, and I get 189 seconds with "enable_bitmapscan = off". > > > >>>> LUDWIG: What about adding an index to the field >>>> etapes_lignes_commandes(code_etape) >>>> > > I have this index already. > > > >>>> LUDWIG: What about increasing the settings of work_mem? >>>> > > default work_mem = 1024 ==> 511 seconds > work_mem = 2048 ==> 531 seconds > > > >>>> TOM: Am I right in guessing that your database is small >>>> enough to fit into RAM on the new server? >>>> > > select pg_database_size('groupefpdb'); > > returns "360428168" > > That's 360 MB. It fits in RAM, correct! > > >>>> TOM: If so, it would be reasonable to reduce random_page_cost, >>>> perhaps all the way to 1.0, and this would probably improve >>>> the quality of the planner's choices for you. >>>> > > With that change I get results in 193 seconds. > > > >>>> TOM: What might work better is to get rid of the indexes >>>> w_code_type_workflow and lw_ordre --- do you have any >>>> queries where those are actually useful? >>>> > > Yes, I think I do, but let's try: > > Drop both indexes > VACUUM FREEZE ANALYZE > ANALYZE > > I get the results in 199 seconds. > > > >>>> TOM: Another thing you should look at is increasing the >>>> cpu-cost parameters. The numbers in your EXPLAIN ANALYZE >>>> results suggest that on your new machine the cost of >>>> processing an index tuple is about 1/50th of the cost of >>>> touching an index page; that is, you ought to have >>>> cpu_index_tuple_cost plus cpu_operator_cost around 0.02. >>>> I'd try setting each of them to 0.01 and increasing >>>> cpu_tuple_cost a little bit, maybe to 0.02. >>>> > > cpu_index_tuple_cost = 0.01 > cpu_operator_cost = 0.01 > cpu_tuple_cost = 0.02 > > With these change, plus random_page_cost = 1, I get results in 195 seconds. > > (Yes, I did restart the server!) > > The new EXPLAIN ANALYSE at this point is here: > > Postgresql 8.1.4 with tuning: http://www.attiksystem.ch/explain_analyze_81_2.txt > > The old EXPLAIN ANALYZE are still here: > > Postgresql 7.4.5: http://www.attiksystem.ch/explain_analyze_74.txt > Postgresql 8.1.4 without tuning: http://www.attiksystem.ch/explain_analyze_81.txt > > > > > > > Is there maybe something I could tune further on the kernel side? I get only 20 % improvements with the new server withLinux, compared to the workstation with freebsd... Maybe my query is so CPU-bound that the most important thing is theCPU clock speed, and in both cases I have a single 3Ghz CPU to run the query? What do you think? > > > Philippe > > > -----Message d'origine----- > De : Tom Lane [mailto:tgl@sss.pgh.pa.us] > Envoyé : mercredi, 31. mai 2006 18:21 > À : Philippe Lang > Cc : pgsql-general@postgresql.org > Objet : Re: [GENERAL] PGSQL 7.4 -> 8.1 migration & performance problem > > "Philippe Lang" <philippe.lang@attiksystem.ch> writes: > >> Here are both EXPLAIN ANALYSE results, plus the query itself: >> Postgresql 7.4.5: http://www.attiksystem.ch/explain_analyze_74.txt >> Postgresql 8.1.4: http://www.attiksystem.ch/explain_analyze_81.txt >> Query is here: http://www.attiksystem.ch/big_query.txt >> > > My goodness, that is one big ugly query :-( > > Trawling through the EXPLAIN ANALYZE results, it seems that the lossage comes from the repeated occurrences of this pattern: > > SELECT travaux_en_cours_mvu FROM lignes_workflows AS lw > INNER JOIN workflows AS w > ON lw.id_workflow = w.id > WHERE w.code_article = lignes_commandes.code_article > AND w.code_type_workflow = commandes.code_type_workflow > AND SUBSTRING(lw.code_etape FROM 1 FOR 3) = SUBSTRING(etapes_lignes_commandes.code_etape FROM 1 FOR 3) > AND lw.ordre = etapes_lignes_commandes.ordre > > 7.4 is doing this as > > -> Nested Loop (cost=0.00..37.28 rows=1 width=8) (actual time=0.056..0.087 rows=1 loops=13653) > -> Index Scan using w_code_article on workflows w (cost=0.00..15.76 rows=1 width=4) (actual time=0.016..0.024rows=1 loops=13653) > Index Cond: (code_article = $1) > Filter: (code_type_workflow = $2) > -> Index Scan using lw_id_workflow on lignes_workflows lw (cost=0.00..21.51 rows=1 width=12) (actual time=0.023..0.036rows=1 loops=13651) > Index Cond: (lw.id_workflow = "outer".id) > Filter: (("substring"((code_etape)::text, 1, 3) = "substring"(($3)::text, 1, 3)) AND (ordre = $4)) > > 8.1 is doing > > -> Nested Loop (cost=18.93..26.84 rows=1 width=8) (actual time=0.431..0.434 rows=1 loops=13630) > -> Bitmap Heap Scan on workflows w (cost=6.63..10.51 rows=1 width=4) (actual time=0.107..0.107 rows=1 loops=13630) > Recheck Cond: ((code_article = $1) AND (code_type_workflow = $2)) > -> BitmapAnd (cost=6.63..6.63 rows=1 width=0) (actual time=0.104..0.104 rows=0 loops=13630) > -> Bitmap Index Scan on w_code_article (cost=0.00..2.02 rows=5 width=0) (actual time=0.017..0.017rows=5 loops=13630) > Index Cond: (code_article = $1) > -> Bitmap Index Scan on w_code_type_workflow (cost=0.00..4.36 rows=389 width=0) (actual time=0.083..0.083rows=390 loops=13628) > Index Cond: (code_type_workflow = $2) > -> Bitmap Heap Scan on lignes_workflows lw (cost=12.30..16.31 rows=1 width=12) (actual time=0.312..0.313 rows=1loops=13628) > Recheck Cond: ((lw.id_workflow = "outer".id) AND (lw.ordre = $4)) > Filter: ("substring"((code_etape)::text, 1, 3) = "substring"(($3)::text, 1, 3)) > -> BitmapAnd (cost=12.30..12.30 rows=1 width=0) (actual time=0.306..0.306 rows=0 loops=13628) > -> Bitmap Index Scan on lw_id_workflow (cost=0.00..2.02 rows=7 width=0) (actual time=0.009..0.009rows=7 loops=13628) > Index Cond: (lw.id_workflow = "outer".id) > -> Bitmap Index Scan on lw_ordre (cost=0.00..10.03 rows=1437 width=0) (actual time=0.293..0.293 rows=1714loops=13628) > Index Cond: (ordre = $4) > > In the parts of the plan that do not depend on workflows/lignes_workflows joins, 8.1 is consistently beating 7.4. > > So basically 8.1 is being too optimistic about the value of ANDing multiple indexes. If you try setting enable_bitmapscanoff, you'll probably find 8.1 beating 7.4 handily for this query. That's a really blunt-instrument solutionof course, and I wouldn't recommend it for production because it'll probably kill performance elsewhere. What mightwork better is to get rid of the indexes w_code_type_workflow and lw_ordre --- do you have any queries where those areactually useful? > > Meanwhile, I think I'm going to have to take another look at the bitmap cost estimates ... it shouldn't be bothering toAND a 7-row result with a 1437-row result, even if that does save six trips to the heap. > > regards, tom lane > >
pgsql-general by date: