Re: checkpoint patches - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: checkpoint patches |
Date | |
Msg-id | CA+TgmobuoY6Sc_WQbv=SZYkGC+yWejR7Bdyra3kbwL=sVAbxvA@mail.gmail.com |
In response to | Re: checkpoint patches (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: checkpoint patches
Re: checkpoint patches |
List | pgsql-hackers |
On Thu, Mar 22, 2012 at 8:44 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Thu, Mar 22, 2012 at 3:45 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> > Well, those numbers just aren't that exciting. :/
>>
>> Agreed. There's clearly an effect, but on this test it's not very big.
>
> Ok, perhaps that was because of how you were analyzing it using the 90th
> percentile..?

Well, how do you want to look at it? Here's the data from 80th percentile through 100th percentile - percentile, patched, unpatched, difference - for the same two runs I've been comparing:

percentile   patched   unpatched   difference
80              1321        1348          -27
81              1333        1360          -27
82              1345        1373          -28
83              1359        1387          -28
84              1373        1401          -28
85              1388        1417          -29
86              1404        1434          -30
87              1422        1452          -30
88              1441        1472          -31
89              1462        1494          -32
90              1487        1519          -32
91              1514        1548          -34
92              1547        1582          -35
93              1586        1625          -39
94              1637        1681          -44
95              1709        1762          -53
96              1825        1905          -80
97              2106        2288         -182
98             12100       24645       -12545
99            186043      201309       -15266
100          9513855     9074161       439694

Here are the 95th-100th percentiles for each of the six runs:

ckpt.checkpoint-sync-pause-v1.10: 1709, 1825, 2106, 12100, 186043, 9513855
ckpt.checkpoint-sync-pause-v1.11: 1707, 1824, 2118, 16792, 196107, 8869602
ckpt.checkpoint-sync-pause-v1.12: 1693, 1807, 2091, 15132, 191207, 7246326
ckpt.master.10: 1734, 1875, 2235, 21145, 203214, 6855888
ckpt.master.11: 1762, 1905, 2288, 24645, 201309, 9074161
ckpt.master.12: 1746, 1889, 2272, 20309, 194459, 7833582

By the way, I reran the tests on master with checkpoint_timeout=16min, and here are the tps results: 2492.966759, 2588.750631, 2575.175993. So it seems like not all of the tps gain from this patch comes from the fact that it increases the time between checkpoints. Comparing the median of three results between the different sets of runs, applying the patch and setting a 3s delay between syncs gives you about a 5.8% increase in throughput, but also adds 30-40 seconds between checkpoints.
If you don't apply the patch but do increase time between checkpoints by 1 minute, you get about a 5.0% increase in throughput. That certainly means that the patch is doing something - because 5.8% for 30-40 seconds is better than 5.0% for 60 seconds - but it's a pretty small effect.

And here are the latency results for 95th-100th percentile with checkpoint_timeout=16min.

ckpt.master.13: 1703, 1830, 2166, 17953, 192434, 43946669
ckpt.master.14: 1728, 1858, 2169, 15596, 187943, 9619191
ckpt.master.15: 1700, 1835, 2189, 22181, 206445, 8212125

The picture looks similar here. Increasing checkpoint_timeout isn't *quite* as good as spreading out the fsyncs, but it's pretty darn close. For example, looking at the median of the three 98th percentile numbers for each configuration, the patch bought us a 28% improvement in 98th percentile latency. But increasing checkpoint_timeout by a minute bought us a 15% improvement in 98th percentile latency. So it's still not clear to me that the patch is doing anything on this test that you couldn't get just by increasing checkpoint_timeout by a few more minutes. Granted, it lets you keep your inter-checkpoint interval slightly smaller, but that's not that exciting. That having been said, I don't have a whole lot of trouble believing that there are other cases where this is more worthwhile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
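
For anyone who wants to check the 28% and 15% figures, here is a minimal Python sketch that recomputes them from the 98th percentile values listed in this message. It takes the median of the three runs for each configuration and compares it against the median for unmodified master. The dictionary keys and the `improvement` helper are names invented for illustration, and the latencies are used in whatever unit the run logs report.

```python
from statistics import median

# 98th percentile latencies, copied from the per-run lists in this message.
# The keys are illustrative labels, not names used by the test harness.
p98 = {
    "patched": [12100, 16792, 15132],        # ckpt.checkpoint-sync-pause-v1.10-12
    "master": [21145, 24645, 20309],         # ckpt.master.10-12
    "master_16min": [17953, 15596, 22181],   # ckpt.master.13-15
}


def improvement(baseline, candidate):
    """Fractional latency reduction of candidate relative to baseline."""
    return (baseline - candidate) / baseline


# Median of the three runs for each configuration.
medians = {name: median(values) for name, values in p98.items()}

for name in ("patched", "master_16min"):
    gain = improvement(medians["master"], medians[name])
    print(f"{name}: median p98 = {medians[name]}, improvement = {gain:.1%}")
```

With only three runs per configuration, the median is simply the middle value, so a single unusual run can move these percentages noticeably.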