Thread: The shared buffers challenge
Hello performers, I've long been unhappy with the standard advice given for setting shared_buffers. This includes the stupendously vague comments in the standard documentation, which suggest certain settings in order to get 'good performance'. Performance of what? Connection negotiation speed? Not that it's wrong necessarily, but ISTM it's too much based on speculative or anecdotal information. I'd like to see the lore around this setting clarified, especially so we can refine the advice to: 'if you are seeing symptoms x, y, z, set shared_buffers from a to b to get symptom reduction of k'. I've never seen a database blow up from setting them too low, but over the years I've helped several people with bad i/o situations or outright OOM conditions from setting them too high.

My general understanding of shared_buffers is that they are a little bit faster than filesystem buffering (everything these days is ultimately based on mmap AIUI, so there's no reason to suspect anything else). Where they are most helpful is for masking of i/o if a page gets dirtied >1 times before it's written out to the heap, but seeing any benefit from that at all is going to be very workload dependent. There are also downsides to using them instead of relying on the OS cache, and the amount of buffers you have influences checkpoint behavior. So things are complex.

So, the challenge is this: I'd like to see repeatable test cases that demonstrate regular performance gains > 20%. Double bonus points for cases that show gains > 50%. No points given for anecdotal or unverifiable data. Not only will this help raise the body of knowledge regarding the setting, but it will help produce benchmarking metrics against which we can measure multiple interesting buffer-related patches in the pipeline. Anybody up for it?

merlin
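A minimal sketch of the kind of repeatable comparison being asked for here, assuming a disposable test cluster at $PGDATA, pg_ctl/pgbench/psql on the PATH, and a test database created in advance; the settings, scale factor, and run length are placeholders, not recommendations:

# Run the same pgbench workload under two shared_buffers settings and capture
# throughput plus buffer write counts for each.
# Prepare once with: createdb pgbench && pgbench -i -s 20 pgbench
for SB in 32MB 512MB; do
    pg_ctl -D "$PGDATA" stop -m fast 2>/dev/null    # ignore error if not yet running
    pg_ctl -D "$PGDATA" -w -o "-c shared_buffers=$SB" start
    psql -d pgbench -c "select pg_stat_reset_shared('bgwriter')"
    echo "=== shared_buffers = $SB ==="
    pgbench -T 120 -c 4 -n pgbench                  # reports tps for this setting
    psql -d pgbench -x -c "SELECT * FROM pg_stat_bgwriter"
done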
Merlin Moncure <mmoncure@gmail.com> wrote:

> So, the challenge is this: I'd like to see repeatable test cases
> that demonstrate regular performance gains > 20%. Double bonus
> points for cases that show gains > 50%.

Are you talking throughput, maximum latency, or some other metric?

In our shop the metric we tuned for in reducing shared_buffers was getting the number of "fast" queries (which normally run in under a millisecond) which would occasionally, in clusters, take over 20 seconds (and thus be canceled by our web app and present as errors to the public) down to zero. While I know there are those who care primarily about throughput numbers, that's worthless to me without maximum latency information under prolonged load. I'm not talking 90th percentile latency numbers, either -- if 10% of our web requests were timing out the villagers would be coming after us with pitchforks and torches.

-Kevin
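One hedged way to measure the latency side rather than just throughput, assuming a pgbench test database like the ones used later in this thread: pgbench's -l option writes a per-transaction log whose third column is the transaction time in microseconds (format as of the 9.0/9.1-era releases), so the worst-case stalls can be pulled out directly rather than averaged away.

pgbench -T 300 -c 4 -n -l pgbench
# pgbench writes one or more pgbench_log.* files; sort by the latency column
sort -n -r -k 3 pgbench_log.* | head     # largest per-transaction latencies first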
On Thu, May 26, 2011 at 10:10 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> So, the challenge is this: I'd like to see repeatable test cases
>> that demonstrate regular performance gains > 20%. Double bonus
>> points for cases that show gains > 50%.
>
> Are you talking throughput, maximum latency, or some other metric?

I am talking about *any* metric...you've got something, let's see it. But it's got to be verifiable, so no points scored. See my note above about symptoms -- if your symptom of note happens to be unpredictable spikes in fast query times under load, then I'd like to scribble that advice directly into the docs along with (hopefully) some reasoning of exactly why more database-managed buffers are helping. As noted, I'm particularly interested in things we can test outside of production environments, since I'm pretty skeptical the Wisconsin Court System is going to allow the internet to log in and repeat and verify test methodologies.

Point being: cranking buffers may have been the bee's knees with, say, the 8.2 buffer manager, but present and future improvements may have rendered that change moot or even counterproductive. I doubt it's really changed much, but we really need to do better on this -- all else being equal, the lowest shared_buffers setting possible without sacrificing performance is best because it releases more memory to the o/s to be used for other things -- so "everything's bigger in Texas" type approaches to postgresql.conf manipulation (not that I see that here of course) are not necessarily better :-).

merlin
On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Point being: cranking buffers
> may have been the bee's knees with, say, the 8.2 buffer manager, but
> present and future improvements may have rendered that change moot or
> even counterproductive.

I suggest you read the docs on how shared buffers work, because, reasonably, it would be the other way around.

Recent improvements in how postgres manages its shared buffer pool make them better than the OS cache, so there should be more incentive to increase them, rather than decrease them.

Workload conditions may make those improvements worthless, hinting that you should decrease them.

But you have to know your workload and you have to know how the shared buffers work.
On Thu, May 26, 2011 at 10:45 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Point being: cranking buffers
>> may have been the bee's knees with, say, the 8.2 buffer manager, but
>> present and future improvements may have rendered that change moot or
>> even counterproductive.
>
> I suggest you read the docs on how shared buffers work, because,
> reasonably, it would be the other way around.
>
> Recent improvements in how postgres manages its shared buffer pool
> make them better than the OS cache, so there should be more incentive
> to increase them, rather than decrease them.
>
> Workload conditions may make those improvements worthless, hinting
> that you should decrease them.
>
> But you have to know your workload and you have to know how the shared
> buffers work.

I am not denying that any of those things are the case, although your assumption that I haven't read the documentation was obviously not grounded upon research. What you and I know/don't know is not the point. The point is what we can prove, because going through the motions of doing that is useful. You are also totally missing my other thrust, which is that future changes to how things work could change the dynamics of .conf configuration -- btw not for the first time in the history of the project.

merlin
On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> The point is what we can prove, because going through the
> motions of doing that is useful.

Exactly, and whatever you can "prove" will be workload-dependent. So you can't prove anything "generally", since no single setting is best for all.

> You are also totally missing my
> other thrust, which is that future changes to how things work could
> change the dynamics of .conf configuration

Nope, I'm not missing it, simply not commenting on it.
On Thu, May 26, 2011 at 11:37 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> The point is what we can prove, because going through the
>> motions of doing that is useful.
>
> Exactly, and whatever you can "prove" will be workload-dependent.
> So you can't prove anything "generally", since no single setting is
> best for all.

Then we should stop telling people to adjust it unless we can match the workload to the improvement. There are some people here who can do that as if by magic, but that's not the issue. I'm trying to understand why it works better for some than for others. What's frustrating is simply believing something is the case without trying to understand why. How about, instead of arguing with me, coming up with something for the challenge?

merlin
Merlin Moncure <mmoncure@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>>> So, the challenge is this: I'd like to see repeatable test cases
>>> that demonstrate regular performance gains > 20%. Double bonus
>>> points for cases that show gains > 50%.
>>
>> Are you talking throughput, maximum latency, or some other
>> metric?
>
> I am talking about *any* metric...you've got something, let's see
> it. But it's got to be verifiable, so no points scored.

Oh, that wasn't to score points; just advocating for more than a one-dimensional view of performance. I'm adding to your demands, not attempting to satisfy them. :-)

> See my note above about symptoms -- if your symptom of note
> happens to be unpredictable spikes in fast query times under load,
> then I'd like to scribble that advice directly into the docs along
> with (hopefully) some reasoning of exactly why more database
> managed buffers are helping.

In our case it was *fewer* shared_buffers which helped.

> As noted, I'm particularly interested in things we can test
> outside of production environments, since I'm pretty skeptical the
> Wisconsin Court System is going to allow the internet to log in
> and repeat and verify test methodologies.

Right; while it was a fairly scientific and methodical test, it was against a live production environment. We adjusted parameters incrementally, a little each day, from where they had been toward values which were calculated in advance to be better based on our theory of the problem (aided in no small part by analysis and advice from Greg Smith), and saw a small improvement each day, with the problem disappearing entirely right at the target values we had calculated in advance. :-)

> Point being: cranking buffers may have been the bee's knees with,
> say, the 8.2 buffer manager, but present and future improvements
> may have rendered that change moot or even counterproductive.

We did find that in 8.3 and later we can support a larger shared_buffers setting without the problem than in 8.2 and earlier. We still need to stay on the low side of what is often advocated to keep the failure rate from this issue at zero.

> all else being equal, the lowest shared_buffers setting possible
> without sacrificing performance is best because it releases more
> memory to the o/s to be used for other things

I absolutely agree with this. I think the problem is that it is very tedious and time-consuming to construct artificial tests for these things. Greg Smith has spent a lot of time and done a lot of research investigating the dynamics of these issues, and recommends a process of incremental adjustments for tuning the relevant settings which, in my opinion, is going to be better than any generalized advice on settings.

Don't get me wrong, I would love to see numbers which "earned points" under the criteria you outline. I would especially love it if they could be part of the suite of tests in our performance farm. I just think that the wealth of anecdotal evidence and the dearth of repeatable benchmarks in this area is due to the relatively low-cost techniques available to tune production systems to solve pressing needs versus the relatively high cost of creating repeatable test cases (without, by the way, solving an immediate need).

-Kevin
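One cheap metric to watch while making that kind of incremental change (a generic sketch, not the specific monitoring used in the case above) is the shared_buffers hit ratio from the cumulative stats views; note it cannot distinguish OS-cache hits from real disk reads, so treat it as a trend indicator while adjusting shared_buffers rather than an absolute target:

-- Fraction of reads satisfied from shared_buffers since the stats were last reset.
SELECT sum(blks_hit)::float8 / nullif(sum(blks_hit) + sum(blks_read), 0) AS hit_ratio
  FROM pg_stat_database;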
Merlin Moncure wrote:
> So, the challenge is this: I'd like to see repeatable test cases that
> demonstrate regular performance gains > 20%. Double bonus points for
> cases that show gains > 50%.

Do I run around challenging your suggestions and giving you homework? You have no idea how much eye rolling this whole message provoked from me.

OK, so the key thing to do is create a table such that shared_buffers is smaller than the primary key index on a table, then UPDATE that table furiously. This will page constantly out of the buffer cache to the OS one, doing work that could be avoided. Increase shared_buffers to where it fits instead, and all the index writes are buffered to write only once per checkpoint. Server settings to exaggerate the effect:

shared_buffers = 32MB
checkpoint_segments = 256
log_checkpoints = on
autovacuum = off

Test case:

createdb pgbench
pgbench -i -s 20 pgbench
psql -d pgbench -c "select pg_size_pretty(pg_relation_size('public.pgbench_accounts_pkey'))"
psql -c "select pg_stat_reset_shared('bgwriter')"
pgbench -T 120 -c 4 -n pgbench
psql -x -c "SELECT * FROM pg_stat_bgwriter"

This gives the following size for the primary key and results:

 pg_size_pretty
----------------
 34 MB

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 13236
tps = 109.524954 (including connections establishing)
tps = 109.548498 (excluding connections establishing)

-[ RECORD 1 ]---------+------------------------------
checkpoints_timed     | 0
checkpoints_req       | 0
buffers_checkpoint    | 0
buffers_clean         | 16156
maxwritten_clean      | 131
buffers_backend       | 5701
buffers_backend_fsync | 0
buffers_alloc         | 25276
stats_reset           | 2011-05-26 18:39:57.292777-04

Now, change so the whole index fits instead:

shared_buffers = 512MB

...which follows the good old "25% of RAM" guideline given this system has 2GB of RAM. Restart the server, repeat the test case. New results:

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 103440
tps = 861.834090 (including connections establishing)
tps = 862.041716 (excluding connections establishing)

gsmith@meddle:~/personal/scripts$ psql -x -c "SELECT * FROM pg_stat_bgwriter"
-[ RECORD 1 ]---------+------------------------------
checkpoints_timed     | 0
checkpoints_req       | 0
buffers_checkpoint    | 0
buffers_clean         | 0
maxwritten_clean      | 0
buffers_backend       | 1160
buffers_backend_fsync | 0
buffers_alloc         | 34071
stats_reset           | 2011-05-26 18:43:40.887229-04

Rather than writing 16156+5701=21857 buffers out during the test to support all the index churn, only 1160 buffers go out, consisting mostly of the data blocks for pgbench_accounts that are being updated irregularly. With less than 1/18th as much I/O to do, the system executes nearly 8X as many UPDATE statements during the test run.

As for figuring out how this impacts more complicated cases, I hear somebody wrote a book or something that went into pages and pages of detail about all this. You might want to check it out.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
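A hedged way to confirm that the index really does fit in shared_buffers after such a change, assuming the contrib pg_buffercache module is installed and the default 8kB block size:

-- How much of each relation in the current database is resident in shared_buffers.
SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 GROUP BY c.relname
 ORDER BY count(*) DESC
 LIMIT 10;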
On Thu, May 26, 2011 at 4:10 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> As for figuring out how this impacts more complicated cases, I hear somebody
> wrote a book or something that went into pages and pages of detail about all
> this. You might want to check it out.
I was just going to suggest that there was significant and detailed documentation of this stuff in a certain book, a well-thumbed copy of which should be sitting on the desk of anyone attempting any kind of postgres performance tuning.
On Thu, May 26, 2011 at 6:10 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> So, the challenge is this: I'd like to see repeatable test cases that
>> demonstrate regular performance gains > 20%. Double bonus points for
>> cases that show gains > 50%.
>
> Do I run around challenging your suggestions and giving you homework? You
> have no idea how much eye rolling this whole message provoked from me.

That's just plain unfair: I didn't challenge your suggestion nor give you homework. In particular, I'm not suggesting the 25%-ish default is wrong -- I'm trying to help people understand why it's there and what it's doing. I bet 19 people out of 20 could not explain what the primary effects of shared_buffers are with any degree of accuracy. That group of people would in fact have included me until recently, when I started studying bufmgr.c for mostly unrelated reasons. Understand my basic points:

*) the documentation should really explain this better (in particular, it should debunk the myth 'more buffers = more caching')
*) the 'what' is not nearly so important as the 'why'
*) the popular understanding of what buffers do is totally, completely, wrong
*) I'd like to gather cases to benchmark changes that interact with these settings
*) I think you are fighting the 'good fight'. I'm trying to help

> OK, so the key thing to do is create a table such that shared_buffers is
> smaller than the primary key index on a table, then UPDATE that table
> furiously. This will page constantly out of the buffer cache to the OS one,
> doing work that could be avoided. Increase shared_buffers to where it fits
> instead, and all the index writes are buffered to write only once per
> checkpoint. Server settings to exaggerate the effect:

This is exactly what I'm looking for...that's really quite striking. I knew that the buffer 'hit' before a page goes out the door is what to gun for.

> As for figuring out how this impacts more complicated cases, I hear somebody
> wrote a book or something that went into pages and pages of detail about all
> this. You might want to check it out.

so i've heard: http://imgur.com/lGOqx (and yes: I 100% endorse the book: everyone who is serious about postgres should own a copy). Anyways, double points to you ;-).

merlin
So how far do you go? 128MB? 32MB? 4MB?

Anecdotal and an assumption, but I'm pretty confident that on any server with at least 1GB of dedicated RAM, setting it any lower than 200MB is not even going to help latency (assuming checkpoint and log configuration is in the realm of sane, and connections*work_mem is sane).

The defaults have been so small for so long on most platforms that any increase over the default generally helps performance -- and in many cases dramatically. So if more is better, then most users assume that even more should be better. But it's not so simple; there are drawbacks to a larger buffer and diminishing returns with larger size. I think listing the drawbacks of a larger buffer and the symptoms that can result would be a big win.

And there is an OS component to it too. You can actually get away with shared_buffers at 90% of RAM on Solaris. Linux will explode if you try that (unless recent kernels have fixed its shared memory accounting).

On 5/26/11 8:10 AM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

>Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> So, the challenge is this: I'd like to see repeatable test cases
>> that demonstrate regular performance gains > 20%. Double bonus
>> points for cases that show gains > 50%.
>
>Are you talking throughput, maximum latency, or some other metric?
>
>In our shop the metric we tuned for in reducing shared_buffers was
>getting the number of "fast" queries (which normally run in under a
>millisecond) which would occasionally, in clusters, take over 20
>seconds (and thus be canceled by our web app and present as errors
>to the public) down to zero. While I know there are those who care
>primarily about throughput numbers, that's worthless to me without
>maximum latency information under prolonged load. I'm not talking
>90th percentile latency numbers, either -- if 10% of our web
>requests were timing out the villagers would be coming after us with
>pitchforks and torches.
>
>-Kevin
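Regarding the Linux shared memory accounting mentioned above: in this era PostgreSQL allocated shared_buffers from System V shared memory, so the kernel limits usually have to be raised before a large setting will even start. A rough sketch; the values are illustrative placeholders, not recommendations:

# /etc/sysctl.conf -- allow a single SysV shared memory segment large enough
# for shared_buffers plus overhead (example: 8GB); apply with `sysctl -p`.
kernel.shmmax = 8589934592        # max size of one segment, in bytes
kernel.shmall = 2097152           # total shared memory, in 4kB pages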
Scott Carey <scott@richrelevance.com> wrote:

> So how far do you go? 128MB? 32MB? 4MB?

Under 8.2 we had to keep shared_buffers less than the RAM on our BBU RAID controller, which had 256MB -- so it worked best with shared_buffers in the 160MB to 200MB range. With 8.3 we found that anywhere from 512MB to 1GB performed better without creating clusters of stalls. In both cases we also had to significantly boost the aggressiveness of the background writer.

Since the "sweet spot" is so dependent on such things as your RAID controller and your workload, I *highly* recommend Greg's incremental tuning approach. The rough guidelines which get tossed about make reasonable starting points, but you really need to make relatively small changes with the actual load you're trying to optimize and monitor the metrics which matter to you. On a big data warehouse you might not care if the database becomes unresponsive for a couple minutes every now and then if it means better overall throughput. On a web server, you may not have much problem keeping up with the overall load, but want to ensure reasonable response time.

> Anecdotal and an assumption, but I'm pretty confident that on any
> server with at least 1GB of dedicated RAM, setting it any lower
> than 200MB is not even going to help latency (assuming checkpoint
> and log configuration is in the realm of sane, and
> connections*work_mem is sane).

I would add the assumption that you've got at least 256MB BBU cache on your RAID controller.

> The defaults have been so small for so long on most platforms,
> that any increase over the default generally helps performance --
> and in many cases dramatically.

Agreed.

> So if more is better, then most users assume that even more should
> be better.

That does seem like a real risk.

-Kevin
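For reference, "boosting the aggressiveness of the background writer" generally means adjusting the settings below; the numbers are illustrative placeholders, not the values actually used in the case described above:

# postgresql.conf -- make the background writer clean dirty buffers sooner,
# so backends and checkpoints have less writing of their own to do.
bgwriter_delay = 100ms              # wake more often than the 200ms default
bgwriter_lru_maxpages = 1000        # write more buffers per round (default 100)
bgwriter_lru_multiplier = 4.0       # clean further ahead of demand (default 2.0)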
Scott Carey wrote:
> And there is an OS component to it too. You can actually get away with
> shared_buffers at 90% of RAM on Solaris. Linux will explode if you try
> that (unless recent kernels have fixed its shared memory accounting).

You can use much larger values for shared_buffers on Solaris with UFS as the filesystem than almost anywhere else, as you say. UFS caches an extremely tiny amount of memory by default. Getting PostgreSQL to buffer everything therefore leads to minimal double-caching and little write caching that creates checkpoint spikes, so 90% is not impossible there.

If you're using ZFS instead, that defaults to similarly aggressive caching as Linux. You may even have to turn that down if you want the database to have a large amount of memory for its own use even with normal levels of sizing; just space for shared_buffers and work_mem can end up being too large a pair of competitors for caching RAM. ZFS really isn't tuned all that differently from how Linux approaches caching in that regard.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
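For anyone experimenting along those lines, turning the ZFS ARC down on Solaris is typically done by capping it in /etc/system; a hedged sketch with a placeholder value (a reboot is required for it to take effect):

* /etc/system -- limit the ZFS ARC so shared_buffers and work_mem are not
* competing with the filesystem cache for the same RAM (example cap: 4GB).
set zfs:zfs_arc_max = 4294967296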
Merlin Moncure wrote:
> That's just plain unfair: I didn't challenge your suggestion nor give
> you homework.

I was stuck either responding to your challenge, or leaving the impression I hadn't done the research to back the suggestions I make if I didn't. That made it a mandatory homework assignment for me, and I didn't appreciate that.

> *) the documentation should really explain this better (in particular,
> it should debunk the myth 'more buffers = more caching')

Any attempt to make a serious change to the documentation around performance turns into a bikeshedding epic, where the burden of proof to make a change is too large to be worth the trouble to me anymore. I first started publishing tuning papers outside of the main docs because it was the path of least resistance to actually getting something useful in front of people. After failing to get even basic good recommendations for checkpoint_segments into the docs, I completely gave up on focusing there as my primary way to spread this sort of information.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
On Fri, May 27, 2011 at 1:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> That's just plain unfair: I didn't challenge your suggestion nor give
>> you homework.
>
> I was stuck either responding to your challenge, or leaving the impression I
> hadn't done the research to back the suggestions I make if I didn't. That
> made it a mandatory homework assignment for me, and I didn't appreciate
> that.

really -- that wasn't my intent. in any event, i apologize.

merlin
On Fri, May 27, 2011 at 2:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Any attempt to make a serious change to the documentation around performance
> turns into a bikeshedding epic, where the burden of proof to make a change
> is too large to be worth the trouble to me anymore. I first started
> publishing tuning papers outside of the main docs because it was the path of
> least resistance to actually getting something useful in front of people.
> After failing to get even basic good recommendations for
> checkpoint_segments into the docs, I completely gave up on focusing there as
> my primary way to spread this sort of information.

Hmm. That's rather unfortunate. +1 for revisiting that topic, if you have the energy for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>> After failing to get even basic good recommendations for
>> checkpoint_segments into the docs, I completely gave up on focusing there as
>> my primary way to spread this sort of information.
>
> Hmm. That's rather unfortunate. +1 for revisiting that topic, if you
> have the energy for it.

Another +1. While I understand that this is not simple, many users will not look outside of standard docs, especially when first evaluating PostgreSQL. Merlin is right that the current wording does not really mention a down side to cranking shared_buffers on a system with plenty of RAM.

---
Maciek Sakrejda | System Architect | Truviso
1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com
On Fri, May 27, 2011 at 9:24 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote:
> Another +1. While I understand that this is not simple, many users
> will not look outside of standard docs, especially when first
> evaluating PostgreSQL. Merlin is right that the current wording does
> not really mention a down side to cranking shared_buffers on a system
> with plenty of RAM.

If you read the whole docs it does. If you read caching, checkpoints, WAL, all of it, you can connect the dots. It isn't easy, but database management isn't easy.
On 27/05/11 11:10, Greg Smith wrote:
>
> OK, so the key thing to do is create a table such that shared_buffers
> is smaller than the primary key index on a table, then UPDATE that
> table furiously. This will page constantly out of the buffer cache to
> the OS one, doing work that could be avoided. Increase shared_buffers
> to where it fits instead, and all the index writes are buffered to
> write only once per checkpoint. Server settings to exaggerate the
> effect:
>
> [... full test case and results snipped; see Greg's message above ...]
>
> As for figuring out how this impacts more complicated cases, I hear
> somebody wrote a book or something that went into pages and pages of
> detail about all this. You might want to check it out.

Greg, having an example with some discussion like this in the docs would probably be helpful. If you want to add it that would be great, however that sounds dangerously like giving you homework :-) I'm happy to put something together for the docs if you'd prefer that I do my own assignments.

Cheers

Mark
On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
> Where they are most helpful is for masking of i/o if
> a page gets dirtied >1 times before it's written out to the heap

Another possible benefit of higher shared_buffers is that it may reduce WAL flushes. A page cannot be evicted from shared_buffers until the WAL has been flushed up to the page's LSN ("WAL before data"); so if there is memory pressure to evict dirty buffers, it may cause extra WAL flushes.

I'm not sure what the practical effects of this are, however, but it might be an interesting thing to test.

Regards,
Jeff Davis
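There was no server-side counter for WAL flushes in the releases under discussion, but for anyone revisiting this on a much newer server, pg_stat_wal (PostgreSQL 14 and later) exposes one; a hedged sketch of the comparison being suggested:

-- PostgreSQL 14+ only; not available in the 8.x/9.x releases discussed here.
-- Run the same pgbench workload at a small and a large shared_buffers setting
-- and compare wal_sync (WAL flush calls) between the two runs.
SELECT wal_write, wal_sync, stats_reset FROM pg_stat_wal;
SELECT pg_stat_reset_shared('wal');   -- reset the counters between runs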
On 05/27/2011 07:30 PM, Mark Kirkwood wrote:
> Greg, having an example with some discussion like this in the docs
> would probably be helpful.

If we put that example into the docs, two years from now there will be people showing up here saying "I used the recommended configuration from the docs" who cut and paste it into their postgresql.conf, turning autovacuum off and everything. Periodically people used to publish "recommended postgresql.conf" settings on random web pages, sometimes with poor suggestions, and those things kept showing up in people's configurations posted to the lists here long after they were no longer applicable. I've resisted publishing specific configuration examples in favor of working on pgtune specifically because of having observed that.

There are several new small features in 9.1 that make it easier to instrument checkpoint behavior and how it overlaps with shared_buffers increases: a summary of sync times, the ability to reset pg_stat_bgwriter, and a timestamp on when it was last reset. It's not obvious at all how those all stitch together into some new tuning approaches, but they do. The best example I've given so far is at http://archives.postgresql.org/pgsql-hackers/2011-02/msg00209.php ; look at how I can turn the mysterious buffers_backend field into something measured in MB/s using these new features. That's the direction all this is moving toward. If it's easy for people to turn the various buffer statistics into human-readable form, the way tuning changes impact the server operation becomes much easier to see.

Documenting the much harder to execute methodology you can apply to earlier versions isn't real exciting to me at this point. I've done that enough that people who want the info can find it, even if it's not all in the manual. The ways you can do it in 9.1 are so much easier, and more accurate in regards to the sync issues, that I'm more interested in beefing up the manual in regards to using them at this point.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
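That buffers_backend-to-MB/s translation, written out as a query; a sketch that assumes the 9.1 stats_reset timestamp and the default 8kB block size:

-- Average backend write rate since the last pg_stat_reset_shared('bgwriter'), in MB/s.
SELECT buffers_backend * 8192 / (1024.0 * 1024.0)
         / extract(epoch FROM now() - stats_reset) AS backend_write_mb_per_s,
       stats_reset
  FROM pg_stat_bgwriter;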
On Fri, May 27, 2011 at 7:19 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
>> Where they are most helpful is for masking of i/o if
>> a page gets dirtied >1 times before it's written out to the heap
>
> Another possible benefit of higher shared_buffers is that it may reduce
> WAL flushes. A page cannot be evicted from shared_buffers until the WAL
> has been flushed up to the page's LSN ("WAL before data"); so if there
> is memory pressure to evict dirty buffers, it may cause extra WAL
> flushes.
>
> I'm not sure what the practical effects of this are, however, but it
> might be an interesting thing to test.

Hm, I bet it could make a fairly big difference if wal data is not on a separate volume.

merlin
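For anyone who wants to try that, the usual way to put WAL on its own volume in these releases was a symlink; a hedged sketch with placeholder paths (the server must be shut down first):

# Move the WAL directory (pg_xlog in these releases) to a dedicated volume.
pg_ctl -D /var/lib/pgsql/data stop -m fast
mv /var/lib/pgsql/data/pg_xlog /wal_disk/pg_xlog
ln -s /wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog
pg_ctl -D /var/lib/pgsql/data start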