Re: checkpoints/bgwriter tuning verification - Mailing list pgsql-general
From: Greg Smith
Subject: Re: checkpoints/bgwriter tuning verification
Msg-id: alpine.GSO.2.01.0910292208480.23558@westnet.com
In response to: checkpoints/bgwriter tuning verification (Vick Khera <vivek@khera.org>)
List: pgsql-general
On Thu, 29 Oct 2009, Vick Khera wrote:

> Greg, do you have a performance tuning book? If so, I really want to
> buy it! Your articles are awesome.

Give me a few more months...

> basically, the next checkpoint starts within a few seconds of the prior
> one completing.

That's the expected behavior with a high setting for the completion
target. Each checkpoint finishes just before the next one needs to
start. That's the most you can spread them out, and more spreading
generally equals better behavior. Usually; there are always exceptions,
of course.

> Based on Greg's article and the above number showing that most
> checkpoints are triggered by running out of WAL segments, I should
> increase my checkpoint_buffers. Also, based on the article, I should
> increase the bgwriter_lru_maxpages (currently at default 100).

You might actually want to decrease bgwriter_lru_maxpages to 0. The
current background writer design presumes you have some spare I/O
capacity and want to trade off some write overhead for possibly lower
latency in client backends. If you're at the point where the disks
aren't keeping up with the checkpoint disk I/O, this could turn
counter-productive.

> My plan is to bump checkpoint_segments to 48 and reduce
> checkpoint_completion_target to 0.7, and bump the
> bgwriter_lru_maxpages to 500.

In your situation, I would increase checkpoint_segments to 64, keep the
completion target at 0.9, and decrease bgwriter_lru_maxpages to 0 so
that feature is turned off altogether. If you're still not happy, you
can try dropping the completion target too; I've heard one report of
that logic not working out so well in practice, where lower settings
actually resulted in less of an I/O spike. I would worry about the OS
tuning first, though, which brings us to:

> Can the checkpoint operation actually cause the DB to stop responding
> for a few seconds at a time? That seems to be what I observe.
> Sometimes for 5 or more seconds one transaction will just stall.
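For reference, the suggestion above would look like this in
postgresql.conf (a sketch for the 8.x-era parameters discussed in this
thread; all three can typically be applied with a reload rather than a
full restart):

```
# postgresql.conf -- sketch of the settings suggested above
checkpoint_segments = 64            # allow more WAL before a segment-driven checkpoint
checkpoint_completion_target = 0.9  # spread each checkpoint across ~90% of the interval
bgwriter_lru_maxpages = 0           # turn the LRU background writer off entirely
```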
There are two ways that writes can hang:

1) You've gotten to the point in the checkpoint cycle where it's calling
fsync to flush everything out of the filesystem. At this point you could
potentially have a big chunk of data that needs to go out to disk, and
any other client that needs to write something (or needs a resource
locked by a writer) is stuck behind that.

2) The entire OS write cache is filled. Once that happens, if a client
tries to write something else, the OS will force it to nap until there's
space again to hold that write.

How that all works really depends on the OS. I know that under Linux,
the way this is all handled is really bad, and can easily lead to
multi-second hangs. Generally you can watch the writes accumulate by
looking at /proc/meminfo. I wrote an example showing how the problem
appears and what I did to improve long hangs on one server at
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html
and it refers to a long theory page on Linux's pdflush on my web page.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
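To watch the write cache fill up the way described above, you can poll
the Dirty and Writeback counters in /proc/meminfo; a minimal sketch
(Linux only):

```shell
# "Dirty" is data sitting in the write cache not yet queued to disk;
# "Writeback" is data actively being written out right now.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```

Run it in a loop (for example under `watch -n 1`) while a checkpoint is
in progress; a steadily climbing Dirty figure followed by a stall is the
pattern being described here.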