Re: [Testperf-general] BufferSync and bgwriter - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: [Testperf-general] BufferSync and bgwriter |
Date | |
Msg-id | 1102889288.4037.2806.camel@localhost.localdomain |
In response to | Re: [Testperf-general] BufferSync and bgwriter (Neil Conway <neilc@samurai.com>) |
Responses |
Re: [Testperf-general] BufferSync and bgwriter
Re: [Testperf-general] BufferSync and bgwriter Re: [Testperf-general] BufferSync and bgwriter |
List | pgsql-hackers |
On Sun, 2004-12-12 at 05:46, Neil Conway wrote:
> Simon Riggs wrote:
> > If the bgwriter_percent = 100, then we should actually do the sensible
> > thing and prepare the list that we need, i.e. limit
> > StrategyDirtyBufferList to finding at most bgwriter_maxpages.
>
> Is the plan to make bgwriter_percent = 100 the default setting?

Hmm...must confess that my only plan is:
i) discover dynamic behaviour of bgwriter
ii) fix any bugs or weirdness as quickly as possible
iii) try to find a way to set the bgwriter defaults

I'm worried that we're late in the day for changes, but I'm equally
worried that
a) the bgwriter is very tuning sensitive
b) we don't really have much info on how to set the defaults in a
meaningful way for the majority of cases
c) there are some issues that greatly reduce the effectiveness of the
bgwriter in many circumstances.

The 100pct.patch was my first attempt at getting something acceptable in
the next few days that gives sufficient room for the DBA to perform
tuning.

On Sun, 2004-12-12 at 05:46, Neil Conway wrote:
> I wonder if we even need to retain the bgwriter_percent GUC var. Is
> there actually a situation in which the combination of bgwriter_maxpages
> and bgwriter_delay does not give the DBA sufficient flexibility in
> tuning bgwriter behavior?

Yes, I do now think that only two GUCs are required to tune the
behaviour; but you make me think - which two? Right now, bgwriter_delay
is useless because the O(N) behaviour makes it impossible to set it any
lower when you have a large shared_buffers. (I see that as a bug)

Your question has made me rethink the exact objective of the bgwriter's
actions: the way it is coded now, the bgwriter looks for dirty blocks, no
matter where they are in the list. What we are bothered about is the
number of clean buffers at the LRU, which has a direct influence on the
probability that BufferAlloc() will need to call FlushBuffer(), since
StrategyGetBuffer() returns the first unpinned buffer, dirty or not.
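As an illustration of that last point, here is a toy Python model (not PostgreSQL code) of why the number of clean buffers at the LRU end matters. The `Buffer` class and both function names are hypothetical stand-ins invented for this sketch; the real StrategyGetBuffer() and BufferAlloc() are C functions working on shared memory, and none of their actual details are modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Buffer:
    """Hypothetical stand-in for a shared buffer header."""
    buf_id: int
    dirty: bool = False
    pinned: bool = False


def first_unpinned(lru_order: List[Buffer]) -> Optional[Buffer]:
    """Return the first unpinned buffer from the LRU end, dirty or not."""
    for buf in lru_order:
        if not buf.pinned:
            return buf
    return None


def backend_must_flush(lru_order: List[Buffer]) -> bool:
    """True if the replacement victim is dirty, so the backend pays for the write."""
    victim = first_unpinned(lru_order)
    return victim is not None and victim.dirty


# LRU end first: a pinned buffer, then a dirty one, then a clean one.
buffers = [Buffer(0, pinned=True), Buffer(1, dirty=True), Buffer(2)]
print(backend_must_flush(buffers))  # the victim is buffer 1, which is dirty

# If something had already cleaned buffer 1, the backend would not need to write.
buffers[1].dirty = False
print(backend_must_flush(buffers))
```

In this model only the state of the first unpinned buffer decides whether the backend has to write, which is why dirty buffers further up the list do not affect replacement I/O.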
After further thought, I would prefer a subtle change in behaviour so
that the bgwriter checks that clean blocks are available at the LRUs for
when buffer replacement occurs. With that slight change, I'd keep the
bgwriter_percent GUC but make it mean something different.

bgwriter_percent would be the % of shared_buffers that are searched (from
the LRU end) to see if they contain dirty buffers, which are then written
to disk. That means the number of dirty blocks written to disk is between
0 and the number of buffers searched, but we're not hugely bothered what
that number is... [This change to StrategyDirtyBufferList resolves the
unusability of the bgwriter with large shared_buffers]

Writing away dirty blocks towards the MRU end is more likely to be wasted
effort. If a block stays near the MRU then it will be dirty again in the
wink of an eye, so you gain nothing at checkpoint time by cleaning it.
Also, since it isn't near the LRU, cleaning it has no effect on buffer
replacement I/O. If a block is at the LRU, then it is by definition the
least likely to be reused, and is a candidate for replacement anyway. So
concentrating on the LRU, not the number of dirty buffers, seems to be
the better thing to do.

That would then be a much simpler way of setting the defaults. With that
definition, we would set the defaults:
bgwriter_percent = 2 (according to my new suggestion here)
bgwriter_delay = 200
bgwriter_maxpages = -1 (i.e. mostly ignore it, but keep it for fine
tuning)

Thus, for the default shared_buffers=1000 the bgwriter would clear a
space of up to 20 blocks each cycle. For a config with
shared_buffers=60000, the bgwriter default would clear space for 600
blocks (max) each cycle - a reasonable setting. Overall that would need
very little specific tuning, because it would scale upwards as you
changed the shared_buffers higher.
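The proposed meaning of bgwriter_percent can be sketched in a few lines of Python. This is only a model of the idea, not the attached patch: `scan_window` and `dirty_buffers_near_lru` are hypothetical names, the buffer list is reduced to a sequence of dirty flags ordered from the LRU end, and treating bgwriter_maxpages = -1 as "no cap" is an assumption based on the "mostly ignore it" remark.

```python
from typing import List, Sequence


def scan_window(shared_buffers: int, bgwriter_percent: int) -> int:
    """Number of buffers examined from the LRU end in one bgwriter cycle."""
    return shared_buffers * bgwriter_percent // 100


def dirty_buffers_near_lru(dirty_flags: Sequence[bool],
                           bgwriter_percent: int,
                           bgwriter_maxpages: int = -1) -> List[int]:
    """Positions (LRU end first) of the dirty buffers that would be written.

    Only the first bgwriter_percent % of the list is examined, so the cost
    of a cycle no longer grows with the total number of dirty buffers.
    """
    window = scan_window(len(dirty_flags), bgwriter_percent)
    to_write = [i for i in range(window) if dirty_flags[i]]
    # Assumption: a negative bgwriter_maxpages means "no additional cap".
    if bgwriter_maxpages >= 0:
        to_write = to_write[:bgwriter_maxpages]
    return to_write


print(scan_window(1000, 2))    # 20
print(scan_window(60000, 2))   # 1200

# Every third buffer dirty, shared_buffers = 1000, bgwriter_percent = 2:
flags = [i % 3 == 0 for i in range(1000)]
print(dirty_buffers_near_lru(flags, 2))  # [0, 3, 6, 9, 12, 15, 18]
```

The number written lies between 0 and the window size, as described. Note that a straight 2% of 60000 comes to 1200 rather than the 600 quoted in the message; 600 would correspond to 1%, and the message does not say which figure was intended.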
So, that interpretation of bgwriter_percent gives these advantages:
- we bound the StrategyDirtyBufferList scan to a small % of the whole
list, rather than the whole list...so we could realistically set the
bgwriter_delay lower if required
- we can set a default that scales, so would not often need to change it
- the parameter is defined in terms of the thing we really care about:
sufficient clean blocks at the LRU of the buffer lists
- these changes are very isolated and actually minor - just a different
way of specifying which buffers the bgwriter will clean

Patch attached...again for discussion and to help understanding of this
proposal. Will submit to patches if we agree it seems like the best way
to allow the bgwriter defaults to be sensibly set.

[...and yes, everybody, I do know where we are in the release cycle]

--
Best Regards, Simon Riggs