Thread: PreallocXlogFiles
I notice this:

When a checkpoint occurs, if a log file is more than 75% full then a new file will be allocated (in PreallocXlogFiles).

This assumes we checkpoint at least 4 times per log file, otherwise it will be effectively random whether we actually ever do this or not. With an uneven or bursty workload, we would need to checkpoint many more times per xlog to even notice this is ever being called. (I never have).

...but we don't check that anywhere in the code.

Since checkpoints now default to every 300 seconds, we are assuming that a log file takes at least 20 minutes to fill with an even workload, which is not the case on busy systems. On slow systems, who cares whether we preallocate or not? Especially now that we have the bgwriter to smooth the workload of backends.

The idea was to preallocate a file ahead of it being required...mostly we just hit the endspot without having preallocated any log files, so the preallocation thing is just a waste of time.

PreallocXlogFiles is only ever called during a normal Checkpoint or after Recovery. In both cases, there will always be xlogs recycled and so preallocation has already taken place (except in the trivial case of the first few xlogs after an initdb).

I would like to remove PreallocXlogFiles on the basis that it is dead, or at least pointless code.

Objections?

Best Regards, Simon Riggs
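The arithmetic behind the "20 minutes" figure is 4 checkpoints x 300 seconds = 1200 seconds per log file. The following is a minimal sketch of why the 75% test becomes hit-or-miss on a busy system. It is not the PostgreSQL code: the 16 MB segment size is assumed, and `in_last_quarter`, `simulate_checkpoints` and `SimResult` are hypothetical names invented for the illustration. The workload is modelled as a constant number of WAL bytes written between checkpoints.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment size for the illustration: 16 MB. */
#define SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

typedef struct SimResult
{
    uint64_t segments_filled;   /* whole segments written during the run */
    uint64_t preallocs;         /* segments in which the 75% test fired */
} SimResult;

/* Hypothetical helper: does WAL position `pos` lie more than 75% of the
 * way through its segment? */
int
in_last_quarter(uint64_t pos)
{
    uint64_t offset = pos % SEG_SIZE;

    return offset * 4 > SEG_SIZE * 3;
}

/* Advance the WAL position by a fixed amount per checkpoint interval and
 * count the segments for which a checkpoint happened to land in the last
 * quarter, i.e. the only times preallocation would be triggered. */
SimResult
simulate_checkpoints(uint64_t bytes_per_checkpoint, unsigned n_checkpoints)
{
    SimResult r = {0, 0};
    uint64_t pos = 0;
    uint64_t last_seg = UINT64_MAX;     /* no segment triggered yet */

    assert(bytes_per_checkpoint > 0);
    for (unsigned k = 0; k < n_checkpoints; k++)
    {
        uint64_t seg;

        pos += bytes_per_checkpoint;
        seg = pos / SEG_SIZE;
        if (in_last_quarter(pos) && seg != last_seg)
        {
            r.preallocs++;
            last_seg = seg;
        }
    }
    r.segments_filled = pos / SEG_SIZE;
    return r;
}
```

In this model, with one eighth of a segment written per checkpoint, every segment gets a preallocation ahead of it. With 1.375 segments written per checkpoint, eight checkpoints fill 11 segments but the test fires only once.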
On Wed, 21 Jul 2004, Simon Riggs wrote:

> I notice this:
>
> When a checkpoint occurs, if a log file is more than 75% full then a new
> file will be allocated (in PreallocXlogFiles).
>
> This assumes we checkpoint at least 4 times per log file, otherwise it
> will be effectively random whether we actually ever do this or not. With
> an uneven or bursty workload, we would need to checkpoint many more
> times per xlog to even notice this is ever being called. (I never have).
>
> ...but we don't check that anywhere in the code.

I prefer the idea of just checking it more often than pulling the code out altogether. I think this sits well with Jan's work on consistent availability (buffer manager, vacuum delay).

The question is, where to call it from. It's possible that the buffer manager may have enough information to guess how often a new checkpoint file should be preallocated. The alternative would be to have (yet another) backend look after this.

Or, maybe the autovacuum backend could look after this. It would have access to stats which may be useful but it would mean that people would have to run autovacuum if they wanted checkpoints preallocated.

Thanks, Gavin
Simon Riggs <simon@2ndquadrant.com> writes:
> I would like to remove PreallocXlogFiles on the basis that it is dead,
> or at least pointless code.

It could stand improvement I'm sure, but it's not pointless, particularly not when you have archive mode turned on and so dead xlog segments can't necessarily be recycled immediately. There's no guarantee that there are very many segments available to be recycled when a checkpoint happens, and so if you don't do some preallocation you may find foreground processes forced to do the work instead when they run out of forward xlog space.

If you assume a reasonably steady flow of xlog traffic and no significant archiving delays, then you can see that the system settles into a steady state where at each checkpoint about the same number of old WAL files get rotated around to become forward xlog space, and indeed there's little need for PreallocXlogFiles because MoveOfflineLogs does all the heavy lifting. However, I'm not at all convinced that this analysis holds up with bursty traffic or when the archiver is delaying rotation of old xlogs. If the number of physical WAL files needs to grow and shrink because of such effects, then PreallocXlogFiles is the only thing that can prevent foreground processes from having to do the work that should be handled by the checkpointer.

I wonder whether we should not put back the preallocated-files GUC parameter that Bruce took out a release or two back. PreallocXlogFiles made a lot more sense back when that parameter existed.

regards, tom lane
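The steady-state argument can be illustrated with a toy model. This is only a sketch under strong assumptions, not how xlog.c works: every segment used in an interval is assumed to be recycled at the checkpoint that ends it (no archiving delay, no upper limit on recycled files), and `simulate_pool`, `PoolResult` and `prealloc_target` are hypothetical names. `prealloc_target` stands in for a fixed "keep this many segments ready" setting of the kind the removed GUC parameter provided.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PoolResult
{
    unsigned foreground_creates;    /* segments backends had to create */
    unsigned checkpointer_creates;  /* segments created at checkpoints */
} PoolResult;

/* usage[i] is the number of segments consumed during checkpoint interval i.
 * initial_free is the number of ready-to-use segments at the start. */
PoolResult
simulate_pool(const unsigned *usage, size_t n_intervals,
              unsigned initial_free, unsigned prealloc_target)
{
    PoolResult r = {0, 0};
    unsigned free_segs = initial_free;

    assert(usage != NULL);
    for (size_t i = 0; i < n_intervals; i++)
    {
        unsigned need = usage[i];

        /* Backends consume forward space; any shortfall is created by
         * the foreground process that ran out. */
        if (need > free_segs)
        {
            r.foreground_creates += need - free_segs;
            free_segs = 0;
        }
        else
            free_segs -= need;

        /* Checkpoint: the segments just used are rotated around to
         * become forward space again. */
        free_segs += need;

        /* Optional preallocation up to a fixed target. */
        if (free_segs < prealloc_target)
        {
            r.checkpointer_creates += prealloc_target - free_segs;
            free_segs = prealloc_target;
        }
    }
    return r;
}
```

With a steady usage of {3, 3, 3, 3} and three free segments, nothing ever needs creating. With the bursty pattern {1, 1, 6, 1} and one free segment, the model has foreground processes create five segments during the burst, unless a target of six lets the checkpointer create them beforehand.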
On Thu, 2004-07-22 at 00:35, Gavin Sherry wrote:
> On Wed, 21 Jul 2004, Simon Riggs wrote:
>
> > I notice this:
> >
> > When a checkpoint occurs, if a log file is more than 75% full then a new
> > file will be allocated (in PreallocXlogFiles).
> >
> > This assumes we checkpoint at least 4 times per log file, otherwise it
> > will be effectively random whether we actually ever do this or not. With
> > an uneven or bursty workload, we would need to checkpoint many more
> > times per xlog to even notice this is ever being called. (I never have).
> >
> > ...but we don't check that anywhere in the code.
>
> I prefer the idea of just checking it more often than pulling the code out
> altogether. I think this sits well with Jan's work on consistent
> availability (buffer manager, vacuum delay).
>

Good idea.

Hey - we could get archiver to do this, seeing as it knows when the logs are full. Just do: I've seen a full one, I'll prealloc another. No test, just alloc. (Or the bgwriter...)

On Thu, 2004-07-22 at 00:53, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I would like to remove PreallocXlogFiles on the basis that it is dead,
> > or at least pointless code.
>
> I wonder whether we should not put back the preallocated-files GUC
> parameter that Bruce took out a release or two back. PreallocXlogFiles
> made a lot more sense back when that parameter existed.

That's simplest, especially if the code is there. But again, if you set it to a constant value it's not really responding to system demands, it's just the admin's guess of what to set it to. Gavin's idea sounds more optimal...

> However, I'm not at all convinced that this analysis holds up with
> bursty traffic or when the archiver is delaying rotation of old xlogs.
> If the number of physical WAL files needs to grow and shrink because
> of such effects, then PreallocXlogFiles is the only thing that can
> prevent foreground processes from having to do the work that should
> be handled by the checkpointer.

Yes, I agree, but the checkpointer isn't waking up often enough currently to do this effectively. It's just randomly doing it.

Best regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> Yes, I agree, but the checkpointer isn't waking up often enough
> currently to do this effectively. It's just randomly doing it.

Agreed. Maybe it should be part of the bgwriter's idle loop, and not directly associated with checkpoints at all.

regards, tom lane
On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Yes, I agree, but the checkpointer isn't waking up often enough
> > currently to do this effectively. It's just randomly doing it.
>
> Agreed. Maybe it should be part of the bgwriter's idle loop, and
> not directly associated with checkpoints at all.
>

Yes, that's a more natural home, now bgwriter exists. But does it know when log files are full? How would it know?

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
>> Agreed. Maybe it should be part of the bgwriter's idle loop, and
>> not directly associated with checkpoints at all.

> Yes, that's a more natural home, now bgwriter exists. But does it know
> when log files are full? How would it know?

It can run PreallocXlogFiles --- or more likely a modified version of same. There isn't anything that function needs to do that the bgwriter can't do (in fact, the bgwriter is what runs checkpoints now...)

regards, tom lane
On Thu, 2004-07-22 at 15:19, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
> >> Agreed. Maybe it should be part of the bgwriter's idle loop, and
> >> not directly associated with checkpoints at all.
>
> > Yes, that's a more natural home, now bgwriter exists. But does it know
> > when log files are full? How would it know?
>
> It can run PreallocXlogFiles --- or more likely a modified version of
> same. There isn't anything that function needs to do that the bgwriter
> can't do (in fact, the bgwriter is what runs checkpoints now...)
>

I can see roughly how to do this, but it is a can of worms I don't want to open when I don't have much time.

Some thoughts and ideas for later:

The Checkpoint code writes to xlog, so finds out what the recptr is for free, then tries to act on that knowledge in PreallocXlogFiles.

Calling PreallocXlogFiles outside of the Checkpoint code is straightforward to initiate from bgwriter.c, but the caller must have already obtained the current recptr position. That would require attempting to gain a lock on XLogCtl, then releasing it quickly after having read the pointer. Then call Prealloc...

...Unless there is a heuristic to use, rather than exact knowledge of the recptr...perhaps predicting something from the last 3 checkpoint durations?

I'll return to this later.

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> Calling PreallocXlogFiles outside of the Checkpoint code is
> straightforward to initiate from bgwriter.c, but the caller must have
> already obtained the current recptr position. That would require
> attempting to gain a lock on XLogCtl, then releasing it quickly after
> having read the pointer. Then call Prealloc...

When I said "modified version", I meant that we'd change the function to make it self-contained. Passing an already-obtained recptr is convenient when it's being invoked at the end of Checkpoint, but to be called from the bgwriter loop it should just get the necessary lock and fetch the pointer for itself.

regards, tom lane
On Sat, 2004-07-24 at 15:22, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Calling PreallocXlogFiles outside of the Checkpoint code is
> > straightforward to initiate from bgwriter.c, but the caller must have
> > already obtained the current recptr position. That would require
> > attempting to gain a lock on XLogCtl, then releasing it quickly after
> > having read the pointer. Then call Prealloc...
>
> When I said "modified version", I meant that we'd change the function
> to make it self-contained. Passing an already-obtained recptr is
> convenient when it's being invoked at the end of Checkpoint, but to
> be called from the bgwriter loop it should just get the necessary lock
> and fetch the pointer for itself.
>

Whichever...I envisaged a new wrapper function in xlog.c, called from bgwriter.c, rather than changing Prealloc...

Your way sounds better. Leave the parameters the same, and just add an "if recptr==NULL then {get recptr}" section of code.

Main point: nearly out of time, if I'm to finish other things on the must-complete list: docs and backup start/end function design.

Best Regards, Simon Riggs
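The "if recptr==NULL then {get recptr}" shape discussed above could look roughly like the sketch below. It is a standalone illustration, not a patch against xlog.c. Everything in it is a stand-in invented for the example: `WalPos` replaces the real record pointer type, a C11 `atomic_flag` spinlock replaces whatever lock protects XLogCtl, and `shared_insert_pos`, `advance_insert_pos`, `read_insert_pos` and `prealloc_if_needed` are hypothetical names. Passing the position by pointer, so that NULL can mean "fetch it yourself", is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed segment size for the illustration: 16 MB. */
#define SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

typedef uint64_t WalPos;            /* stand-in for the record pointer */

atomic_flag wal_lock = ATOMIC_FLAG_INIT;   /* stand-in for the XLogCtl lock */
WalPos shared_insert_pos = 0;       /* stand-in for the shared insert position */
unsigned prealloc_requests = 0;     /* counts how often we would preallocate */

/* Backends move the insert position forward, never backward. */
void
advance_insert_pos(WalPos new_pos)
{
    while (atomic_flag_test_and_set(&wal_lock))
        ;                           /* spin until the lock is free */
    assert(new_pos >= shared_insert_pos);
    shared_insert_pos = new_pos;
    atomic_flag_clear(&wal_lock);
}

/* Take the lock, copy the position, release the lock straight away. */
WalPos
read_insert_pos(void)
{
    WalPos pos;

    while (atomic_flag_test_and_set(&wal_lock))
        ;
    pos = shared_insert_pos;
    atomic_flag_clear(&wal_lock);
    return pos;
}

/* Checkpoint code passes the position it already has; a caller such as a
 * background loop passes NULL and the function fetches it for itself.
 * Returns 1 if a new segment would be preallocated, 0 otherwise. */
int
prealloc_if_needed(const WalPos *recptr)
{
    WalPos pos = (recptr != NULL) ? *recptr : read_insert_pos();

    if ((pos % SEG_SIZE) * 4 > SEG_SIZE * 3)
    {
        prealloc_requests++;        /* real code would create the file here */
        return 1;
    }
    return 0;
}
```

In this sketch the checkpoint path would call `prealloc_if_needed(&recptr)` with the position it obtained for free, while a bgwriter-style idle loop would call `prealloc_if_needed(NULL)` on each pass.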