I notice this:
When a checkpoint occurs, if a log file is more than 75% full then a new
file will be allocated (in PreallocXlogFiles).
This assumes we checkpoint at least 4 times per log file; otherwise it
is effectively random whether a checkpoint ever lands in the last 25% of
the file and triggers the preallocation. With an uneven or bursty
workload, we would need to checkpoint many more times per xlog before
this code is reliably reached. (I have never noticed it being called.)
...but we don't check that assumption anywhere in the code.
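To illustrate the "at least 4 times per log file" point, here is a rough
sketch in Python (not PostgreSQL code). It assumes a simplified model:
checkpoints are evenly spaced relative to how full the log file is, with
a random starting offset, and preallocation triggers only when a
checkpoint lands past the 75% mark. The function name and the model are
assumptions made for illustration only.

```python
def chance_prealloc_triggers(checkpoints_per_file, threshold=0.75):
    """Chance that at least one checkpoint lands past `threshold`.

    Hypothetical model: checkpoints are evenly spaced over the time it
    takes to fill one log file, with a uniformly random offset.
    """
    if checkpoints_per_file <= 0:
        raise ValueError("checkpoints_per_file must be positive")
    # Fraction of the file in which a checkpoint would trigger preallocation.
    window = 1.0 - threshold
    # Evenly spaced points hit a window of this width with probability
    # window / spacing, capped at 1 once the spacing fits inside the window.
    return min(1.0, checkpoints_per_file * window)


for n in (1, 2, 4, 8):
    print(n, chance_prealloc_triggers(n))
```

Under this model, one checkpoint per log file gives only a 1-in-4 chance
of preallocating, and the chance reaches certainty only at 4 or more
checkpoints per file.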
Since checkpoints now default to every 300 seconds, we are assuming that
a log file takes at least 20 minutes (4 x 300 seconds) to fill with an
even workload, which is not the case on busy systems. On slow systems, who cares
whether we preallocate or not? Especially now that we have the bgwriter
to smooth the workload of backends.
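The 20 minute figure can be worked out as below. This is a minimal
sketch in Python, assuming only the two numbers quoted above (a 300
second checkpoint interval and a 75% threshold); the function names are
hypothetical and do not correspond to anything in the PostgreSQL source.

```python
def min_fill_time_seconds(checkpoint_interval_s, threshold=0.75):
    """Shortest log file fill time for which an evenly spaced checkpoint
    is guaranteed to land past `threshold` (even workload assumed)."""
    # The checkpoint interval must fit inside the last (1 - threshold)
    # of the file, so the whole file must take interval / window to fill.
    window = 1.0 - threshold
    return checkpoint_interval_s / window


def prealloc_is_reliable(fill_time_s, checkpoint_interval_s=300, threshold=0.75):
    """True if a log file fills slowly enough for the check to always fire."""
    return fill_time_s >= min_fill_time_seconds(checkpoint_interval_s, threshold)


print(min_fill_time_seconds(300) / 60)   # minutes needed per log file
print(prealloc_is_reliable(60))          # a busy system filling a file per minute
```

So a system that fills a log file in a minute falls far short of the
20 minutes per file that the check needs in order to fire reliably.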
The idea was to preallocate a file before it is required, but mostly
we just hit the end of the current log file without having preallocated
anything, so the preallocation is just a waste of time.
PreallocXlogFiles is only ever called during a normal checkpoint or
after recovery. In both cases, xlogs will always have been recycled, so
preallocation has in effect already taken place (except in the trivial
case of the first few xlogs after an initdb).
I would like to remove PreallocXlogFiles on the basis that it is dead,
or at least pointless, code.
Objections?
Best Regards, Simon Riggs