Hello Robert,
>> I think that the 1.5 value somewhere in the patch is much too high for the
>> purpose because it shifts the checkpoint load quite a lot (50% more load at
>> the end of the checkpoint) just for the purpose of avoiding a spike which
>> lasts a few seconds (I think) at the beginning. A much smaller value should
>> be used (1.0 <= factor < 1.1), as it would be much less disruptive and would
>> probably avoid the issue just the same. I recommend not to commit with a 1.5
>> factor in any case.
>
> Wait, what? On what workload does the FPW spike last only a few
> seconds? [...]
Ok. AFAICR, a relatively small part at the beginning of the checkpoint,
but possibly more than a few seconds.
My actual point is that it should be tested with different and especially
smaller values, because 1.5 changes the overall load distribution *a lot*.
For testing purposes I suggested that a GUC would help, but the patch
author has never come back to the thread to discuss the arguments or
provide another patch.
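
To make the load shift concrete, here is a rough sketch. It assumes a toy
model, not the patch's actual code: the checkpointer must have written
t ** factor of its buffers once a fraction t of the checkpoint has elapsed.
The function names are hypothetical. Under that assumption the write rate
relative to uniform pacing is factor * t ** (factor - 1), so at the end of
the checkpoint it equals the factor itself.

```python
def required_progress(t, factor):
    """Fraction of buffers that must be written at elapsed fraction t.

    Toy model only (an assumption, not taken from the patch).
    """
    return t ** factor


def relative_rate(t, factor, dt=1e-6):
    """Write rate at t relative to uniform pacing (finite difference)."""
    lo = max(t - dt, 0.0)
    hi = min(t + dt, 1.0)
    return (required_progress(hi, factor) - required_progress(lo, factor)) / (hi - lo)


# Compare the rate early in the checkpoint and at its very end.
for factor in (1.0, 1.05, 1.1, 1.5):
    early = relative_rate(0.1, factor)
    late = relative_rate(1.0, factor)
    print(f"factor={factor}: early={early:.3f} late={late:.3f}")
```

In this model a factor of 1.05 raises the final write rate by about 5%,
against about 50% for 1.5, which is why I would like smaller values tested.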
>> Another issue I raised is that the load change occurs both with xlog and
>> time triggered checkpoints, and I'm sure it should be applied in both case.
>
> Is this sentence missing a "not"?
Indeed. I think that it makes sense for xlog triggered checkpoints, but
less so with time triggered checkpoints. I may be wrong, but I think that
this deserves careful analysis.
--
Fabien.