I think you're misunderstanding how spread checkpoints work.
Yep, definitely :-) On the other hand, I thought I was seeking something "simple", namely correct latency under a small load, which I would expect out of the box.
What you describe is reasonable, and is more or less what I was hoping for, although I thought that the bgwriter was involved from the start and that the checkpoint would only do what is still needed at the end. My mistake.
If all you want is to avoid the write storms when fsyncs start happening on slow storage, can you not just adjust the kernel vm.dirty* tunables so that the kernel starts writing out dirty buffers much sooner, instead of letting them accumulate until the fsyncs force them all out at once?
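
To make the vm.dirty* suggestion a bit more concrete, here is a rough sketch in Python that reads the current settings on a Linux box and estimates how much dirty data may pile up before the kernel starts writing it back. The helper names (read_vm_tunable, approx_threshold_bytes) are made up for illustration. The estimate assumes the threshold is simply a percentage of the memory figure you pass in, whereas the kernel actually works from the memory it considers dirtyable, so treat the result as an upper-bound guess rather than the exact trigger point.

```python
from pathlib import Path


def read_vm_tunable(name, root="/proc/sys/vm"):
    """Return the integer value of a vm tunable, or None if it cannot be read.

    `name` is something like "dirty_background_ratio" or "dirty_ratio".
    `root` is overridable so the function can be exercised off Linux.
    """
    try:
        return int((Path(root) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def approx_threshold_bytes(mem_bytes, ratio_percent):
    """Estimate a dirty-page threshold as a plain percentage of mem_bytes.

    Assumption: this ignores the kernel's own notion of dirtyable memory,
    so the real threshold is usually somewhat lower.
    """
    return mem_bytes * ratio_percent // 100


def describe(mem_bytes, root="/proc/sys/vm"):
    """Map each ratio tunable to its estimated byte threshold (or None)."""
    result = {}
    for name in ("dirty_background_ratio", "dirty_ratio"):
        ratio = read_vm_tunable(name, root)
        result[name] = None if ratio is None else approx_threshold_bytes(mem_bytes, ratio)
    return result
```

For example, describe(16 * 2**30) on a 16 GiB machine gives a feel for how many bytes can sit dirty in the page cache. If that is far more than the storage can absorb within the time an fsync is expected to take, lowering vm.dirty_background_ratio (or setting vm.dirty_background_bytes to an absolute value) should make writeback start earlier.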