Hello Amit,
> [...]
>> The objective is to help avoid PG stalling when fsyncing on checkpoints,
>> and in general to get better latency-bound performance.
>
> Won't this lead to more-unsorted writes (random I/O) as the
> FlushBuffer requests (by checkpointer or bgwriter) are not sorted as
> per files or order of blocks on disk?
Yep, probably. Under "moderate load" this is not an issue: the I/O
scheduler and the disk firmware will probably reorder writes anyway.
Also, data that are updated together are likely to already be
neighbours in memory as well as on disk.
> I remember sometime back there was some discussion regarding
> sorting writes during checkpoint, one idea could be try to
> check this idea along with that patch. I just saw that Andres has
> also given same suggestion which indicates that it is important
> to see both the things together.
I would rather separate them, unless this is a blocker. This version
already seems quite effective and very light. ISTM that adding a sort phase
would mean significantly reworking how the checkpointer processes pages.
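For reference, the kind of sort phase under discussion could look roughly
like the sketch below. This is only an illustration, not code from either
patch: DirtyPage, dirty_page_cmp and sort_dirty_pages are made-up names,
and I assume that a page can be identified by a (file, block) pair.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor of one dirty page; not a PostgreSQL struct. */
typedef struct DirtyPage
{
    uint32_t file_id;   /* which file the page belongs to */
    uint32_t block_no;  /* block offset within that file */
} DirtyPage;

/* Order by file first, then by block number within the file. */
int
dirty_page_cmp(const void *a, const void *b)
{
    const DirtyPage *pa = a;
    const DirtyPage *pb = b;

    if (pa->file_id != pb->file_id)
        return (pa->file_id < pb->file_id) ? -1 : 1;
    if (pa->block_no != pb->block_no)
        return (pa->block_no < pb->block_no) ? -1 : 1;
    return 0;
}

/* Sort the list of pages to write so each file is written in block order. */
void
sort_dirty_pages(DirtyPage *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    if (n > 1)
        qsort(pages, n, sizeof(DirtyPage), dirty_page_cmp);
}
```

The sort itself is cheap; the rework would be in collecting the list of
pages up front instead of writing them as they are encountered.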
> Also here another related point is that I think currently even fsync
> requests are not in order of the files as they are stored on disk so
> that also might cause random I/O?
I think that currently the fsync is on the file handle, so what happens
depends on how fsync is implemented by the system.
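To illustrate what I mean, here is a minimal sketch using plain POSIX
calls. It is not PostgreSQL code, and write_and_sync is a made-up helper:
the caller only names the file to fsync, and has no say in the order in
which the blocks reach the disk.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: write a buffer to a file and
 * ask the kernel to push that file's dirty data to stable storage.
 * Returns 0 on success, -1 on any error.
 */
int
write_and_sync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is handed over. */
    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n <= 0)
        {
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }

    /*
     * fsync() only designates the file; how and in which order its blocks
     * are flushed is decided by the OS and the storage stack.
     */
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```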
> Yet another idea could be to allow BGWriter to also fsync the dirty
> buffers,
ISTM that this is done by this patch with "bgwriter_flush_to_disk=on".
> that may have side impact of not able to clear the dirty pages at speed
> required by system, but I think if that happens one can think of having
> multiple BGwriter tasks.
--
Fabien.