On Wed, Apr 19, 2023 at 10:23:26AM -0700, Andres Freund wrote:
> Hi,
>
> I noticed that the numbers in pg_stat_io don't quite add up to what I
> expected in write heavy workloads. Particularly for checkpointer, the numbers
> for "write" in log_checkpoints output are larger than what is visible in
> pg_stat_io.
>
> That partially is because log_checkpoints' "write" covers way too many things,
> but there's an issue with pg_stat_io as well:
>
> Checkpoints, and some other sources of writes, will often end up doing a lot
> of smgrwriteback() calls - which pg_stat_io doesn't track. Nor do any
> pre-existing forms of IO statistics.
>
> It seems pretty clear that we should track writeback as well. I wonder if it's
> worth doing so for 16? It'd give a more complete picture that way. The
> counter-argument I see is that we didn't track the time for it in existing
> stats either, and that nobody complained - but I suspect that's mostly because
> nobody knew to look.

Not complaining about making pg_stat_io more accurate, but what exactly
would we be tracking for smgrwriteback()? I assume you are talking about
IO timing. AFAICT, on Linux, it does sync_file_range() with
SYNC_FILE_RANGE_WRITE, which is asynchronous. Wouldn't we just be
tracking the system call overhead time?
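
To make the question concrete, here is a rough standalone sketch of what
timing that call would measure. It is not PostgreSQL code: the helper names
time_writeback_request() and demo_writeback_timing(), the temp file path, and
the 1 MiB size are all made up for illustration. It assumes Linux with glibc.
The timer only brackets the sync_file_range() call, so it captures the time to
request write-out, not the time for the data to reach storage (though the man
page does note the call can still block if the request queue is full).

```c
#define _GNU_SOURCE             /* needed for sync_file_range() in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: time one sync_file_range(SYNC_FILE_RANGE_WRITE) call.
 * Returns elapsed wall-clock microseconds; *err is 0 or the errno of the call.
 */
static double
time_writeback_request(int fd, off_t offset, off_t nbytes, int *err)
{
    struct timespec start, end;
    int         rc;

    assert(nbytes > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    /* Only asks the kernel to start write-out; does not wait for completion. */
    rc = sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
    *err = (rc == 0) ? 0 : errno;
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (end.tv_sec - start.tv_sec) * 1e6 +
           (end.tv_nsec - start.tv_nsec) / 1e3;
}

/*
 * Hypothetical demo: dirty 1 MiB in an unlinked temp file, then time the
 * writeback request. Returns -1.0 (with *err set) if the setup itself fails.
 */
static double
demo_writeback_timing(int *err)
{
    char        path[] = "/tmp/writeback_demo_XXXXXX";
    static char buf[1 << 20];
    double      us;
    int         fd = mkstemp(path);

    if (fd < 0)
    {
        *err = errno;
        return -1.0;
    }
    unlink(path);               /* file disappears once fd is closed */

    memset(buf, 'x', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
    {
        *err = errno ? errno : EIO;
        close(fd);
        return -1.0;
    }

    us = time_writeback_request(fd, 0, (off_t) sizeof(buf), err);
    close(fd);
    return us;
}
```

Calling demo_writeback_timing(&err) and comparing the result against the same
measurement wrapped around fsync() would show how much of the real IO cost the
writeback timing misses, at least on an idle system.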

- Melanie