Hi,
While working on checkpoint related stuff, I have encountered that
there is some inconsistency while reporting checkpointer stats. When a
checkpoint gets completed, a checkpoint complete message gets logged.
This message has a lot of information including the buffers written
(CheckpointStats.ckpt_bufs_written). This variable gets incremented in
2 cases. First is in BufferSync() and the second is in
SlruInternalWritePage(). On the other hand the checkpointer stats
exposed using pg_stat_bgwriter contains a lot of information including
buffers written (PendingCheckpointerStats.buf_written_checkpoints).
This variable gets incremented in only one place and that is in
BufferSync(). So there is inconsistent behaviour between these two
data. Please refer to the sample output below.
postgres=# select * from pg_stat_bgwriter;
checkpoints_timed | checkpoints_req | checkpoint_write_time |
checkpoint_sync_time | buffers_checkpoint | buffers_clean |
maxwritten_clean | buffers_backend | buffers_backend_fsync |
buffers_alloc | stats_reset
-------------------+-----------------+-----------------------+----------------------+--------------------+---------------+------------------+-----------------+-----------------------+---------------+-------------------------------
0 | 1 | 75 |
176 | 4702 | 0 | 0 |
4656 | 0 | 5023 | 2022-12-14
07:01:01.494672+00
2022-12-14 07:03:18.052 UTC [6087] LOG: checkpoint starting:
immediate force wait
2022-12-14 07:03:18.370 UTC [6087] LOG: checkpoint complete: wrote
4705 buffers (28.7%); 0 WAL file(s) added, 0 removed, 4 recycled;
write=0.075 s, sync=0.176 s, total=0.318 s; sync files=34,
longest=0.159 s, average=0.006 s; distance=66180 kB, estimate=66180
kB; lsn=0/5565E38, redo lsn=0/5565E00
In order to fix this, the
PendingCheckpointerStats.buf_written_checkpoints should be incremented
in SlruInternalWritePage() similar to
CheckpointStats.ckpt_bufs_written. I have attached the patch for the
same. Please share your thoughts.
Thanks & Regards,
Nitin Jadhav