Inconsistency in reporting checkpointer stats - Mailing list pgsql-hackers

From Nitin Jadhav
Subject Inconsistency in reporting checkpointer stats
Date
Msg-id CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
Whole thread Raw
Responses Re: Inconsistency in reporting checkpointer stats  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Re: Inconsistency in reporting checkpointer stats  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

While working on checkpoint related stuff, I have encountered that
there is some inconsistency while reporting checkpointer stats. When a
checkpoint gets completed, a checkpoint complete message gets logged.
This message has a lot of information including the buffers written
(CheckpointStats.ckpt_bufs_written). This variable gets incremented in
2 cases. First is in BufferSync() and the second is in
SlruInternalWritePage(). On the other hand the checkpointer stats
exposed using pg_stat_bgwriter contains a lot of information including
buffers written (PendingCheckpointerStats.buf_written_checkpoints).
This variable gets incremented in only one place and that is in
BufferSync(). So there is inconsistent behaviour between these two
data. Please refer to the sample output below.

postgres=# select * from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | checkpoint_write_time |
checkpoint_sync_time | buffers_checkpoint | buffers_clean |
maxwritten_clean | buffers_backend | buffers_backend_fsync |
buffers_alloc |          stats_reset

-------------------+-----------------+-----------------------+----------------------+--------------------+---------------+------------------+-----------------+-----------------------+---------------+-------------------------------
                 0 |               1 |                    75 |
         176 |               4702 |             0 |                0 |
           4656 |                     0 |          5023 | 2022-12-14
07:01:01.494672+00

2022-12-14 07:03:18.052 UTC [6087] LOG:  checkpoint starting:
immediate force wait
2022-12-14 07:03:18.370 UTC [6087] LOG:  checkpoint complete: wrote
4705 buffers (28.7%); 0 WAL file(s) added, 0 removed, 4 recycled;
write=0.075 s, sync=0.176 s, total=0.318 s; sync files=34,
longest=0.159 s, average=0.006 s; distance=66180 kB, estimate=66180
kB; lsn=0/5565E38, redo lsn=0/5565E00


In order to fix this, the
PendingCheckpointerStats.buf_written_checkpoints should be incremented
in SlruInternalWritePage() similar to
CheckpointStats.ckpt_bufs_written. I have attached the patch for the
same. Please share your thoughts.


Thanks & Regards,
Nitin Jadhav

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Common function for percent placeholder replacement
Next
From: Amit Langote
Date:
Subject: Re: generic plans and "initial" pruning