[PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer - Mailing list pgsql-hackers
| From | BharatDB |
|---|---|
| Subject | [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer |
| Date | |
| Msg-id | CAAh00EQCZCApECHO8jGixXGRqE=WKSsJo_yjNQODPTOy29FNYw@mail.gmail.com Whole thread Raw |
| List | pgsql-hackers |
Hi all,
While debugging checkpointer write behavior (related to the write-combining discussions)[1], I recently found some of the enhancements related to extending PostgreSQL's internal statistics . By including contextual information to checkpoint completion logs, when `log_checkpoints` is enabled we can now know the checkpoint type (manual, timed and immediate), checkpoint duration and checkpoint completion timestamp through SQL for better monitoring and analysis. I think it would be helpful to understand more clearly about the current state of a bug and for further review on future bugs. I had discussed the same with one of the contributors of PostgreSQL and as per the suggestion, I am submitting a small independent patch that adds two additional fields to the pg_stat_checkpointer log.
1. Patch features: I introduced these two fields internally as newly added statistics and was able to surface them through SQL. The added statistics were:
a. last_checkpoint_time : Exposes the completion timestamp of the most recent checkpoint and is useful for monitoring checkpoint frequency and diagnosing irregular checkpoint behavior.
b. checkpoint_total_time: Reports the total time spent in the last completed checkpoint and helps administrators detect unexpectedly long or stalled checkpoints.
Soumya
BharatDB
References:
[1] https://www.postgresql.org/message-id/flat/CAAKRu_ZBzTp-o4pu1UwmpLWkFmAmP7dyGFo867HxMo-AF%2B%3DMDw%40mail.gmail.com#854baf9be7d87a812a1007d0918f8ed8
While debugging checkpointer write behavior (related to the write-combining discussions)[1], I recently found some of the enhancements related to extending PostgreSQL's internal statistics . By including contextual information to checkpoint completion logs, when `log_checkpoints` is enabled we can now know the checkpoint type (manual, timed and immediate), checkpoint duration and checkpoint completion timestamp through SQL for better monitoring and analysis. I think it would be helpful to understand more clearly about the current state of a bug and for further review on future bugs. I had discussed the same with one of the contributors of PostgreSQL and as per the suggestion, I am submitting a small independent patch that adds two additional fields to the pg_stat_checkpointer log.
1. Patch features: I introduced these two fields internally as newly added statistics and was able to surface them through SQL. The added statistics were:
a. last_checkpoint_time : Exposes the completion timestamp of the most recent checkpoint and is useful for monitoring checkpoint frequency and diagnosing irregular checkpoint behavior.
b. checkpoint_total_time: Reports the total time spent in the last completed checkpoint and helps administrators detect unexpectedly long or stalled checkpoints.
I am attaching my observations, screenshots and patch in support for this.
Observations:
1. Terminal output for type of checkpoint occurred:
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -i -s 50 -p 55432 -d postgres
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 7.49 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 5.52 s, vacuum 0.20 s, primary keys 1.77 s).
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -c 8 -j 8 -T 120 -p 55432 -d postgres
pgbench (19devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 120 s
number of transactions actually processed: 228099
number of failed transactions: 0 (0.000%)
latency average = 4.209 ms
initial connection time = 7.762 ms
tps = 1900.858503 (without initial connection time)
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# CHECKPOINT;
CHECKPOINT
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint complete" "$PGDATA/logfile" | tail -1
[sudo] password for soumya:
2025-11-20 11:51:06.128 IST [18026] LOG: checkpoint complete (immediate): wrote 7286 buffers (44.5%), wrote 4 SLRU buffers; 0 WAL file(s) added, 0 removed, 27 recycled; write=0.095 s, sync=0.034 s, total=0.279 s; sync files=17, longest=0.004 s, average=0.002 s; distance=447382 kB, estimate=531349 kB; lsn=0/7F4EDED8, redo lsn=0/7F4EDE80
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$
2. Terminal output for type of checkpoint duration and checkpoint completion timestamp:
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175119 | 2025-11-20 11:51:06.128918+05:30
(1 row)
postgres=# CHECKPOINT;
CHECKPOINT
postgres=# SELECT pg_sleep(1);
pg_sleep
----------
(1 row)
postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175138 | 2025-11-20 11:58:02.879149+05:30
(1 row)
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint complete" "$PGDATA/logfile" | tail -1
2025-11-20 11:58:02.879 IST [18026] LOG: checkpoint complete (immediate): wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.019 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=478214 kB; lsn=0/7F4EDFE0, redo lsn=0/7F4EDF88
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$
Observations:
1. Terminal output for type of checkpoint occurred:
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -i -s 50 -p 55432 -d postgres
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...
creating primary keys...
done in 7.49 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 5.52 s, vacuum 0.20 s, primary keys 1.77 s).
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ pgbench -c 8 -j 8 -T 120 -p 55432 -d postgres
pgbench (19devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 120 s
number of transactions actually processed: 228099
number of failed transactions: 0 (0.000%)
latency average = 4.209 ms
initial connection time = 7.762 ms
tps = 1900.858503 (without initial connection time)
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# CHECKPOINT;
CHECKPOINT
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint complete" "$PGDATA/logfile" | tail -1
[sudo] password for soumya:
2025-11-20 11:51:06.128 IST [18026] LOG: checkpoint complete (immediate): wrote 7286 buffers (44.5%), wrote 4 SLRU buffers; 0 WAL file(s) added, 0 removed, 27 recycled; write=0.095 s, sync=0.034 s, total=0.279 s; sync files=17, longest=0.004 s, average=0.002 s; distance=447382 kB, estimate=531349 kB; lsn=0/7F4EDED8, redo lsn=0/7F4EDE80
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$
2. Terminal output for type of checkpoint duration and checkpoint completion timestamp:
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ psql -p 55432 -d postgres
psql (19devel)
Type "help" for help.
postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175119 | 2025-11-20 11:51:06.128918+05:30
(1 row)
postgres=# CHECKPOINT;
CHECKPOINT
postgres=# SELECT pg_sleep(1);
pg_sleep
----------
(1 row)
postgres=# SELECT checkpoint_total_time, last_checkpoint_time FROM pg_stat_checkpointer;
checkpoint_total_time | last_checkpoint_time
-----------------------+----------------------------------
175138 | 2025-11-20 11:58:02.879149+05:30
(1 row)
postgres=# \q
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$ sudo grep "checkpoint complete" "$PGDATA/logfile" | tail -1
2025-11-20 11:58:02.879 IST [18026] LOG: checkpoint complete (immediate): wrote 0 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.019 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=478214 kB; lsn=0/7F4EDFE0, redo lsn=0/7F4EDF88
soumya@soumya:~/SoumyaSM/BharatDB/my_postgres$
I hope this will be helpful for proceeding further. Looking forward to more feedback.Thanking you.Regards,Soumya
BharatDB
References:
[1] https://www.postgresql.org/message-id/flat/CAAKRu_ZBzTp-o4pu1UwmpLWkFmAmP7dyGFo867HxMo-AF%2B%3DMDw%40mail.gmail.com#854baf9be7d87a812a1007d0918f8ed8
Attachment
pgsql-hackers by date: