Magnus Hagander <magnus@hagander.net> writes:
>> Therefore, reporting the checkpoint progress in the server logs, much
>> like [1], seems to be the best way IMO.
> I find progress reporting in the logfile to generally be a terrible
> way of doing things, and the fact that we do it for the startup
> process is/should be only because we have no other choice, not because
> it's the right choice.
I'm already pretty seriously unhappy about the log-spamming effects of 64da07c41 (default to log_checkpoints=on), and am willing to lay a side bet that that gets reverted after we have some field experience with it. This proposal seems far worse from that standpoint. Keep in mind that our out-of-the-box logging configuration still doesn't have any log rotation ability, which means that the noisier the server is in normal operation, the sooner you fill your disk.
The server is not open for queries while it runs the end-of-recovery checkpoint, so a catalog view may not help there, but a process title change or logging would be helpful in such cases. While the server is in recovery, anxious customers repeatedly ask for an ETA on recovery completion, and having no visibility into these operations makes life difficult for the DBA/operations staff.
> I think the right choice to solve the *general* problem is the
> mentioned pg_stat_progress_checkpoints.
+1
+1 to this. We need at least a trace of the number of buffers to sync (num_to_scan) before the checkpoint starts, instead of just emitting the stats at the end.
Bharat, it would be good to show the buffers-synced counter, the total buffers to sync, the checkpointer pid, the substep it is running, whether it is on target for completion, and the checkpoint reason (manual/timed/forced). BufferSync has several variables tracking the sync progress locally, and we may need some refactoring here.
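
To make the list of fields above concrete, here is a rough standalone sketch of what such a progress record might carry. Every name in it (CkptProgress, CkptReason, CkptPhase, ckpt_on_target, and so on) is hypothetical and is not taken from the PostgreSQL source or from any posted patch; the "on target" test is just one simple assumption, comparing the fraction of buffers synced against the fraction of the time budget already used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reason codes, mirroring the manual/timed/forced split. */
typedef enum CkptReason
{
    CKPT_REASON_MANUAL,
    CKPT_REASON_TIMED,
    CKPT_REASON_FORCED
} CkptReason;

/* Hypothetical substeps; a real patch would define its own phases. */
typedef enum CkptPhase
{
    CKPT_PHASE_INIT,
    CKPT_PHASE_SYNC_BUFFERS,
    CKPT_PHASE_FINALIZE
} CkptPhase;

/* Hypothetical progress record, one per running checkpoint. */
typedef struct CkptProgress
{
    int         pid;              /* checkpointer pid */
    CkptReason  reason;           /* why the checkpoint started */
    CkptPhase   phase;            /* substep currently running */
    long        buffers_total;    /* buffers to sync (num_to_scan) */
    long        buffers_synced;   /* buffers synced so far */
    double      elapsed_fraction; /* share of the time budget used, 0..1 */
} CkptProgress;

/* Fraction of the buffer work completed, in the range 0..1. */
double
ckpt_progress_fraction(const CkptProgress *p)
{
    assert(p->buffers_synced >= 0 && p->buffers_synced <= p->buffers_total);
    if (p->buffers_total == 0)
        return 1.0;             /* nothing to sync counts as done */
    return (double) p->buffers_synced / (double) p->buffers_total;
}

/* Assumed rule: on target if work done keeps pace with time used. */
bool
ckpt_on_target(const CkptProgress *p)
{
    return ckpt_progress_fraction(p) >= p->elapsed_fraction;
}
```

With 250 of 1000 buffers synced and 20% of the time budget used, this sketch would report the checkpoint as on target; at 50% of the budget used it would not. Filling in buffers_total before the sync loop starts is the part that would need the BufferSync refactoring mentioned above.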