Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers
From | Nitin Jadhav |
---|---|
Subject | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
Date | |
Msg-id | CAMm1aWY_wThAxmhC_7mkFsYHcNP2_JQLAKJEoY5EcHBi0Q42QQ@mail.gmail.com Whole thread Raw |
In response to | Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
List | pgsql-hackers |
> > We need at least a trace of the number of buffers to sync (num_to_scan) before the checkpoint start, instead of justemitting the stats at the end. > > > > Bharat, it would be good to show the buffers synced counter and the total buffers to sync, checkpointer pid, substepit is running, whether it is on target for completion, checkpoint_Reason > > (manual/times/forced). BufferSync has several variables tracking the sync progress locally, and we may need some refactoringhere. > > I agree to provide above mentioned information as part of showing the > progress of current checkpoint operation. I am currently looking into > the code to know if any other information can be added. Here is the initial patch to show the progress of checkpoint through pg_stat_progress_checkpoint view. Please find the attachment. The information added to this view are pid - process ID of a CHECKPOINTER process, kind - kind of checkpoint indicates the reason for checkpoint (values can be wal, time or force), phase - indicates the current phase of checkpoint operation, total_buffer_writes - total number of buffers to be written, buffers_processed - number of buffers processed, buffers_written - number of buffers written, total_file_syncs - total number of files to be synced, files_synced - number of files synced. There are many operations happen as part of checkpoint. For each of the operation I am updating the phase field of pg_stat_progress_checkpoint view. The values supported for this field are initializing, checkpointing replication slots, checkpointing snapshots, checkpointing logical rewrite mappings, checkpointing CLOG pages, checkpointing CommitTs pages, checkpointing SUBTRANS pages, checkpointing MULTIXACT pages, checkpointing SLRU pages, checkpointing buffers, performing sync requests, performing two phase checkpoint, recycling old XLOG files and Finalizing. In case of checkpointing buffers phase, the fields total_buffer_writes, buffers_processed and buffers_written shows the detailed progress of writing buffers. In case of performing sync requests phase, the fields total_file_syncs and files_synced shows the detailed progress of syncing files. In other phases, only the phase field is getting updated and it is difficult to show the progress because we do not get the total number of files count without traversing the directory. It is not worth to calculate that as it affects the performance of the checkpoint. I also gave a thought to just mention the number of files processed, but this wont give a meaningful progress information (It can be treated as statistics). Hence just updating the phase field in those scenarios. Apart from above fields, I am planning to add few more fields to the view in the next patch. That is, process ID of the backend process which triggered a CHECKPOINT command, checkpoint start location, filed to indicate whether it is a checkpoint or restartpoint and elapsed time of the checkpoint operation. Please share your thoughts. I would be happy to add any other information that contributes to showing the progress of checkpoint. As per the discussion in this thread, there should be some mechanism to show the progress of checkpoint during shutdown and end-of-recovery cases as we cannot access pg_stat_progress_checkpoint in those cases. I am working on this to use log_startup_progress_interval mechanism to log the progress in the server logs. Kindly review the patch and share your thoughts. On Fri, Jan 28, 2022 at 12:24 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Jan 21, 2022 at 11:07 AM Nitin Jadhav > <nitinjadhavpostgres@gmail.com> wrote: > > > > > I think the right choice to solve the *general* problem is the > > > mentioned pg_stat_progress_checkpoints. > > > > > > We may want to *additionally* have the ability to log the progress > > > specifically for the special cases when we're not able to use that > > > view. And in those case, we can perhaps just use the existing > > > log_startup_progress_interval parameter for this as well -- at least > > > for the startup checkpoint. > > > > +1 > > > > > We need at least a trace of the number of buffers to sync (num_to_scan) before the checkpoint start, instead of justemitting the stats at the end. > > > > > > Bharat, it would be good to show the buffers synced counter and the total buffers to sync, checkpointer pid, substepit is running, whether it is on target for completion, checkpoint_Reason > > > (manual/times/forced). BufferSync has several variables tracking the sync progress locally, and we may need some refactoringhere. > > > > I agree to provide above mentioned information as part of showing the > > progress of current checkpoint operation. I am currently looking into > > the code to know if any other information can be added. > > As suggested in the other thread by Julien, I'm changing the subject > of this thread to reflect the discussion. > > Regards, > Bharath Rupireddy.
Attachment
pgsql-hackers by date: