Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
Date | |
Msg-id | CALj2ACXDpgbVBt48D+2X4Hb7XvROMAg7EnbM-tHB+gPoHvEQew@mail.gmail.com Whole thread Raw |
In response to | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
|
List | pgsql-hackers |
On Sun, Feb 27, 2022 at 8:44 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Feb 25, 2022 at 8:38 PM Nitin Jadhav > <nitinjadhavpostgres@gmail.com> wrote: > > Had a quick look over the v3 patch. I'm not sure if it's the best way > to have pg_stat_get_progress_checkpoint_type, > pg_stat_get_progress_checkpoint_kind and > pg_stat_get_progress_checkpoint_start_time just for printing info in > readable format in pg_stat_progress_checkpoint. I don't think these > functions will ever be useful for the users. > > 1) Can't we use pg_is_in_recovery to determine if it's a restartpoint > or checkpoint instead of having a new function > pg_stat_get_progress_checkpoint_type? > > 2) Can't we just have these checks inside CASE-WHEN-THEN-ELSE blocks > directly instead of new function pg_stat_get_progress_checkpoint_kind? > + snprintf(ckpt_kind, MAXPGPATH, "%s%s%s%s%s%s%s%s%s", > + (flags == 0) ? "unknown" : "", > + (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "", > + (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "", > + (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "", > + (flags & CHECKPOINT_FORCE) ? "force " : "", > + (flags & CHECKPOINT_WAIT) ? "wait " : "", > + (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "", > + (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "", > + (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : ""); > > 3) Why do we need this extra calculation for start_lsn? Do you ever > see a negative LSN or something here? > + ('0/0'::pg_lsn + ( > + CASE > + WHEN (s.param3 < 0) THEN pow((2)::numeric, (64)::numeric) > + ELSE (0)::numeric > + END + (s.param3)::numeric)) AS start_lsn, > > 4) Can't you use timestamptz_in(to_char(s.param4)) instead of > pg_stat_get_progress_checkpoint_start_time? I don't quite understand > the reasoning for having this function and it's named as *checkpoint* > when it doesn't do anything specific to the checkpoint at all? > > Having 3 unnecessary functions that aren't useful to the users at all > in proc.dat will simply eatup the function oids IMO. Hence, I suggest > let's try to do without extra functions. Another thought for my review comment: > 1) Can't we use pg_is_in_recovery to determine if it's a restartpoint > or checkpoint instead of having a new function > pg_stat_get_progress_checkpoint_type? I don't think using pg_is_in_recovery work here as it is taken after the checkpoint has started. So, I think the right way here is to send 1 in CreateCheckPoint and 2 in CreateRestartPoint and use CASE-WHEN-ELSE-END to show "1": "checkpoint" "2":"restartpoint". Continuing my review: 5) Do we need a special phase for this checkpoint operation? I'm not sure in which cases it will take a long time, but it looks like there's a wait loop here. vxids = GetVirtualXIDsDelayingChkpt(&nvxids); if (nvxids > 0) { do { pg_usleep(10000L); /* wait for 10 msec */ } while (HaveVirtualXIDsDelayingChkpt(vxids, nvxids)); } Also, how about special phases for SyncPostCheckpoint(), SyncPreCheckpoint(), InvalidateObsoleteReplicationSlots(), PreallocXlogFiles() (it currently pre-allocates only 1 WAL file, but it might be increase in future (?)), TruncateSUBTRANS()? 6) SLRU (Simple LRU) isn't a phase here, you can just say PROGRESS_CHECKPOINT_PHASE_PREDICATE_LOCK_PAGES. + + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE, + PROGRESS_CHECKPOINT_PHASE_SLRU_PAGES); CheckPointPredicate(); And :s/checkpointing SLRU pages/checkpointing predicate lock pages + WHEN 9 THEN 'checkpointing SLRU pages' 7) :s/PROGRESS_CHECKPOINT_PHASE_FILE_SYNC/PROGRESS_CHECKPOINT_PHASE_PROCESS_FILE_SYNC_REQUESTS And :s/WHEN 11 THEN 'performing sync requests'/WHEN 11 THEN 'processing file sync requests' 8) :s/Finalizing/finalizing + WHEN 14 THEN 'Finalizing' 9) :s/checkpointing snapshots/checkpointing logical replication snapshot files + WHEN 3 THEN 'checkpointing snapshots' :s/checkpointing logical rewrite mappings/checkpointing logical replication rewrite mapping files + WHEN 4 THEN 'checkpointing logical rewrite mappings' 10) I'm not sure if it's discussed, how about adding the number of snapshot/mapping files so far the checkpoint has processed in file processing while loops of CheckPointSnapBuild/CheckPointLogicalRewriteHeap? Sometimes, there can be many logical snapshot or mapping files and users may be interested in knowing the so-far-processed-file-count. 11) I think it's discussed, are we going to add the pid of the checkpoint requestor? Regards, Bharath Rupireddy.
pgsql-hackers by date: