Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date
Msg-id 20220218074351.wgfosjt5lql2vjbr@jrouhaud
Whole thread Raw
In response to Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
Responses Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
List pgsql-hackers
Hi,

On Fri, Feb 18, 2022 at 12:20:26PM +0530, Nitin Jadhav wrote:
> >
> > If there's a checkpoint timed triggered and then someone calls
> > pg_start_backup() which then wait for the end of the current checkpoint
> > (possibly after changing the flags), I think the view should reflect that in
> > some way.  Maybe storing an array of (pid, flags) is too much, but at least a
> > counter with the number of processes actively waiting for the end of the
> > checkpoint.
> 
> Okay. I feel this can be added as additional field but it will not
> replace backend_pid field as this represents the pid of the backend
> which triggered the current checkpoint.

I don't think that's true.  Requesting a checkpoint means telling the
checkpointer that it should wake up and start a checkpoint (or restore point)
if it's not already doing so, so the pid will always be the checkpointer pid.
The only exception is a standalone backend, but in that case you won't be able
to query that view anyway.

And also while looking at the patch I see there's the same problem that I
mentioned in the previous thread, which is that the effective flags can be
updated once the checkpoint started, and as-is the view won't reflect that.  It
also means that you can't simply display one of wal, time or force but a
possible combination of the flags (including the one not handled in v1).

> Probably a new field named 'processes_wiating' or 'events_waiting' can be
> added for this purpose.

Maybe num_process_waiting?

> > > > > 'checkpoint or restartpoint?'
> > > >
> > > > Do you actually need to store that?  Can't it be inferred from
> > > > pg_is_in_recovery()?
> > >
> > > AFAIK we cannot use pg_is_in_recovery() to predict whether it is a
> > > checkpoint or restartpoint because if the system exits from recovery
> > > mode during restartpoint then any query to pg_stat_progress_checkpoint
> > > view will return it as a checkpoint which is ideally not correct. Please
> > > correct me if I am wrong.
> >
> > Recovery ends with an end-of-recovery checkpoint that has to finish before the
> > promotion can happen, so I don't think that a restart can still be in progress
> > if pg_is_in_recovery() returns false.
> 
> Probably writing of buffers or syncing files may complete before
> pg_is_in_recovery() returns false. But there are some cleanup
> operations happen as part of the checkpoint. During this scenario, we
> may get false value for pg_is_in_recovery(). Please refer following
> piece of code which is present in CreateRestartpoint().
> 
> if (!RecoveryInProgress())
>         replayTLI = XLogCtl->InsertTimeLineID;

Then maybe we could store the timeline rather then then kind of checkpoint?
You should still be able to compute the information while giving a bit more
information for the same memory usage.



pgsql-hackers by date:

Previous
From: Mikael Kjellström
Date:
Subject: Re: Time to drop plpython2?
Next
From: Michael Paquier
Date:
Subject: Re: improve --with-lz4 install documentation