Thread: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

07 December 2021, 14:36:22

Hi,

Currently one can know the kind of on-going/last checkpoint (shutdown,
end-of-recovery, immediate, force etc.) only via server logs that to
when log_checkpoints GUC is on. At times, the users/service layer
components would want to know the kind of checkpoint (along with other
checkpoint related info) to take some actions and it will be a bit
difficult to search through the server logs. The checkpoint info can
be obtained from the control file (either by pg_control_checkpoint()
or by pg_controldata tool) whereas checkpoint kind isn't available
there.

How about we add an extra string field to the control file alongside
the other checkpoint info it already has? This way, the existing
pg_control_checkpoint() or pg_controldata tool can easily be enhanced
to show the checkpoint kind as well. One concern is that we don't want
to increase the size of pg_controldata by more than the typical block
size (of 8K) to avoid any torn-writes. With this change, we might add
at max the strings specified at [1]. Adding it to the control file has
an advantage of preserving the last checkpoint kind which might be
useful.

Thoughts?

[1] for checkpoint: "checkpoint shutdown end-of-recovery immediate
force wait wal time flush-all"
for restartpoint: "restartpoint shutdown end-of-recovery immediate
force wait wal time flush-all"

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Tomas Vondra

Date:

07 December 2021, 22:18:37

On 12/7/21 15:36, Bharath Rupireddy wrote:
> Hi,
> 
> Currently one can know the kind of on-going/last checkpoint (shutdown,
> end-of-recovery, immediate, force etc.) only via server logs that to
> when log_checkpoints GUC is on. At times, the users/service layer
> components would want to know the kind of checkpoint (along with other
> checkpoint related info) to take some actions and it will be a bit
> difficult to search through the server logs. The checkpoint info can
> be obtained from the control file (either by pg_control_checkpoint()
> or by pg_controldata tool) whereas checkpoint kind isn't available
> there.
> 
> How about we add an extra string field to the control file alongside
> the other checkpoint info it already has? This way, the existing
> pg_control_checkpoint() or pg_controldata tool can easily be enhanced
> to show the checkpoint kind as well. One concern is that we don't want
> to increase the size of pg_controldata by more than the typical block
> size (of 8K) to avoid any torn-writes. With this change, we might add
> at max the strings specified at [1]. Adding it to the control file has
> an advantage of preserving the last checkpoint kind which might be
> useful.
> 
> Thoughts?
> 

I agree it might be useful to provide information about the nature of
the checkpoint, and perhaps even PID of the backend that triggered it
(although that may be tricky, if the backend terminates).

I'm not sure about adding it to control data, though. That doesn't seem
like a very good match for something that's mostly for monitoring.

We already have some checkpoint info in pg_stat_bgwriter, but that's
just aggregated data, not very convenient for info about the current
checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
various info about the current checkpoint?

> [1] for checkpoint: "checkpoint shutdown end-of-recovery immediate
> force wait wal time flush-all"
> for restartpoint: "restartpoint shutdown end-of-recovery immediate
> force wait wal time flush-all"
> 

I'd bet squashing all of this into a single string (not really a flag)
will just mean people will have to parse it, etc. Keeping individual
boolean flags (or even separate string fields) would be better, I guess.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

08 December 2021, 01:54:34

On Wed, Dec 8, 2021 at 3:48 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 12/7/21 15:36, Bharath Rupireddy wrote:
> > Hi,
> >
> > Currently one can know the kind of on-going/last checkpoint (shutdown,
> > end-of-recovery, immediate, force etc.) only via server logs that to
> > when log_checkpoints GUC is on. At times, the users/service layer
> > components would want to know the kind of checkpoint (along with other
> > checkpoint related info) to take some actions and it will be a bit
> > difficult to search through the server logs. The checkpoint info can
> > be obtained from the control file (either by pg_control_checkpoint()
> > or by pg_controldata tool) whereas checkpoint kind isn't available
> > there.
> >
> > How about we add an extra string field to the control file alongside
> > the other checkpoint info it already has? This way, the existing
> > pg_control_checkpoint() or pg_controldata tool can easily be enhanced
> > to show the checkpoint kind as well. One concern is that we don't want
> > to increase the size of pg_controldata by more than the typical block
> > size (of 8K) to avoid any torn-writes. With this change, we might add
> > at max the strings specified at [1]. Adding it to the control file has
> > an advantage of preserving the last checkpoint kind which might be
> > useful.
> >
> > Thoughts?
> >
>
> I agree it might be useful to provide information about the nature of
> the checkpoint, and perhaps even PID of the backend that triggered it
> (although that may be tricky, if the backend terminates).

Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
triggered backend can possibly go there (we can mention in the
documentation that the backend that triggered the checkpoint can get
terminated).

> I'm not sure about adding it to control data, though. That doesn't seem
> like a very good match for something that's mostly for monitoring.

Having it in the control data file (along with the existing checkpoint
information) will be helpful to know what was the last checkpoint
information and we can use the existing pg_control_checkpoint function
or the tool to emit that info. I plan to add an int16 flag as
suggested by Justin in this thread and come up with a patch.

> We already have some checkpoint info in pg_stat_bgwriter, but that's
> just aggregated data, not very convenient for info about the current
> checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
> various info about the current checkpoint?

+1 to have pg_stat_progress_checkpoint view to know what's going on
with the current checkpoint.

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

08 December 2021, 01:54:41

On Wed, Dec 8, 2021 at 4:07 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > I'd bet squashing all of this into a single string (not really a flag)
> > will just mean people will have to parse it, etc. Keeping individual
> > boolean flags (or even separate string fields) would be better, I guess.
>
> Since the size of controldata is a concern, I suggest to add an int16 to
> populate with flags, which pg_controldata can parse for display.

+1. I will come up with a patch soon.

> Note that this other patch intends to add the timestamp and LSN of most recent
> recovery to controldata.
> https://commitfest.postgresql.org/36/3183/

Thanks. I will go through it separately.

> > We already have some checkpoint info in pg_stat_bgwriter, but that's
> > just aggregated data, not very convenient for info about the current
> > checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
> > various info about the current checkpoint?
>
> It could go into the pg_stat_checkpointer view, which would be the culmination
> of another patch (cc Melanie).
> https://commitfest.postgresql.org/36/3272/

+1 to have pg_stat_progress_checkpoint view. We have
CheckpointStatsData already. What we need is to make this structure
shared and add a little more info to represent the progress, so that
the other backends can access it. I think we can discuss this in a
separate thread to give it a fresh try rather than proposing this as a
part of another thread. I will spend some time on
pg_stat_progress_checkpoint proposal and try to come up with a
separate thread to discuss.

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Tomas Vondra

Date:

08 December 2021, 02:04:23


On 12/8/21 02:54, Bharath Rupireddy wrote:
> On Wed, Dec 8, 2021 at 3:48 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 12/7/21 15:36, Bharath Rupireddy wrote:
>>> Hi,
>>>
>>> Currently one can know the kind of on-going/last checkpoint (shutdown,
>>> end-of-recovery, immediate, force etc.) only via server logs that to
>>> when log_checkpoints GUC is on. At times, the users/service layer
>>> components would want to know the kind of checkpoint (along with other
>>> checkpoint related info) to take some actions and it will be a bit
>>> difficult to search through the server logs. The checkpoint info can
>>> be obtained from the control file (either by pg_control_checkpoint()
>>> or by pg_controldata tool) whereas checkpoint kind isn't available
>>> there.
>>>
>>> How about we add an extra string field to the control file alongside
>>> the other checkpoint info it already has? This way, the existing
>>> pg_control_checkpoint() or pg_controldata tool can easily be enhanced
>>> to show the checkpoint kind as well. One concern is that we don't want
>>> to increase the size of pg_controldata by more than the typical block
>>> size (of 8K) to avoid any torn-writes. With this change, we might add
>>> at max the strings specified at [1]. Adding it to the control file has
>>> an advantage of preserving the last checkpoint kind which might be
>>> useful.
>>>
>>> Thoughts?
>>>
>>
>> I agree it might be useful to provide information about the nature of
>> the checkpoint, and perhaps even PID of the backend that triggered it
>> (although that may be tricky, if the backend terminates).
> 
> Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
> triggered backend can possibly go there (we can mention in the
> documentation that the backend that triggered the checkpoint can get
> terminated).
> 

My concern is someone might run something that requires a checkpoint, so
we start it and put the PID into the catalog. And then the person aborts
the command and starts doing something else. But that does not abort the
checkpoint, but the backend now runs something that doesn't requite
checkpoint, which is rather confusing.

>> I'm not sure about adding it to control data, though. That doesn't seem
>> like a very good match for something that's mostly for monitoring.
> 
> Having it in the control data file (along with the existing checkpoint
> information) will be helpful to know what was the last checkpoint
> information and we can use the existing pg_control_checkpoint function
> or the tool to emit that info. I plan to add an int16 flag as
> suggested by Justin in this thread and come up with a patch.
> 

OK, although I'm not sure it's all that useful (if we have that in some
sort of system view).

>> We already have some checkpoint info in pg_stat_bgwriter, but that's
>> just aggregated data, not very convenient for info about the current
>> checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
>> various info about the current checkpoint?
> 
> +1 to have pg_stat_progress_checkpoint view to know what's going on
> with the current checkpoint.
> 

Do you plan to add it to this patch, or should it be a separate patch?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

08 December 2021, 02:13:19

On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> >> I agree it might be useful to provide information about the nature of
> >> the checkpoint, and perhaps even PID of the backend that triggered it
> >> (although that may be tricky, if the backend terminates).
> >
> > Thanks. I agree to have pg_stat_progress_checkpoint and yes PID of the
> > triggered backend can possibly go there (we can mention in the
> > documentation that the backend that triggered the checkpoint can get
> > terminated).
> >
>
> My concern is someone might run something that requires a checkpoint, so
> we start it and put the PID into the catalog. And then the person aborts
> the command and starts doing something else. But that does not abort the
> checkpoint, but the backend now runs something that doesn't requite
> checkpoint, which is rather confusing.
>
> >> I'm not sure about adding it to control data, though. That doesn't seem
> >> like a very good match for something that's mostly for monitoring.
> >
> > Having it in the control data file (along with the existing checkpoint
> > information) will be helpful to know what was the last checkpoint
> > information and we can use the existing pg_control_checkpoint function
> > or the tool to emit that info. I plan to add an int16 flag as
> > suggested by Justin in this thread and come up with a patch.
> >
>
> OK, although I'm not sure it's all that useful (if we have that in some
> sort of system view).

If the server is down, the control file will help. Since we already
have the other checkpoint info in the control file, it's much more
useful and sensible to have this extra piece of missing information
(checkpoint kind) there.

> >> We already have some checkpoint info in pg_stat_bgwriter, but that's
> >> just aggregated data, not very convenient for info about the current
> >> checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
> >> various info about the current checkpoint?
> >
> > +1 to have pg_stat_progress_checkpoint view to know what's going on
> > with the current checkpoint.
> >
>
> Do you plan to add it to this patch, or should it be a separate patch?

No, I will put some more thoughts around it and start a new thread.

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Tomas Vondra

Date:

08 December 2021, 02:22:10


On 12/8/21 02:54, Bharath Rupireddy wrote:
> On Wed, Dec 8, 2021 at 4:07 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>>> I'd bet squashing all of this into a single string (not really a flag)
>>> will just mean people will have to parse it, etc. Keeping individual
>>> boolean flags (or even separate string fields) would be better, I guess.
>>
>> Since the size of controldata is a concern, I suggest to add an int16 to
>> populate with flags, which pg_controldata can parse for display.
> 
> +1. I will come up with a patch soon.
> 
>> Note that this other patch intends to add the timestamp and LSN of most recent
>> recovery to controldata.
>> https://commitfest.postgresql.org/36/3183/
> 
> Thanks. I will go through it separately.
> 
>>> We already have some checkpoint info in pg_stat_bgwriter, but that's
>>> just aggregated data, not very convenient for info about the current
>>> checkpoint. So maybe we should add pg_stat_progress_checkpoint, showing
>>> various info about the current checkpoint?
>>
>> It could go into the pg_stat_checkpointer view, which would be the culmination
>> of another patch (cc Melanie).
>> https://commitfest.postgresql.org/36/3272/
> 

I don't think the pg_stat_checkpointer would be a good match - that's
going to be an aggregated view of all past checkpoints, not a good
source info about the currently running one.

> +1 to have pg_stat_progress_checkpoint view. We have
> CheckpointStatsData already. What we need is to make this structure
> shared and add a little more info to represent the progress, so that
> the other backends can access it. I think we can discuss this in a
> separate thread to give it a fresh try rather than proposing this as a
> part of another thread. I will spend some time on
> pg_stat_progress_checkpoint proposal and try to come up with a
> separate thread to discuss.
> 

+1 to discuss it as part of this patch

I'm not sure whether the view should look at CheckpointStatsData, or do
the same thing as the other pg_stat_progress_* views - send the data to
stat collector, and read it from there.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

09 December 2021, 06:38:46

On Wed, Dec 8, 2021 at 7:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> > >> I agree it might be useful to provide information about the nature of
> > >> the checkpoint, and perhaps even PID of the backend that triggered it
> > >> (although that may be tricky, if the backend terminates).
> >
> > >> I'm not sure about adding it to control data, though. That doesn't seem
> > >> like a very good match for something that's mostly for monitoring.
> > >
> > > Having it in the control data file (along with the existing checkpoint
> > > information) will be helpful to know what was the last checkpoint
> > > information and we can use the existing pg_control_checkpoint function
> > > or the tool to emit that info. I plan to add an int16 flag as
> > > suggested by Justin in this thread and come up with a patch.
> > >
> > OK, although I'm not sure it's all that useful (if we have that in some
> > sort of system view).
>
> If the server is down, the control file will help. Since we already
> have the other checkpoint info in the control file, it's much more
> useful and sensible to have this extra piece of missing information
> (checkpoint kind) there.

Here's the patch that adds the last checkpoint kind to the control
file, displayed as an output in the pg_controldata tool and also
exposes it via the pg_control_checkpoint function.

I will analyze and work on the idea of pg_stat_progress_checkpoint and
try to post the patch here in this thread.

Regards,
Bharath Rupireddy.

Attachment

v1-0001-add-last-checkpoint-kind-to-pg_control-file.patch

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

27 December 2021, 14:12:10

On Thu, Dec 9, 2021 at 12:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 7:43 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> > > >> I agree it might be useful to provide information about the nature of
> > > >> the checkpoint, and perhaps even PID of the backend that triggered it
> > > >> (although that may be tricky, if the backend terminates).
> > >
> > > >> I'm not sure about adding it to control data, though. That doesn't seem
> > > >> like a very good match for something that's mostly for monitoring.
> > > >
> > > > Having it in the control data file (along with the existing checkpoint
> > > > information) will be helpful to know what was the last checkpoint
> > > > information and we can use the existing pg_control_checkpoint function
> > > > or the tool to emit that info. I plan to add an int16 flag as
> > > > suggested by Justin in this thread and come up with a patch.
> > > >
> > > OK, although I'm not sure it's all that useful (if we have that in some
> > > sort of system view).
> >
> > If the server is down, the control file will help. Since we already
> > have the other checkpoint info in the control file, it's much more
> > useful and sensible to have this extra piece of missing information
> > (checkpoint kind) there.
>
> Here's the patch that adds the last checkpoint kind to the control
> file, displayed as an output in the pg_controldata tool and also
> exposes it via the pg_control_checkpoint function.

Here's v2, rebased onto the latest master.

Regards,
Bharath Rupireddy.

Attachment

v2-0001-add-last-checkpoint-kind-to-pg_control-file.patch

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Sergey Dudoladov

Date:

27 January 2022, 09:53:32

Hi all,

> Here's v2, rebased onto the latest master.

I've reviewed this patch. The patch builds against the master (commit
e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
The patch does what it intends to do, namely store the kind of the
last checkpoint in the control file and display it in the output of
the pg_control_checkpoint() function and pg_controldata utility.
I did not test it with restartpoints though. Speaking of the torn
writes, the size of the control file with this patch applied does not
exceed 8Kb.

A few code comments:

+ char ckpt_kind[2 * MAXPGPATH];

I don't completely understand why 2 * MAXPGPATH is used here for the
buffer size.
[1] defines MAXPGPATH to be standard size of a pathname buffer. How
does it relate to ckpt_kind ?

+ ControlFile->checkPointKind = 0;

It is worth a comment that 0 is unknown, as for instance in [2]

+ (flags == 0) ? "unknown" : "",

That reads as if this patch would introduce a new "unknown" checkpoint state.
Why have it here at all if after for example initdb the kind is "shutdown" ?


+ snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+ (flags == 0) ? "unknown" : "",
+ (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
+ (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
+ (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "",
+ (flags & CHECKPOINT_FORCE) ? "force " : "",
+ (flags & CHECKPOINT_WAIT) ? "wait " : "",
+ (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "",
+ (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "",
+ (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");

The space at the strings' end (as in "wait " or "immediate ")
introduces extra whitespace in the output of pg_control_checkpoint().
A similar check at [3] places whitespace differently; that arrangement
of whitespace should remove the issue.

[1]
https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/include/pg_config_manual.h#L106
[2]
https://github.com/postgres/postgres/blob/410aa248e5a883fde4832999cc9b23c7ace0f2ff/src/interfaces/libpq/fe-exec.c#L1078
[3]
https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/access/transam/xlog.c#L8851-L8859

Regards,
Sergey Dudoladov

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

27 January 2022, 10:56:57

Hi,

On Thu, Jan 27, 2022 at 10:53:32AM +0100, Sergey Dudoladov wrote:
> Hi all,
> 
> > Here's v2, rebased onto the latest master.
> 
> I've reviewed this patch. The patch builds against the master (commit
> e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
> The patch does what it intends to do, namely store the kind of the
> last checkpoint in the control file and display it in the output of
> the pg_control_checkpoint() function and pg_controldata utility.

I don't agree with that.

What it's showing is the "currently ongoing checkpoint or last completed
checkpoint" kind.  It's still not possible to know if a checkpoint is in
progress or not and any kind of information related to it, so I'm not sure how
useful this will be compared to a checkpoint progress view.

Also, it's only showing the initial triggering conditions of checkpoints.
For instance, if a timed checkpoint is started and then a backend executes a
"CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
AFAICS those new flags won't be saved to the control file.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

27 January 2022, 11:09:29

On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
> 
> What it's showing is the "currently ongoing checkpoint or last completed
> checkpoint" kind.

Ah after double checking I see it's storing the information *after* the
checkpoint completion, so it's indeed the last completed checkpoint.  I'm not
sure how useful it can be, but ok.

> Also, it's only showing the initial triggering conditions of checkpoints.
> For instance, if a timed checkpoint is started and then a backend executes a
> "CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
> AFAICS those new flags won't be saved to the control file.

This one is still valid I think, it's only storing the initial flags and not
the possibly upgraded one in shmem.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

28 January 2022, 01:38:53

At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
> > 
> > What it's showing is the "currently ongoing checkpoint or last completed
> > checkpoint" kind.
> 
> Ah after double checking I see it's storing the information *after* the
> checkpoint completion, so it's indeed the last completed checkpoint.  I'm not
> sure how useful it can be, but ok.

I don't see it useful (but don't oppose) to record checkpoint kind in
control file.  It is a kind of realtime noncritical info and in the
first place retrievable from server log if needed. And I'm skeptical
that it is needed such frequently.  Checkpoint kind is useful to check
max_wal_size's appropriateness if it is in a summarized form as in
pg_stat_bgwriter. On the other hand showing the same in a stats view
or the output of pg_control_checkpoint() is fine by me.

> > Also, it's only showing the initial triggering conditions of checkpoints.
> > For instance, if a timed checkpoint is started and then a backend executes a
> > "CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
> > AFAICS those new flags won't be saved to the control file.
> 
> This one is still valid I think, it's only storing the initial flags and not
> the possibly upgraded one in shmem.

Agreed.

I don't like to add this complex-but-need-in-sync blob twice. If we
need to do that twice, I want them consolidated in any shape.

>    Datum        values[18];
>    bool        nulls[18];

You forgot to expand these arrays.

This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
and pg_upgrade need to treat the change.

Even if we add checkpoint kind to control file, it would look a bit
strange that the "checkpoint kind" shows first among all
checkpoint-related lines.  And at least the "wait" in the line is
really useless since it is not a property of a checkpoint. Instead, it
doesn't show "requested" which is one of the checkpoint properties
like "xlog" and "time".  I'm not sure we need all of the properties to
be shown but I don't have a clear criteria for each property of it
ought to be shown or not.

> pg_control last modified:             Fri 28 Jan 2022 09:49:46 AM JST
> Latest checkpoint kind:               immediate force wait 
> Latest checkpoint location:           0/172B2C8

I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...

-  proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
+  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',

I think the additional column should be text[] instead of text, but
not sure.

And you need to edit the corresponding part of the doc.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

28 January 2022, 01:41:28

Sorry, the last message lacks one citation.

At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
> > 
> > What it's showing is the "currently ongoing checkpoint or last completed
> > checkpoint" kind.
> 
> Ah after double checking I see it's storing the information *after* the
> checkpoint completion, so it's indeed the last completed checkpoint.  I'm not
> sure how useful it can be, but ok.

I don't see it useful (but don't oppose) to record checkpoint kind in
control file.  It is a kind of realtime noncritical info and in the
first place retrievable from server log if needed. And I'm skeptical
that it is needed such frequently.  Checkpoint kind is useful to check
max_wal_size's appropriateness if it is in a summarized form as in
pg_stat_bgwriter. On the other hand showing the same in a stats view
or the output of pg_control_checkpoint() is fine by me.

> > Also, it's only showing the initial triggering conditions of checkpoints.
> > For instance, if a timed checkpoint is started and then a backend executes a
> > "CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
> > AFAICS those new flags won't be saved to the control file.
> 
> This one is still valid I think, it's only storing the initial flags and not
> the possibly upgraded one in shmem.

Agreed.

+    snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+                 (flags == 0) ? "unknown" : "",
+                 (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
+                 (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
+                 (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "",
+                 (flags & CHECKPOINT_FORCE) ? "force " : "",
+                 (flags & CHECKPOINT_WAIT) ? "wait " : "",
+                 (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "",
+                 (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "",
+                 (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");

I don't like to add this complex-but-need-in-sync blob twice. If we
need to do that twice, I want them consolidated in any shape.

+    Datum        values[18];
+    bool        nulls[18];

You forgot to expand these arrays.

This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
and pg_upgrade need to treat the change.

Even if we add checkpoint kind to control file, it would look a bit
strange that the "checkpoint kind" shows first among all
checkpoint-related lines.  And at least the "wait" in the line is
really useless since it is not a property of a checkpoint. Instead, it
doesn't show "requested" which is one of the checkpoint properties
like "xlog" and "time".  I'm not sure we need all of the properties to
be shown but I don't have a clear criteria for each property of it
ought to be shown or not.

> pg_control last modified:             Fri 28 Jan 2022 09:49:46 AM JST
> Latest checkpoint kind:               immediate force wait 
> Latest checkpoint location:           0/172B2C8

I'd like to see the PID of the triggering process, but it is really
not a information suitable in the control file...

-  proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
+  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',

I think the additional column should be text[] instead of text, but
not sure.

And you need to edit the corresponding part of the doc.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

28 January 2022, 02:00:46

Hi,

On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
> 
> I'd like to see the PID of the triggering process, but it is really
> not a information suitable in the control file...

Yes that's something I would like too.  But even if the PIDs could be store, I
don't think that having the information for an already completed checkpoint
would be of any use at all.

For the current checkpoint, it should also be an array of PID.  For instance if
the checkpointer started a throttled checkpoint, then someone calls a non
immediate pg_start_backup() and finally thinks it's too slow and need a fast
checkpoint.  This would be welcome in a new pg_stat_progress_checkpoint view.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

28 January 2022, 02:10:36

At Fri, 28 Jan 2022 10:41:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Sorry, the last message lacks one citation.
> 
> At Thu, 27 Jan 2022 19:09:29 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> > On Thu, Jan 27, 2022 at 06:56:57PM +0800, Julien Rouhaud wrote:
> > > 
> > > What it's showing is the "currently ongoing checkpoint or last completed
> > > checkpoint" kind.
> > 
> > Ah after double checking I see it's storing the information *after* the
> > checkpoint completion, so it's indeed the last completed checkpoint.  I'm not
> > sure how useful it can be, but ok.
> 
> I don't see it useful (but don't oppose) to record checkpoint kind in
> control file.  It is a kind of realtime noncritical info and in the
> first place retrievable from server log if needed. And I'm skeptical
> that it is needed such frequently.  Checkpoint kind is useful to check
> max_wal_size's appropriateness if it is in a summarized form as in
> pg_stat_bgwriter. On the other hand showing the same in a stats view
> or the output of pg_control_checkpoint() is fine by me.
> 
> > > Also, it's only showing the initial triggering conditions of checkpoints.
> > > For instance, if a timed checkpoint is started and then a backend executes a
> > > "CHECKPOINT;", it will upgrade the ongoing checkpoint with additional flags but
> > > AFAICS those new flags won't be saved to the control file.
> > 
> > This one is still valid I think, it's only storing the initial flags and not
> > the possibly upgraded one in shmem.
> 
> Agreed.
> 
> +    snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
> +                 (flags == 0) ? "unknown" : "",
> +                 (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",
> +                 (flags & CHECKPOINT_END_OF_RECOVERY) ? "end-of-recovery " : "",
> +                 (flags & CHECKPOINT_IMMEDIATE) ? "immediate " : "",
> +                 (flags & CHECKPOINT_FORCE) ? "force " : "",
> +                 (flags & CHECKPOINT_WAIT) ? "wait " : "",
> +                 (flags & CHECKPOINT_CAUSE_XLOG) ? "wal " : "",
> +                 (flags & CHECKPOINT_CAUSE_TIME) ? "time " : "",
> +                 (flags & CHECKPOINT_FLUSH_ALL) ? "flush-all" : "");
> 
> 
> I don't like to add this complex-but-need-in-sync blob twice. If we
> need to do that twice, I want them consolidated in any shape.
> 
> +    Datum        values[18];
> +    bool        nulls[18];
> 
> You forgot to expand these arrays.
> 
> This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
> and pg_upgrade need to treat the change.
> 
> Even if we add checkpoint kind to control file, it would look a bit
> strange that the "checkpoint kind" shows first among all
> checkpoint-related lines.  And at least the "wait" in the line is
> really useless since it is not a property of a checkpoint. Instead, it
> doesn't show "requested" which is one of the checkpoint properties
> like "xlog" and "time".  I'm not sure we need all of the properties to
> be shown but I don't have a clear criteria for each property of it
> ought to be shown or not.
> 
> > pg_control last modified:             Fri 28 Jan 2022 09:49:46 AM JST
> > Latest checkpoint kind:               immediate force wait 
> > Latest checkpoint location:           0/172B2C8
> 
> I'd like to see the PID of the triggering process, but it is really
> not a information suitable in the control file...
> 
> 
> -  proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
> +  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',
> 
> I think the additional column should be text[] instead of text, but
> not sure.
> 
> And you need to edit the corresponding part of the doc.

I have an additional comment.

+    char         ckpt_kind[2 * MAXPGPATH];
..
+    snprintf(ckpt_kind, 2 * MAXPGPATH, "%s%s%s%s%s%s%s%s%s",
+                 (flags == 0) ? "unknown" : "",
+                 (flags & CHECKPOINT_IS_SHUTDOWN) ? "shutdown " : "",


The value "2 * MAXPGPATH" is utterly nonsense about bouth "2" and
"MAXPGPATH", and the product of them is apparently too large.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

28 January 2022, 02:35:06

At Fri, 28 Jan 2022 10:00:46 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> Hi,
> 
> On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
> > 
> > I'd like to see the PID of the triggering process, but it is really
> > not a information suitable in the control file...
> 
> Yes that's something I would like too.  But even if the PIDs could be store, I
> don't think that having the information for an already completed checkpoint
> would be of any use at all.
> 
> For the current checkpoint, it should also be an array of PID.  For instance if
> the checkpointer started a throttled checkpoint, then someone calls a non
> immediate pg_start_backup() and finally thinks it's too slow and need a fast
> checkpoint.  This would be welcome in a new pg_stat_progress_checkpoint view.

Yeah, I thought of the same issue. But at the time of the previous
mail I thought it would be enough that the PID is of the first
initiator of the current running checkpoint likewise the checkpoint
kind.  So it is important to think about the use case.  If we put
significancy on actually happend checkpoints, we don't need absorbed
checkpoint requests to be recorded but if we put significance on
checkpoint requests, we need to care of every checkpoint request.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

28 January 2022, 02:47:39

On Fri, Jan 28, 2022 at 7:30 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Fri, Jan 28, 2022 at 10:38:53AM +0900, Kyotaro Horiguchi wrote:
> >
> > I'd like to see the PID of the triggering process, but it is really
> > not a information suitable in the control file...
>
> Yes that's something I would like too.  But even if the PIDs could be store, I
> don't think that having the information for an already completed checkpoint
> would be of any use at all.
>
> For the current checkpoint, it should also be an array of PID.  For instance if
> the checkpointer started a throttled checkpoint, then someone calls a non
> immediate pg_start_backup() and finally thinks it's too slow and need a fast
> checkpoint.  This would be welcome in a new pg_stat_progress_checkpoint view.

Thanks all for the comments. pg_stat_progress_checkpoint is being
discussed in another thread [1].

I will respond to the other comments soon.

[1] https://www.postgresql.org/message-id/CALj2ACV-F%2BK%2Bz%2BXW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ%40mail.gmail.com

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

28 January 2022, 03:24:13

Hi,

On Fri, Jan 28, 2022 at 08:17:39AM +0530, Bharath Rupireddy wrote:
> 
> Thanks all for the comments. pg_stat_progress_checkpoint is being
> discussed in another thread [1].
> 
> [1] https://www.postgresql.org/message-id/CALj2ACV-F%2BK%2Bz%2BXW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ%40mail.gmail.com

This thread was explicitly talking about log reporting and it's not clear to me
that it's now supposed to be about a new pg_stat_progress_checkpoint()
function.  I think you should start a new thread or at least change the subject
to be clearer.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

28 January 2022, 03:28:17

On Fri, Jan 28, 2022 at 8:54 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Fri, Jan 28, 2022 at 08:17:39AM +0530, Bharath Rupireddy wrote:
> >
> > Thanks all for the comments. pg_stat_progress_checkpoint is being
> > discussed in another thread [1].
> >
> > [1]
https://www.postgresql.org/message-id/CALj2ACV-F%2BK%2Bz%2BXW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ%40mail.gmail.com
>
> This thread was explicitly talking about log reporting and it's not clear to me
> that it's now supposed to be about a new pg_stat_progress_checkpoint()
> function.  I think you should start a new thread or at least change the subject
> to be clearer.

+1 to change the subject of that thread to 'report checkpoint progress
with pg_stat_progress_checkpoint'.

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

28 January 2022, 08:19:19

On Thu, Jan 27, 2022 at 3:23 PM Sergey Dudoladov
<sergey.dudoladov@gmail.com> wrote:
> > Here's v2, rebased onto the latest master.
>
> I've reviewed this patch. The patch builds against the master (commit
> e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
> The patch does what it intends to do, namely store the kind of the
> last checkpoint in the control file and display it in the output of
> the pg_control_checkpoint() function and pg_controldata utility.
> I did not test it with restartpoints though. Speaking of the torn
> writes, the size of the control file with this patch applied does not
> exceed 8Kb.

Thanks for the review.

> A few code comments:
>
> + char ckpt_kind[2 * MAXPGPATH];
>
> I don't completely understand why 2 * MAXPGPATH is used here for the
> buffer size.
> [1] defines MAXPGPATH to be standard size of a pathname buffer. How
> does it relate to ckpt_kind ?

I was using it loosely. Changed in the v3 patch.

> + ControlFile->checkPointKind = 0;
>
> It is worth a comment that 0 is unknown, as for instance in [2]

We don't even need ControlFile->checkPointKind = 0; because
InitControlFile will memset(ControlFile, 0, sizeof(ControlFileData));,
hence removed this.

> + (flags == 0) ? "unknown" : "",
>
> That reads as if this patch would introduce a new "unknown" checkpoint state.
> Why have it here at all if after for example initdb the kind is "shutdown" ?

Yeah, even LogCheckpointStart doesn't have anything "unknown" so removed it.

> The space at the strings' end (as in "wait " or "immediate ")
> introduces extra whitespace in the output of pg_control_checkpoint().
> A similar check at [3] places whitespace differently; that arrangement
> of whitespace should remove the issue.

Changed.

> >       Datum           values[18];
> >       bool            nulls[18];
>
> You forgot to expand these arrays.

Not sure what you meant here. The size of the array is already 19 in v2.

> This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
> and pg_upgrade need to treat the change.

I added a note in the commit message to bump cat version so that the
committer will take care of it.

> -  proallargtypes => '{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
> +  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',
>
> I think the additional column should be text[] instead of text, but
> not sure.

We are preparing a single string of all the checkpoint kinds and
outputting as a text column, so we don't need text[].

Regards,
Bharath Rupireddy.

Attachment

v3-0001-add-last-checkpoint-kind-to-pg_control-file.patch

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

28 January 2022, 08:50:11

Hi,

On Fri, Jan 28, 2022 at 01:49:19PM +0530, Bharath Rupireddy wrote:
> On Thu, Jan 27, 2022 at 3:23 PM Sergey Dudoladov
> <sergey.dudoladov@gmail.com> wrote:
> > > Here's v2, rebased onto the latest master.
> >
> > I've reviewed this patch. The patch builds against the master (commit
> > e9d4001ec592bcc9a3332547cb1b0211e8794f38) and passes all the tests.
> > The patch does what it intends to do, namely store the kind of the
> > last checkpoint in the control file and display it in the output of
> > the pg_control_checkpoint() function and pg_controldata utility.
> > I did not test it with restartpoints though. Speaking of the torn
> > writes, the size of the control file with this patch applied does not
> > exceed 8Kb.
> 
> Thanks for the review.
> 
> > A few code comments:
> >
> > + char ckpt_kind[2 * MAXPGPATH];
> >
> > I don't completely understand why 2 * MAXPGPATH is used here for the
> > buffer size.
> > [1] defines MAXPGPATH to be standard size of a pathname buffer. How
> > does it relate to ckpt_kind ?
> 
> I was using it loosely. Changed in the v3 patch.
> 
> > + ControlFile->checkPointKind = 0;
> >
> > It is worth a comment that 0 is unknown, as for instance in [2]
> 
> We don't even need ControlFile->checkPointKind = 0; because
> InitControlFile will memset(ControlFile, 0, sizeof(ControlFileData));,
> hence removed this.
> 
> > + (flags == 0) ? "unknown" : "",
> >
> > That reads as if this patch would introduce a new "unknown" checkpoint state.
> > Why have it here at all if after for example initdb the kind is "shutdown" ?
> 
> Yeah, even LogCheckpointStart doesn't have anything "unknown" so removed it.
> 
> > The space at the strings' end (as in "wait " or "immediate ")
> > introduces extra whitespace in the output of pg_control_checkpoint().
> > A similar check at [3] places whitespace differently; that arrangement
> > of whitespace should remove the issue.
> 
> Changed.
> 
> > >       Datum           values[18];
> > >       bool            nulls[18];
> >
> > You forgot to expand these arrays.
> 
> Not sure what you meant here. The size of the array is already 19 in v2.
> 
> > This breaks checkpoint file format. Need to bump PG_CONTROL_VERSION,
> > and pg_upgrade need to treat the change.
> 
> I added a note in the commit message to bump cat version so that the
> committer will take care of it.

PG_CONTROL_VERSION is different from catversion.  You should update it in this
patch.

But Horiguchi-san was also mentioning that pg_upgrade/controldata.c needs some
modifications if you change the format (thus the requirement to bump
PG_CONTROL_VERSION).

Why are you defining CHECKPOINT_KIND_TEXT_LENGTH twice?  You
should just define it in some sensible header used by both files, or better
have a new function to take care of that rather than having the code
duplicated.

Also, you still didn't fix the possible flag upgrade issue.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

28 January 2022, 14:51:52

On Fri, Jan 28, 2022 at 2:20 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> PG_CONTROL_VERSION is different from catversion.  You should update it in this
> patch.

My bad. Updated it.

> But Horiguchi-san was also mentioning that pg_upgrade/controldata.c needs some
> modifications if you change the format (thus the requirement to bump
> PG_CONTROL_VERSION).

> Also, you still didn't fix the possible flag upgrade issue.

I don't think we need to change pg_upgrade's ControlData controldata;
structure as the information may not be needed there and the while
loop there specifically parses/searches for the required
pg_controldata output texts. Am I missing something here?

> Why are you defining CHECKPOINT_KIND_TEXT_LENGTH twice?  You
> should just define it in some sensible header used by both files, or better
> have a new function to take care of that rather than having the code
> duplicated.

Yeah, added the macro in pg_control.h. I also wanted to have a common
function to get checkpoint kind text and place it in
controldata_utils.c, but it doesn't have xlog.h included, so no
checkpoint flags there, hence I refrained from the common function
idea.

I think we don't need to print the checkpoint kind in pg_resetwal.c's
PrintControlValues because the pg_resetwal changes the checkpoint and
PrintControlValues just prints the fields that will not be
reset/changed by pg_resetwal. Am I missing something here?

Attaching v4.

Not related to this patch: by looking at the way the fields (like
"Latest checkpoint's TimeLineID:", "Latest checkpoint's NextOID:"
etc.) of pg_controldata output are being used in pg_resetwal.c,
pg_controldata.c, and pg_upgrade/controldata.c, I'm thinking of having
those fields as macros in pg_control.h
#define PG_CONTROL_LATEST_CHECKPOINT_TLI "Latest checkpoint's TimeLineID:"
#define PG_CONTROL_LATEST_CHECKPOINT_NEXTOID "Latest checkpoint's NextOID:"
and so on for all the pg_controldata fields would be a good idea for
better code manageability and not to miss any field text changes.

If okay, I will discuss this in a separate thread.

Regards,
Bharath Rupireddy.

Attachment

v4-0001-add-last-checkpoint-kind-to-pg_control-file.patch

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

28 January 2022, 16:10:19

Hi,

On Fri, Jan 28, 2022 at 08:21:52PM +0530, Bharath Rupireddy wrote:
> 
> I don't think we need to change pg_upgrade's ControlData controldata;
> structure as the information may not be needed there and the while
> loop there specifically parses/searches for the required
> pg_controldata output texts. Am I missing something here?

Right, I was remembering that there was a check that all expected fields were
found but after double checking I was clearly wrong, sorry about that.
> 
> > Also, you still didn't fix the possible flag upgrade issue.

Unless I'm missing something that's an issue that you still haven't addressed
or explained why it's not a problem?

> 
> > Why are you defining CHECKPOINT_KIND_TEXT_LENGTH twice?  You
> > should just define it in some sensible header used by both files, or better
> > have a new function to take care of that rather than having the code
> > duplicated.
> 
> Yeah, added the macro in pg_control.h. I also wanted to have a common
> function to get checkpoint kind text and place it in
> controldata_utils.c, but it doesn't have xlog.h included, so no
> checkpoint flags there, hence I refrained from the common function
> idea.

That's a bit annoying, I'm not sure what's best to do here.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Andres Freund

Date:

28 January 2022, 23:37:21

Hi,

On 2021-12-08 03:04:23 +0100, Tomas Vondra wrote:
> On 12/8/21 02:54, Bharath Rupireddy wrote:
> >> I'm not sure about adding it to control data, though. That doesn't seem
> >> like a very good match for something that's mostly for monitoring.
> > 
> > Having it in the control data file (along with the existing checkpoint
> > information) will be helpful to know what was the last checkpoint
> > information and we can use the existing pg_control_checkpoint function
> > or the tool to emit that info. I plan to add an int16 flag as
> > suggested by Justin in this thread and come up with a patch.
> > 
> 
> OK, although I'm not sure it's all that useful (if we have that in some
> sort of system view).

I don't think we should add stuff like this to the control file. We want to
keep ControlFileData within a disk sector size / 512 bytes (so that we don't
need to care about torn writes etc). Adding information that we don't really
need will byte us at some point, because at that point there will be reliance
on the added data.

Nor have I read a convincing reason for needing the data to be readily
accessible when the server is shut down?

Greetings,

Andres Freund

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Andres Freund

Date:

28 January 2022, 23:39:28

Hi,

On 2021-12-07 20:06:22 +0530, Bharath Rupireddy wrote:
> One concern is that we don't want to increase the size of pg_controldata by
> more than the typical block size (of 8K) to avoid any torn-writes.

The limit is 512 bytes (a disk sector), not 8K. There are plenty devices with
4K sectors as well, but I'm not aware of any with 8K.

Greetings,

Andres Freund

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

31 January 2022, 01:50:45

At Fri, 28 Jan 2022 13:49:19 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> > -  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz}',
> > +  proallargtypes =>
'{pg_lsn,pg_lsn,text,int4,int4,bool,text,oid,xid,xid,xid,oid,xid,xid,oid,xid,xid,timestamptz,text}',
> >
> > I think the additional column should be text[] instead of text, but
> > not sure.
> 
> We are preparing a single string of all the checkpoint kinds and
> outputting as a text column, so we don't need text[].

What I meant to suggest is to change it to an array of text instead of
a text that consists of multiple labels concatenated by spaces.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Kyotaro Horiguchi

Date:

31 January 2022, 02:10:45

At Sat, 29 Jan 2022 00:10:19 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in 
> On Fri, Jan 28, 2022 at 08:21:52PM +0530, Bharath Rupireddy wrote:
> > > Also, you still didn't fix the possible flag upgrade issue.
> 
> Unless I'm missing something that's an issue that you still haven't addressed
> or explained why it's not a problem?

As Bharath said, pg_upgreade reads a part of old cluster's controldata
that is needed to check compatibility so I agree that we don't have
flag upgrade issue I mentioned.

> > > Why are you defining CHECKPOINT_KIND_TEXT_LENGTH twice?  You
> > > should just define it in some sensible header used by both files, or better
> > > have a new function to take care of that rather than having the code
> > > duplicated.
> > 
> > Yeah, added the macro in pg_control.h. I also wanted to have a common
> > function to get checkpoint kind text and place it in
> > controldata_utils.c, but it doesn't have xlog.h included, so no
> > checkpoint flags there, hence I refrained from the common function
> > idea.
> 
> That's a bit annoying, I'm not sure what's best to do here.

I misunderstood that the size is restricted to 8k but it is really 512
bytes as Andres pointed.  So we don't add it at least as a text.  This
means pg_controldata need to translate the flags into human-readable
text but, to be clear, I still don't think its usefull in the control
data.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

31 January 2022, 05:28:31

On Mon, Jan 31, 2022 at 9:10 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Mon, Jan 31, 2022 at 11:10:45AM +0900, Kyotaro Horiguchi wrote:
> >
> > This means pg_controldata need to translate the flags into human-readable
> > text but, to be clear, I still don't think its usefull in the control
> > data.
>
> I've been saying that since my first email, I also don't see any scenario where
> having this specific information can be of any help.
>
> Given Andres feedback, unless someone can provide some realistic use case I
> think we should mark this patch as rejected and focus on a
> pg_progress_checkpoint feature.  I will do so in a couple of days when I will
> close this commitfest.

The size of ControlFileData is 296 bytes currently and the sector
limit is of 512 bytes (PG_CONTROL_MAX_SAFE_SIZE), if we feel that this
extra 2 bytes of checkpoint flags isn't worth storing in the control
file, I'm pretty much okay with it.

Regards,
Bharath Rupireddy.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Julien Rouhaud

Date:

31 January 2022, 05:54:16

Hi,

On Mon, Jan 31, 2022 at 10:58:31AM +0530, Bharath Rupireddy wrote:
> 
> The size of ControlFileData is 296 bytes currently and the sector
> limit is of 512 bytes (PG_CONTROL_MAX_SAFE_SIZE), if we feel that this
> extra 2 bytes of checkpoint flags isn't worth storing in the control
> file, I'm pretty much okay with it.

For now we have some room, but we will likely keep consuming bytes in that file
for more critical features and it's almost certain that at one point we will
regret wasting 2 bytes for that.

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Michael Paquier

Date:

31 January 2022, 07:17:12

On Mon, Jan 31, 2022 at 01:54:16PM +0800, Julien Rouhaud wrote:
> For now we have some room, but we will likely keep consuming bytes in that file
> for more critical features and it's almost certain that at one point we will
> regret wasting 2 bytes for that.

Agreed to drop the patch.  That's only two bytes, but we may regret
that in the future and this is not critical for the system.
--
Michael

Attachment

signature.asc

Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

From

Bharath Rupireddy

Date:

31 January 2022, 07:41:57

On Mon, Jan 31, 2022 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jan 31, 2022 at 01:54:16PM +0800, Julien Rouhaud wrote:
> > For now we have some room, but we will likely keep consuming bytes in that file
> > for more critical features and it's almost certain that at one point we will
> > regret wasting 2 bytes for that.
>
> Agreed to drop the patch.  That's only two bytes, but we may regret
> that in the future and this is not critical for the system.

I agree. Thank you all for your review.