Thread: Remove useless arguments in ReadCheckpointRecord().

Remove useless arguments in ReadCheckpointRecord().

From
Fujii Masao
Date:
Hi,

I'd like to propose to remove "whichChkpt" and "report" arguments in ReadCheckpointRecord(). "report" is obviously
uselessbecause it's always true, i.e., there are two callers of the function and they always specify true as "report".
 

"whichChkpt" indicates where the specified checkpoint location came from, pg_control or backup_label. This information
isused to log different messages as follows.
 

        switch (whichChkpt)
        {
            case 1:
                ereport(LOG,
                        (errmsg("invalid primary checkpoint link in control file")));
                break;
            default:
                ereport(LOG,
                        (errmsg("invalid checkpoint link in backup_label file")));
                break;
        }
        return NULL;
        ...
        switch (whichChkpt)
        {
            case 1:
                ereport(LOG,
                        (errmsg("invalid primary checkpoint record")));
                break;
            default:
                ereport(LOG,
                        (errmsg("invalid checkpoint record")));
                break;
        }
        return NULL;
        ...

But the callers of ReadCheckpointRecord() already output different log messages depending on where the invalid
checkpointrecord came from. So even if ReadCheckpointRecord() doesn't use "whichChkpt", i.e., use the same log message
inboth pg_control and backup_label cases, users can still identify where the invalid checkpoint record came from, by
readingthe log message.
 

Also when whichChkpt = 0, "primary checkpoint" is used in the log message and sounds confusing because, as far as I
recallcorrectly, we removed the concept of primary and secondary checkpoints before.
 

  Therefore I think that it's better to remove "whichChkpt" argument in ReadCheckpointRecord().

Patch attached. Thought?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment

Re: Remove useless arguments in ReadCheckpointRecord().

From
Bharath Rupireddy
Date:
On Wed, Jul 20, 2022 at 8:21 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> Hi,
>
> I'd like to propose to remove "whichChkpt" and "report" arguments in ReadCheckpointRecord(). "report" is obviously
uselessbecause it's always true, i.e., there are two callers of the function and they always specify true as "report". 

Yes, the report parameter is obvious to delete. The commit 1d919de5eb
removed the only call with the report parameter as false.

> "whichChkpt" indicates where the specified checkpoint location came from, pg_control or backup_label. This
informationis used to log different messages as follows. 
>
>                 switch (whichChkpt)
>                 {
>                         case 1:
>                                 ereport(LOG,
>                                                 (errmsg("invalid primary checkpoint link in control file")));
>                                 break;
>                         default:
>                                 ereport(LOG,
>                                                 (errmsg("invalid checkpoint link in backup_label file")));
>                                 break;
>                 }
>                 return NULL;
>                 ...
>                 switch (whichChkpt)
>                 {
>                         case 1:
>                                 ereport(LOG,
>                                                 (errmsg("invalid primary checkpoint record")));
>                                 break;
>                         default:
>                                 ereport(LOG,
>                                                 (errmsg("invalid checkpoint record")));
>                                 break;
>                 }
>                 return NULL;
>                 ...
>
> But the callers of ReadCheckpointRecord() already output different log messages depending on where the invalid
checkpointrecord came from. So even if ReadCheckpointRecord() doesn't use "whichChkpt", i.e., use the same log message
inboth pg_control and backup_label cases, users can still identify where the invalid checkpoint record came from, by
readingthe log message. 
>
> Also when whichChkpt = 0, "primary checkpoint" is used in the log message and sounds confusing because, as far as I
recallcorrectly, we removed the concept of primary and secondary checkpoints before. 

Yes, using "primary checkpoint" confuses, as we usually refer primary
in the context of replication and HA.

>   Therefore I think that it's better to remove "whichChkpt" argument in ReadCheckpointRecord().
>
> Patch attached. Thought?

How about we transform the following messages into something like below?

(errmsg("could not locate a valid checkpoint record"))); after
ReadCheckpointRecord() for control file cases to "could not locate
valid checkpoint record in control file"
(errmsg("could not locate required checkpoint record"), after
ReadCheckpointRecord() for backup_label case to "could not locate
valid checkpoint record in backup_label file"

The above messages give more meaningful and unique info to the users.

May be unrelated, IIRC, for the errors like ereport(PANIC,
(errmsg("could not locate a valid checkpoint record"))); we wanted to
add a hint asking users to consider running pg_resetwal to fix the
issue. The error for ReadCheckpointRecord() in backup_label file
cases, already gives a hint errhint("If you are restoring from a
backup, touch \"%s/recovery.signal\" ...... Adding the hint of running
pg_resetwal (of course with a caution that it can cause inconsistency
in the data and use it as a last resort as described in the docs)
helps users and support engineers a lot to mitigate the server down
cases quickly.

Regards,
Bharath Rupireddy.



Re: Remove useless arguments in ReadCheckpointRecord().

From
Fujii Masao
Date:

On 2022/07/21 0:29, Bharath Rupireddy wrote:
> How about we transform the following messages into something like below?
> 
> (errmsg("could not locate a valid checkpoint record"))); after
> ReadCheckpointRecord() for control file cases to "could not locate
> valid checkpoint record in control file"
> (errmsg("could not locate required checkpoint record"), after
> ReadCheckpointRecord() for backup_label case to "could not locate
> valid checkpoint record in backup_label file"
> 
> The above messages give more meaningful and unique info to the users.

Agreed. Attached is the updated version of the patch.
Thanks for the review!


> May be unrelated, IIRC, for the errors like ereport(PANIC,
> (errmsg("could not locate a valid checkpoint record"))); we wanted to
> add a hint asking users to consider running pg_resetwal to fix the
> issue. The error for ReadCheckpointRecord() in backup_label file
> cases, already gives a hint errhint("If you are restoring from a
> backup, touch \"%s/recovery.signal\" ...... Adding the hint of running
> pg_resetwal (of course with a caution that it can cause inconsistency
> in the data and use it as a last resort as described in the docs)
> helps users and support engineers a lot to mitigate the server down
> cases quickly.

+1 to add some hint messages. But I'm not sure if it's good to hint the use of pg_resetwal because, as you're saying,
itshould be used as a last resort and has some big risks like data loss, corruption, etc. That is, I'm concerned about
thatsome users might quickly run pg_resetwal because hint message says that, without reading the docs nor understanding
thoserisks.
 

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment

Re: Remove useless arguments in ReadCheckpointRecord().

From
Kyotaro Horiguchi
Date:
I agree to removing the two parameters. And agree to let
ReadCheckpointRecord not conscious of the location source.

At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> Agreed. Attached is the updated version of the patch.
> Thanks for the review!

-    (errmsg("could not locate required checkpoint record"),
+    (errmsg("could not locate a valid checkpoint record in backup_label file"),

"in backup_label" there looks *to me* need some verb..  By the way,
this looks like a good chance to remove the (now) extra parens around
errmsg() and friends.

For example:

-    (errmsg("could not locate a valid checkpoint record in backup_label file"),
+    errmsg("could not locate checkpoint record spcified in backup_label file"),

-    (errmsg("could not locate a valid checkpoint record in control file")));
+    errmsg("could not locate checkpoint record recorded in control file")));


+                (errmsg("invalid checkpoint record")));

Is it useful to show the specified LSN there?

+                (errmsg("invalid resource manager ID in checkpoint record")));

We have a similar message "invalid resource manager ID %u at
%X/%X". Since the caller explains that it is a checkpoint record, we
can share the message here.

+                (errmsg("invalid xl_info in checkpoint record")));

(It is not an issue of this patch, though) I don't think this is
appropriate for user-facing message. Counldn't we say "unexpected
record type: %x" or something like?

+                (errmsg("invalid length of checkpoint record")));

We have "record with invalid length at %X/%X" or "invalid record
length at %X/%X: wanted %u, got %u". Counld we reuse any of them?

> > May be unrelated, IIRC, for the errors like ereport(PANIC,
> > (errmsg("could not locate a valid checkpoint record"))); we wanted to
> > add a hint asking users to consider running pg_resetwal to fix the
> > issue. The error for ReadCheckpointRecord() in backup_label file
> > cases, already gives a hint errhint("If you are restoring from a
> > backup, touch \"%s/recovery.signal\" ...... Adding the hint of running
> > pg_resetwal (of course with a caution that it can cause inconsistency
> > in the data and use it as a last resort as described in the docs)
> > helps users and support engineers a lot to mitigate the server down
> > cases quickly.
> 
> +1 to add some hint messages. But I'm not sure if it's good to hint
> the use of pg_resetwal because, as you're saying, it should be used as
> a last resort and has some big risks like data loss, corruption,
> etc. That is, I'm concerned about that some users might quickly run
> pg_resetwal because hint message says that, without reading the docs
> nor understanding those risks.

I don't think we want to recommend pg_resetwal to those who don't
reach it by themselves by other means.  I know of a few instances of
people who had the tool (unrecoverably) break their own clusters.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Remove useless arguments in ReadCheckpointRecord().

From
Fujii Masao
Date:

On 2022/07/21 14:54, Kyotaro Horiguchi wrote:
> I agree to removing the two parameters. And agree to let
> ReadCheckpointRecord not conscious of the location source.

Thanks for the review!


> At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Agreed. Attached is the updated version of the patch.
>> Thanks for the review!
> 
> -    (errmsg("could not locate required checkpoint record"),
> +    (errmsg("could not locate a valid checkpoint record in backup_label file"),
> 
> "in backup_label" there looks *to me* need some verb..  

Sorry, I failed to understand your point. Could you clarify your point?


> By the way,
> this looks like a good chance to remove the (now) extra parens around
> errmsg() and friends.
> 
> For example:
> 
> -    (errmsg("could not locate a valid checkpoint record in backup_label file"),
> +    errmsg("could not locate checkpoint record spcified in backup_label file"),
> 
> -    (errmsg("could not locate a valid checkpoint record in control file")));
> +    errmsg("could not locate checkpoint record recorded in control file")));

Because it's recommended not to put parenthesis just before errmsg(), you mean? I'm ok to remove such parenthesis, but
I'dlike understand why before doing that. I was thinking that either having or not having parenthesis in front of
errmsg()is ok, so there are many calls to errmsg() with parenthesis, in xlogrecovery.c.
 


> +                (errmsg("invalid checkpoint record")));
> 
> Is it useful to show the specified LSN there?

Yes, LSN info would be helpful also for debugging.

I separated the patch into two; one is to remove useless arguments in ReadCheckpointRecord(), another is to improve log
messages.I added LSN info in log messages in the second patch.
 


> +                (errmsg("invalid resource manager ID in checkpoint record")));
> 
> We have a similar message "invalid resource manager ID %u at
> %X/%X". Since the caller explains that it is a checkpoint record, we
> can share the message here.

+1


> +                (errmsg("invalid xl_info in checkpoint record")));
> 
> (It is not an issue of this patch, though) I don't think this is
> appropriate for user-facing message. Counldn't we say "unexpected
> record type: %x" or something like?

The proposed log message doesn't look like an improvement. But I have no better one. So I left the message as it is, in
thepatch, for now.
 

> 
> +                (errmsg("invalid length of checkpoint record")));
> 
> We have "record with invalid length at %X/%X" or "invalid record
> length at %X/%X: wanted %u, got %u". Counld we reuse any of them?

+1

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment

Re: Remove useless arguments in ReadCheckpointRecord().

From
Tom Lane
Date:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
> On 2022/07/21 14:54, Kyotaro Horiguchi wrote:
>> At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>> -    (errmsg("could not locate required checkpoint record"),
>>> +    (errmsg("could not locate a valid checkpoint record in backup_label file"),

>> "in backup_label" there looks *to me* need some verb..

> Sorry, I failed to understand your point. Could you clarify your point?

FWIW, the proposed change looks like perfectly good English to me.
"locate" is the verb.  It's been way too many years since high
school grammar for me to remember the exact term for auxiliary
clauses like "in backup_label file", but that doesn't need its
own verb.  Possibly Kyotaro-san is feeling that it should be
like "... checkpoint record in the backup_label file".  That'd
be more formal, but in the telegraphic style that we prefer for
primary error messages, omitting the "the" is fine.

            regards, tom lane



Re: Remove useless arguments in ReadCheckpointRecord().

From
Kyotaro Horiguchi
Date:
At Thu, 21 Jul 2022 23:10:04 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Fujii Masao <masao.fujii@oss.nttdata.com> writes:
> > On 2022/07/21 14:54, Kyotaro Horiguchi wrote:
> >> At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> >>> -    (errmsg("could not locate required checkpoint record"),
> >>> +    (errmsg("could not locate a valid checkpoint record in backup_label file"),
> 
> >> "in backup_label" there looks *to me* need some verb..  
> 
> > Sorry, I failed to understand your point. Could you clarify your point?
> 
> FWIW, the proposed change looks like perfectly good English to me.
> "locate" is the verb.  It's been way too many years since high
> school grammar for me to remember the exact term for auxiliary
> clauses like "in backup_label file", but that doesn't need its
> own verb.  Possibly Kyotaro-san is feeling that it should be
> like "... checkpoint record in the backup_label file".  That'd
> be more formal, but in the telegraphic style that we prefer for
> primary error messages, omitting the "the" is fine.

Maybe a little different.  I thought that a checkpoint record cannot
be located in backup_label file.  In other words what is in
backup_label file is a pointer to the record and the record is in a
WAL file.

I'm fine with the proposed sentsnse if the it makes the correct
sense. (And sorry for the noise)

By the way, I learned that the style is called "telegraphic style".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Remove useless arguments in ReadCheckpointRecord().

From
Kyotaro Horiguchi
Date:
At Fri, 22 Jul 2022 11:50:14 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> Sorry, I failed to understand your point. Could you clarify your
> point?

Wrote as a reply to Tom's message.

> > By the way,
> > this looks like a good chance to remove the (now) extra parens around
> > errmsg() and friends.
> > For example:
> > -    (errmsg("could not locate a valid checkpoint record in backup_label
> > -    file"),
> > + errmsg("could not locate checkpoint record spcified in backup_label
> > file"),
> > -    (errmsg("could not locate a valid checkpoint record in control file")));
> > + errmsg("could not locate checkpoint record recorded in control
> > file")));
> 
> Because it's recommended not to put parenthesis just before errmsg(),
> you mean? I'm ok to remove such parenthesis, but I'd like understand
> why before doing that. I was thinking that either having or not having
> parenthesis in front of errmsg() is ok, so there are many calls to
> errmsg() with parenthesis, in xlogrecovery.c.

I believed that it is recommended to move to the style not having the
outmost parens.  That style has been introduced by e3a87b4991.

>    * The extra parentheses around the auxiliary function calls are now
>    optional.  Aside from being a bit less ugly, this removes a common
>    gotcha for new contributors, because in some cases the compiler errors
>    you got from forgetting them were unintelligible.
...
>    While new code can be written either way, code intended to be
>    back-patched will need to use extra parens for awhile yet.  It seems
>    worth back-patching this change into v12, so as to reduce the window
>    where we have to be careful about that by one year.  Hence, this patch
>    is careful to preserve ABI compatibility; a followup HEAD-only patch
>    will make some additional simplifications.

So I thought that if we modify an error message, its ererpot can be
rewritten.


> > +                (errmsg("invalid checkpoint record")));
> > Is it useful to show the specified LSN there?
> 
> Yes, LSN info would be helpful also for debugging.
> 
> I separated the patch into two; one is to remove useless arguments in
> ReadCheckpointRecord(), another is to improve log messages. I added
> LSN info in log messages in the second patch.

Thanks!

> > +                (errmsg("invalid xl_info in checkpoint record")));
> > (It is not an issue of this patch, though) I don't think this is
> > appropriate for user-facing message. Counldn't we say "unexpected
> > record type: %x" or something like?
> 
> The proposed log message doesn't look like an improvement. But I have
> no better one. So I left the message as it is, in the patch, for now.

Understood.

regards
-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Remove useless arguments in ReadCheckpointRecord().

From
Fujii Masao
Date:

On 2022/07/22 17:31, Kyotaro Horiguchi wrote:
> I believed that it is recommended to move to the style not having the
> outmost parens.  That style has been introduced by e3a87b4991.

I read the commit log, but I'm not sure yet if it's really recommended to remove extra parens even from the existing
callsto errmsg(). Removing extra parens can interfere with back-patching of the changes around those errmsg(), can't
it?

Anyway, at first I pushed the 0001 patch that removes useless arguments in ReadCheckpointRecord().

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Remove useless arguments in ReadCheckpointRecord().

From
Tom Lane
Date:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
> On 2022/07/22 17:31, Kyotaro Horiguchi wrote:
>> I believed that it is recommended to move to the style not having the
>> outmost parens.  That style has been introduced by e3a87b4991.

> I read the commit log, but I'm not sure yet if it's really recommended to remove extra parens even from the existing
callsto errmsg(). Removing extra parens can interfere with back-patching of the changes around those errmsg(), can't
it?

Right, so I wouldn't be in a hurry to change existing calls.  If you're
editing an ereport call for some other reason, it's fine to remove the
excess parens in it, because you're creating a backpatch hazard there
anyway.  But otherwise, I think such changes are make-work in themselves
and risk creating more make-work for somebody else later.

            regards, tom lane



Re: Remove useless arguments in ReadCheckpointRecord().

From
Bharath Rupireddy
Date:
On Thu, Jul 21, 2022 at 11:24 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> > > May be unrelated, IIRC, for the errors like ereport(PANIC,
> > > (errmsg("could not locate a valid checkpoint record"))); we wanted to
> > > add a hint asking users to consider running pg_resetwal to fix the
> > > issue. The error for ReadCheckpointRecord() in backup_label file
> > > cases, already gives a hint errhint("If you are restoring from a
> > > backup, touch \"%s/recovery.signal\" ...... Adding the hint of running
> > > pg_resetwal (of course with a caution that it can cause inconsistency
> > > in the data and use it as a last resort as described in the docs)
> > > helps users and support engineers a lot to mitigate the server down
> > > cases quickly.
> >
> > +1 to add some hint messages. But I'm not sure if it's good to hint
> > the use of pg_resetwal because, as you're saying, it should be used as
> > a last resort and has some big risks like data loss, corruption,
> > etc. That is, I'm concerned about that some users might quickly run
> > pg_resetwal because hint message says that, without reading the docs
> > nor understanding those risks.
>
> I don't think we want to recommend pg_resetwal to those who don't
> reach it by themselves by other means.  I know of a few instances of
> people who had the tool (unrecoverably) break their own clusters.

Agree. We might want to take this topic separately as it needs more
careful study of common issues such as PANICs and then adding hints
with proven ways to repair the server and bring it back online.

Regards,
Bharath Rupireddy.



Re: Remove useless arguments in ReadCheckpointRecord().

From
Kyotaro Horiguchi
Date:
At Sun, 24 Jul 2022 22:40:16 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Fujii Masao <masao.fujii@oss.nttdata.com> writes:
> > On 2022/07/22 17:31, Kyotaro Horiguchi wrote:
> >> I believed that it is recommended to move to the style not having the
> >> outmost parens.  That style has been introduced by e3a87b4991.
> 
> > I read the commit log, but I'm not sure yet if it's really recommended to remove extra parens even from the
existingcalls to errmsg(). Removing extra parens can interfere with back-patching of the changes around those errmsg(),
can'tit?
 
> 
> Right, so I wouldn't be in a hurry to change existing calls.  If you're
> editing an ereport call for some other reason, it's fine to remove the
> excess parens in it, because you're creating a backpatch hazard there
> anyway.  But otherwise, I think such changes are make-work in themselves
> and risk creating more make-work for somebody else later.

So, I meant to propose to remove extra parens for errmsg()'s where the
message string is edited.  Is it fine in that criteria?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Remove useless arguments in ReadCheckpointRecord().

From
Fujii Masao
Date:

On 2022/07/26 9:42, Kyotaro Horiguchi wrote:
> At Sun, 24 Jul 2022 22:40:16 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
>> Fujii Masao <masao.fujii@oss.nttdata.com> writes:
>>> On 2022/07/22 17:31, Kyotaro Horiguchi wrote:
>>>> I believed that it is recommended to move to the style not having the
>>>> outmost parens.  That style has been introduced by e3a87b4991.
>>
>>> I read the commit log, but I'm not sure yet if it's really recommended to remove extra parens even from the
existingcalls to errmsg(). Removing extra parens can interfere with back-patching of the changes around those errmsg(),
can'tit?
 
>>
>> Right, so I wouldn't be in a hurry to change existing calls.  If you're
>> editing an ereport call for some other reason, it's fine to remove the
>> excess parens in it, because you're creating a backpatch hazard there
>> anyway.  But otherwise, I think such changes are make-work in themselves
>> and risk creating more make-work for somebody else later.
> 
> So, I meant to propose to remove extra parens for errmsg()'s where the
> message string is edited.  Is it fine in that criteria?

Even in that case, removing parens may interfere with the back-patch. For example, please imagine the case where
wasShutdownis changed to be set to true in the following code and this changed is back-patched to v15. If we modify
onlythe log message in the following errmsg() and leave the parens around that, git cherry-pick of the change of
wasShutdownto v15 would be completed successfully. But if we remove the parens, git cherry-pick would fail.
 

    ereport(FATAL,
            (errmsg("could not locate required checkpoint record"),
             errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery
options.\n"
                     "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
                     "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a
backup.",
                     DataDir, DataDir, DataDir)));
    wasShutdown = false;    /* keep compiler quiet */

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION