Thread: Add information to rm_redo_error_callback()

Add information to rm_redo_error_callback()

From
"Drouvot, Bertrand"
Date:
Hi hackers,

I've attached a small patch to add information to rm_redo_error_callback().

The changes attached in this patch came while working on the "Add 
information during standby recovery conflicts" patch (See [1]).

The goal is to add more information during the callback (if doable), so 
that something like:

2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for 
Heap2/CLEAN: remxid 1168

would get extra information that way:

2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for 
Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0

As this could be useful outside of [1], a dedicated "sub" patch has been 
created (thanks Sawada for the suggestion).

I will add this patch to the next commitfest. I look forward to your 
feedback about the idea and/or implementation.

Regards,

Bertrand

[1]: https://commitfest.postgresql.org/29/2604


Attachment

Re: Add information to rm_redo_error_callback()

From
Masahiko Sawada
Date:
On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>
> Hi hackers,
>
> I've attached a small patch to add information to rm_redo_error_callback().
>
> The changes attached in this patch came while working on the "Add
> information during standby recovery conflicts" patch (See [1]).
>
> The goal is to add more information during the callback (if doable), so
> that something like:
>
> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
> Heap2/CLEAN: remxid 1168
>
> would get extra information that way:
>
> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
> Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0
>
> As this could be useful outside of [1], a dedicated "sub" patch has been
> created (thanks Sawada for the suggestion).
>
> I will add this patch to the next commitfest. I look forward to your
> feedback about the idea and/or implementation.
>

Thank you for starting the new thread for this patch!

I think this patch is simple enough and improves information shown in
errcontext.

I have two comments on the patch:

diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index 756b838e6a..8b2024e9e9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg)
 {
    XLogReaderState *record = (XLogReaderState *) arg;
    StringInfoData buf;
+   int block_id;
+   RelFileNode rnode;
+   ForkNumber forknum;
+   BlockNumber blknum;

    initStringInfo(&buf);
    xlog_outdesc(&buf, record);

+   for (block_id = 0; block_id <= record->max_block_id; block_id++)
+   {
+       if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
+           appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
+                       block_id, rnode.spcNode, rnode.dbNode,
+                       rnode.relNode, forkNames[forknum],
+                       blknum);
+   }
    /* translator: %s is a WAL record description */
    errcontext("WAL redo at %X/%X for %s",
               (uint32) (record->ReadRecPtr >> 32),

rnode, forknum and blknum can be declared within the for loop.

I think it's better to put a new line just before the comment starting
from "translator:".

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add information to rm_redo_error_callback()

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/10/20 7:10 AM, Masahiko Sawada wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>
>
>
> On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>> Hi hackers,
>>
>> I've attached a small patch to add information to rm_redo_error_callback().
>>
>> The changes attached in this patch came while working on the "Add
>> information during standby recovery conflicts" patch (See [1]).
>>
>> The goal is to add more information during the callback (if doable), so
>> that something like:
>>
>> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
>> Heap2/CLEAN: remxid 1168
>>
>> would get extra information that way:
>>
>> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
>> Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0
>>
>> As this could be useful outside of [1], a dedicated "sub" patch has been
>> created (thanks Sawada for the suggestion).
>>
>> I will add this patch to the next commitfest. I look forward to your
>> feedback about the idea and/or implementation.
>>
> Thank you for starting the new thread for this patch!
>
> I think this patch is simple enough and improves information shown in
> errcontext.
>
> I have two comments on the patch:
>
> diff --git a/src/backend/access/transam/xlog.c
> b/src/backend/access/transam/xlog.c
> index 756b838e6a..8b2024e9e9 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg)
>   {
>      XLogReaderState *record = (XLogReaderState *) arg;
>      StringInfoData buf;
> +   int block_id;
> +   RelFileNode rnode;
> +   ForkNumber forknum;
> +   BlockNumber blknum;
>
>      initStringInfo(&buf);
>      xlog_outdesc(&buf, record);
>
> +   for (block_id = 0; block_id <= record->max_block_id; block_id++)
> +   {
> +       if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
> +           appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
> +                       block_id, rnode.spcNode, rnode.dbNode,
> +                       rnode.relNode, forkNames[forknum],
> +                       blknum);
> +   }
>      /* translator: %s is a WAL record description */
>      errcontext("WAL redo at %X/%X for %s",
>                 (uint32) (record->ReadRecPtr >> 32),
>
> rnode, forknum and blknum can be declared within the for loop.
>
> I think it's better to put a new line just before the comment starting
> from "translator:".

Thanks for looking at it!

I've attached a new version as per your comments.

Thanks,

Bertrand


Attachment

Re: Add information to rm_redo_error_callback()

From
Masahiko Sawada
Date:
On Tue, 11 Aug 2020 at 00:07, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>
> Hi,
>
> On 8/10/20 7:10 AM, Masahiko Sawada wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
> >
> >
> >
> > On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> >> Hi hackers,
> >>
> >> I've attached a small patch to add information to rm_redo_error_callback().
> >>
> >> The changes attached in this patch came while working on the "Add
> >> information during standby recovery conflicts" patch (See [1]).
> >>
> >> The goal is to add more information during the callback (if doable), so
> >> that something like:
> >>
> >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
> >> Heap2/CLEAN: remxid 1168
> >>
> >> would get extra information that way:
> >>
> >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT:  WAL redo at 0/4A3B0DE0 for
> >> Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0
> >>
> >> As this could be useful outside of [1], a dedicated "sub" patch has been
> >> created (thanks Sawada for the suggestion).
> >>
> >> I will add this patch to the next commitfest. I look forward to your
> >> feedback about the idea and/or implementation.
> >>
> > Thank you for starting the new thread for this patch!
> >
> > I think this patch is simple enough and improves information shown in
> > errcontext.
> >
> > I have two comments on the patch:
> >
> > diff --git a/src/backend/access/transam/xlog.c
> > b/src/backend/access/transam/xlog.c
> > index 756b838e6a..8b2024e9e9 100644
> > --- a/src/backend/access/transam/xlog.c
> > +++ b/src/backend/access/transam/xlog.c
> > @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg)
> >   {
> >      XLogReaderState *record = (XLogReaderState *) arg;
> >      StringInfoData buf;
> > +   int block_id;
> > +   RelFileNode rnode;
> > +   ForkNumber forknum;
> > +   BlockNumber blknum;
> >
> >      initStringInfo(&buf);
> >      xlog_outdesc(&buf, record);
> >
> > +   for (block_id = 0; block_id <= record->max_block_id; block_id++)
> > +   {
> > +       if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum))
> > +           appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u",
> > +                       block_id, rnode.spcNode, rnode.dbNode,
> > +                       rnode.relNode, forkNames[forknum],
> > +                       blknum);
> > +   }
> >      /* translator: %s is a WAL record description */
> >      errcontext("WAL redo at %X/%X for %s",
> >                 (uint32) (record->ReadRecPtr >> 32),
> >
> > rnode, forknum and blknum can be declared within the for loop.
> >
> > I think it's better to put a new line just before the comment starting
> > from "translator:".
>
> Thanks for looking at it!
>
> I've attached a new version as per your comments.

Thank you for updating the patch!

The patch looks good to me. I've set this patch as Ready for Committer.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add information to rm_redo_error_callback()

From
Michael Paquier
Date:
On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch!
>
> The patch looks good to me. I've set this patch as Ready for Committer.

+       for (block_id = 0; block_id <= record->max_block_id; block_id++)
+       {
+               RelFileNode rnode;
+               ForkNumber forknum;
+               BlockNumber blknum;

Doesn't this potentially create duplicate information in some of the
RM's desc() callbacks, and are we sure that the information provided
is worth having for all the RMs?  As one example, gin_desc() looks at
some of the block information, so there are overlaps.  It may be
worth thinking about showing more information for has_image and
apply_image if a block is in_use?
--
Michael

Attachment

Re: Add information to rm_redo_error_callback()

From
Masahiko Sawada
Date:
On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
> > Thank you for updating the patch!
> >
> > The patch looks good to me. I've set this patch as Ready for Committer.
>
> +       for (block_id = 0; block_id <= record->max_block_id; block_id++)
> +       {
> +               RelFileNode rnode;
> +               ForkNumber forknum;
> +               BlockNumber blknum;
>
> Doesn't this potentially create duplicate information in some of the
> RM's desc() callbacks, and are we sure that the information provided
> is worth having for all the RMs?  As one example, gin_desc() looks at
> some of the block information, so there are overlaps.

Yeah, there is duplicate information in some RMs. I thought that we
can change individual RM’s desc() functions to output the block
information but as long as I see the pg_waldump outputs these are not
annoying to me and many of RM’s desc() doesn’t show the block
information.

>  It may be
> worth thinking about showing more information for has_image and
> apply_image if a block is in_use?

Yes. I’m okay with adding information for has_image and apply_image
but IMHO I'm not sure how these shown in errcontext would help. If an
error related to has_image or apply_image happens, errmsg should show
something detailed information about FPI.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add information to rm_redo_error_callback()

From
"Drouvot, Bertrand"
Date:
Hi,

Thanks for the feedback.

On 8/11/20 12:03 PM, Masahiko Sawada wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>
>
>
> On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote:
>> On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote:
>>> Thank you for updating the patch!
>>>
>>> The patch looks good to me. I've set this patch as Ready for Committer.
>> +       for (block_id = 0; block_id <= record->max_block_id; block_id++)
>> +       {
>> +               RelFileNode rnode;
>> +               ForkNumber forknum;
>> +               BlockNumber blknum;
>>
>> Doesn't this potentially create duplicate information in some of the
>> RM's desc() callbacks, and are we sure that the information provided
>> is worth having for all the RMs?  As one example, gin_desc() looks at
>> some of the block information, so there are overlaps.
> Yeah, there is duplicate information in some RMs. I thought that we
> can change individual RM’s desc() functions to output the block
> information but as long as I see the pg_waldump outputs these are not
> annoying to me and many of RM’s desc() doesn’t show the block
> information.

Having this "pg_waldump" kind of format in this place 
(rm_redo_error_callback()) ensures that we'll always see the same piece 
of information during rm_redo.

I think it's good to guarantee that we'll always see the same piece of 
information (should a new RM desc() be created in the future for 
example), even if it could lead to some information overlap in some cases.

>>   It may be
>> worth thinking about showing more information for has_image and
>> apply_image if a block is in_use?
> Yes. I’m okay with adding information for has_image and apply_image
> but IMHO I'm not sure how these shown in errcontext would help. If an
> error related to has_image or apply_image happens, errmsg should show
> something detailed information about FPI.

I am ok too, but I am also not sure that errcontext is the right place 
for that.

Thanks

Bertrand

>
> Regards,
>
> --
> Masahiko Sawada            http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add information to rm_redo_error_callback()

From
Alvaro Herrera
Date:
On 2020-Aug-17, Drouvot, Bertrand wrote:

> Having this "pg_waldump" kind of format in this place
> (rm_redo_error_callback()) ensures that we'll always see the same piece of
> information during rm_redo.
> 
> I think it's good to guarantee that we'll always see the same piece of
> information (should a new RM desc() be created in the future for example),
> even if it could lead to some information overlap in some cases.

I agree.

I think we should treat the changes to remove rm_desc-specific info
items that are redundant as separate improvements that don't need to
block this patch.  They would be, at worst, only minor annoyances.
And the removal, as was said, can affect other things that we might want
to think about separately.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add information to rm_redo_error_callback()

From
Michael Paquier
Date:
On Mon, Aug 17, 2020 at 05:47:13PM +0200, Drouvot, Bertrand wrote:
> I think it's good to guarantee that we'll always see the same piece of
> information (should a new RM desc() be created in the future for example),
> even if it could lead to some information overlap in some cases.

> I am ok too, but I am also not sure that errcontext is the right place for
> that.

Hmm.  I still think that knowing at least about a FPW could be an
interesting piece of information even here.  Anyway, instead of
copying a logic that exists already in xlog_outrec(), why not moving
the block information print into a separate routine out of the
WAL_DEBUG section, and just reuse the same format for the context of
the redo error callback?  That would also be more consistent with what
we do in pg_waldump where we don't show the fork name of a block when
it is on a MAIN_FORKNUM.  And this would avoid a third copy of the
same logic.  If we add the XID, previous LSN and the record length
on the stack of what is printed, we could just reuse the existing
routine, still that's perhaps too much information displayed.
--
Michael

Attachment

Re: Add information to rm_redo_error_callback()

From
Michael Paquier
Date:
On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote:
> Hmm.  I still think that knowing at least about a FPW could be an
> interesting piece of information even here.  Anyway, instead of
> copying a logic that exists already in xlog_outrec(), why not moving
> the block information print into a separate routine out of the
> WAL_DEBUG section, and just reuse the same format for the context of
> the redo error callback?  That would also be more consistent with what
> we do in pg_waldump where we don't show the fork name of a block when
> it is on a MAIN_FORKNUM.  And this would avoid a third copy of the
> same logic.  If we add the XID, previous LSN and the record length
> on the stack of what is printed, we could just reuse the existing
> routine, still that's perhaps too much information displayed.

Seeing nothing, I took a swing at that, and finished with the
attached that refactors the logic and prints the block information as
wanted.  Any objections to that?
--
Michael

Attachment

Re: Add information to rm_redo_error_callback()

From
"Drouvot, Bertrand"
Date:
Hi,

On 10/1/20 9:41 AM, Michael Paquier wrote:
> On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote:
>> Hmm.  I still think that knowing at least about a FPW could be an
>> interesting piece of information even here.  Anyway, instead of
>> copying a logic that exists already in xlog_outrec(), why not moving
>> the block information print into a separate routine out of the
>> WAL_DEBUG section, and just reuse the same format for the context of
>> the redo error callback?  That would also be more consistent with what
>> we do in pg_waldump where we don't show the fork name of a block when
>> it is on a MAIN_FORKNUM.  And this would avoid a third copy of the
>> same logic.  If we add the XID, previous LSN and the record length
>> on the stack of what is printed, we could just reuse the existing
>> routine, still that's perhaps too much information displayed.
> Seeing nothing, I took a swing at that, and finished with the
> attached that refactors the logic and prints the block information as
> wanted.  Any objections to that?

Sorry for the late reply and thanks for looking at it!

Had a look at it and did a few tests: looks all good to me.

No objections at all, thanks!

Bertrand

> --
> Michael



Re: Add information to rm_redo_error_callback()

From
Michael Paquier
Date:
On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote:
> Had a look at it and did a few tests: looks all good to me.
>
> No objections at all, thanks!

Thanks for double-checking.  Applied, then.
--
Michael

Attachment

Re: Add information to rm_redo_error_callback()

From
"Drouvot, Bertrand"
Date:
On 10/2/20 2:47 AM, Michael Paquier wrote:
> On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote:
>> Had a look at it and did a few tests: looks all good to me.
>>
>> No objections at all, thanks!
> Thanks for double-checking.  Applied, then.

Thanks!

Bertrand