Thread: Add information to rm_redo_error_callback()
Hi hackers, I've attached a small patch to add information to rm_redo_error_callback(). The changes attached in this patch came while working on the "Add information during standby recovery conflicts" patch (See [1]). The goal is to add more information during the callback (if doable), so that something like: 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for Heap2/CLEAN: remxid 1168 would get extra information that way: 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0 As this could be useful outside of [1], a dedicated "sub" patch has been created (thanks Sawada for the suggestion). I will add this patch to the next commitfest. I look forward to your feedback about the idea and/or implementation. Regards, Bertrand [1]: https://commitfest.postgresql.org/29/2604
Attachment
On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi hackers, > > I've attached a small patch to add information to rm_redo_error_callback(). > > The changes attached in this patch came while working on the "Add > information during standby recovery conflicts" patch (See [1]). > > The goal is to add more information during the callback (if doable), so > that something like: > > 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for > Heap2/CLEAN: remxid 1168 > > would get extra information that way: > > 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for > Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0 > > As this could be useful outside of [1], a dedicated "sub" patch has been > created (thanks Sawada for the suggestion). > > I will add this patch to the next commitfest. I look forward to your > feedback about the idea and/or implementation. > Thank you for starting the new thread for this patch! I think this patch is simple enough and improves information shown in errcontext. I have two comments on the patch: diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 756b838e6a..8b2024e9e9 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg) { XLogReaderState *record = (XLogReaderState *) arg; StringInfoData buf; + int block_id; + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum; initStringInfo(&buf); xlog_outdesc(&buf, record); + for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) + appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u", + block_id, rnode.spcNode, rnode.dbNode, + rnode.relNode, forkNames[forknum], + blknum); + } /* translator: %s is a WAL record description */ errcontext("WAL redo at %X/%X for %s", (uint32) (record->ReadRecPtr >> 32), rnode, forknum and blknum can be declared within the for loop. I think it's better to put a new line just before the comment starting from "translator:". Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 8/10/20 7:10 AM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: >> Hi hackers, >> >> I've attached a small patch to add information to rm_redo_error_callback(). >> >> The changes attached in this patch came while working on the "Add >> information during standby recovery conflicts" patch (See [1]). >> >> The goal is to add more information during the callback (if doable), so >> that something like: >> >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for >> Heap2/CLEAN: remxid 1168 >> >> would get extra information that way: >> >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for >> Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0 >> >> As this could be useful outside of [1], a dedicated "sub" patch has been >> created (thanks Sawada for the suggestion). >> >> I will add this patch to the next commitfest. I look forward to your >> feedback about the idea and/or implementation. >> > Thank you for starting the new thread for this patch! > > I think this patch is simple enough and improves information shown in > errcontext. > > I have two comments on the patch: > > diff --git a/src/backend/access/transam/xlog.c > b/src/backend/access/transam/xlog.c > index 756b838e6a..8b2024e9e9 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg) > { > XLogReaderState *record = (XLogReaderState *) arg; > StringInfoData buf; > + int block_id; > + RelFileNode rnode; > + ForkNumber forknum; > + BlockNumber blknum; > > initStringInfo(&buf); > xlog_outdesc(&buf, record); > > + for (block_id = 0; block_id <= record->max_block_id; block_id++) > + { > + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) > + appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u", > + block_id, rnode.spcNode, rnode.dbNode, > + rnode.relNode, forkNames[forknum], > + blknum); > + } > /* translator: %s is a WAL record description */ > errcontext("WAL redo at %X/%X for %s", > (uint32) (record->ReadRecPtr >> 32), > > rnode, forknum and blknum can be declared within the for loop. > > I think it's better to put a new line just before the comment starting > from "translator:". Thanks for looking at it! I've attached a new version as per your comments. Thanks, Bertrand
Attachment
On Tue, 11 Aug 2020 at 00:07, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 8/10/20 7:10 AM, Masahiko Sawada wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > > > > > On Wed, 5 Aug 2020 at 00:37, Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > >> Hi hackers, > >> > >> I've attached a small patch to add information to rm_redo_error_callback(). > >> > >> The changes attached in this patch came while working on the "Add > >> information during standby recovery conflicts" patch (See [1]). > >> > >> The goal is to add more information during the callback (if doable), so > >> that something like: > >> > >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for > >> Heap2/CLEAN: remxid 1168 > >> > >> would get extra information that way: > >> > >> 2020-08-04 14:42:57.545 UTC [15459] CONTEXT: WAL redo at 0/4A3B0DE0 for > >> Heap2/CLEAN: remxid 1168, blkref #0: rel 1663/13586/16850 fork main blk 0 > >> > >> As this could be useful outside of [1], a dedicated "sub" patch has been > >> created (thanks Sawada for the suggestion). > >> > >> I will add this patch to the next commitfest. I look forward to your > >> feedback about the idea and/or implementation. > >> > > Thank you for starting the new thread for this patch! > > > > I think this patch is simple enough and improves information shown in > > errcontext. > > > > I have two comments on the patch: > > > > diff --git a/src/backend/access/transam/xlog.c > > b/src/backend/access/transam/xlog.c > > index 756b838e6a..8b2024e9e9 100644 > > --- a/src/backend/access/transam/xlog.c > > +++ b/src/backend/access/transam/xlog.c > > @@ -11749,10 +11749,22 @@ rm_redo_error_callback(void *arg) > > { > > XLogReaderState *record = (XLogReaderState *) arg; > > StringInfoData buf; > > + int block_id; > > + RelFileNode rnode; > > + ForkNumber forknum; > > + BlockNumber blknum; > > > > initStringInfo(&buf); > > xlog_outdesc(&buf, record); > > > > + for (block_id = 0; block_id <= record->max_block_id; block_id++) > > + { > > + if (XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blknum)) > > + appendStringInfo(&buf,", blkref #%d: rel %u/%u/%u fork %s blk %u", > > + block_id, rnode.spcNode, rnode.dbNode, > > + rnode.relNode, forkNames[forknum], > > + blknum); > > + } > > /* translator: %s is a WAL record description */ > > errcontext("WAL redo at %X/%X for %s", > > (uint32) (record->ReadRecPtr >> 32), > > > > rnode, forknum and blknum can be declared within the for loop. > > > > I think it's better to put a new line just before the comment starting > > from "translator:". > > Thanks for looking at it! > > I've attached a new version as per your comments. Thank you for updating the patch! The patch looks good to me. I've set this patch as Ready for Committer. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote: > Thank you for updating the patch! > > The patch looks good to me. I've set this patch as Ready for Committer. + for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blknum; Doesn't this potentially create duplicate information in some of the RM's desc() callbacks, and are we sure that the information provided is worth having for all the RMs? As one example, gin_desc() looks at some of the block information, so there are overlaps. It may be worth thinking about showing more information for has_image and apply_image if a block is in_use? -- Michael
Attachment
On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote: > > Thank you for updating the patch! > > > > The patch looks good to me. I've set this patch as Ready for Committer. > > + for (block_id = 0; block_id <= record->max_block_id; block_id++) > + { > + RelFileNode rnode; > + ForkNumber forknum; > + BlockNumber blknum; > > Doesn't this potentially create duplicate information in some of the > RM's desc() callbacks, and are we sure that the information provided > is worth having for all the RMs? As one example, gin_desc() looks at > some of the block information, so there are overlaps. Yeah, there is duplicate information in some RMs. I thought that we can change individual RM’s desc() functions to output the block information but as long as I see the pg_waldump outputs these are not annoying to me and many of RM’s desc() doesn’t show the block information. > It may be > worth thinking about showing more information for has_image and > apply_image if a block is in_use? Yes. I’m okay with adding information for has_image and apply_image but IMHO I'm not sure how these shown in errcontext would help. If an error related to has_image or apply_image happens, errmsg should show something detailed information about FPI. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, Thanks for the feedback. On 8/11/20 12:03 PM, Masahiko Sawada wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you canconfirm the sender and know the content is safe. > > > > On Tue, 11 Aug 2020 at 15:30, Michael Paquier <michael@paquier.xyz> wrote: >> On Tue, Aug 11, 2020 at 02:45:50PM +0900, Masahiko Sawada wrote: >>> Thank you for updating the patch! >>> >>> The patch looks good to me. I've set this patch as Ready for Committer. >> + for (block_id = 0; block_id <= record->max_block_id; block_id++) >> + { >> + RelFileNode rnode; >> + ForkNumber forknum; >> + BlockNumber blknum; >> >> Doesn't this potentially create duplicate information in some of the >> RM's desc() callbacks, and are we sure that the information provided >> is worth having for all the RMs? As one example, gin_desc() looks at >> some of the block information, so there are overlaps. > Yeah, there is duplicate information in some RMs. I thought that we > can change individual RM’s desc() functions to output the block > information but as long as I see the pg_waldump outputs these are not > annoying to me and many of RM’s desc() doesn’t show the block > information. Having this "pg_waldump" kind of format in this place (rm_redo_error_callback()) ensures that we'll always see the same piece of information during rm_redo. I think it's good to guarantee that we'll always see the same piece of information (should a new RM desc() be created in the future for example), even if it could lead to some information overlap in some cases. >> It may be >> worth thinking about showing more information for has_image and >> apply_image if a block is in_use? > Yes. I’m okay with adding information for has_image and apply_image > but IMHO I'm not sure how these shown in errcontext would help. If an > error related to has_image or apply_image happens, errmsg should show > something detailed information about FPI. I am ok too, but I am also not sure that errcontext is the right place for that. Thanks Bertrand > > Regards, > > -- > Masahiko Sawada http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-Aug-17, Drouvot, Bertrand wrote: > Having this "pg_waldump" kind of format in this place > (rm_redo_error_callback()) ensures that we'll always see the same piece of > information during rm_redo. > > I think it's good to guarantee that we'll always see the same piece of > information (should a new RM desc() be created in the future for example), > even if it could lead to some information overlap in some cases. I agree. I think we should treat the changes to remove rm_desc-specific info items that are redundant as separate improvements that don't need to block this patch. They would be, at worst, only minor annoyances. And the removal, as was said, can affect other things that we might want to think about separately. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Aug 17, 2020 at 05:47:13PM +0200, Drouvot, Bertrand wrote: > I think it's good to guarantee that we'll always see the same piece of > information (should a new RM desc() be created in the future for example), > even if it could lead to some information overlap in some cases. > I am ok too, but I am also not sure that errcontext is the right place for > that. Hmm. I still think that knowing at least about a FPW could be an interesting piece of information even here. Anyway, instead of copying a logic that exists already in xlog_outrec(), why not moving the block information print into a separate routine out of the WAL_DEBUG section, and just reuse the same format for the context of the redo error callback? That would also be more consistent with what we do in pg_waldump where we don't show the fork name of a block when it is on a MAIN_FORKNUM. And this would avoid a third copy of the same logic. If we add the XID, previous LSN and the record length on the stack of what is printed, we could just reuse the existing routine, still that's perhaps too much information displayed. -- Michael
Attachment
On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote: > Hmm. I still think that knowing at least about a FPW could be an > interesting piece of information even here. Anyway, instead of > copying a logic that exists already in xlog_outrec(), why not moving > the block information print into a separate routine out of the > WAL_DEBUG section, and just reuse the same format for the context of > the redo error callback? That would also be more consistent with what > we do in pg_waldump where we don't show the fork name of a block when > it is on a MAIN_FORKNUM. And this would avoid a third copy of the > same logic. If we add the XID, previous LSN and the record length > on the stack of what is printed, we could just reuse the existing > routine, still that's perhaps too much information displayed. Seeing nothing, I took a swing at that, and finished with the attached that refactors the logic and prints the block information as wanted. Any objections to that? -- Michael
Attachment
Hi, On 10/1/20 9:41 AM, Michael Paquier wrote: > On Thu, Sep 24, 2020 at 03:03:46PM +0900, Michael Paquier wrote: >> Hmm. I still think that knowing at least about a FPW could be an >> interesting piece of information even here. Anyway, instead of >> copying a logic that exists already in xlog_outrec(), why not moving >> the block information print into a separate routine out of the >> WAL_DEBUG section, and just reuse the same format for the context of >> the redo error callback? That would also be more consistent with what >> we do in pg_waldump where we don't show the fork name of a block when >> it is on a MAIN_FORKNUM. And this would avoid a third copy of the >> same logic. If we add the XID, previous LSN and the record length >> on the stack of what is printed, we could just reuse the existing >> routine, still that's perhaps too much information displayed. > Seeing nothing, I took a swing at that, and finished with the > attached that refactors the logic and prints the block information as > wanted. Any objections to that? Sorry for the late reply and thanks for looking at it! Had a look at it and did a few tests: looks all good to me. No objections at all, thanks! Bertrand > -- > Michael
On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote: > Had a look at it and did a few tests: looks all good to me. > > No objections at all, thanks! Thanks for double-checking. Applied, then. -- Michael
Attachment
On 10/2/20 2:47 AM, Michael Paquier wrote: > On Thu, Oct 01, 2020 at 11:18:30AM +0200, Drouvot, Bertrand wrote: >> Had a look at it and did a few tests: looks all good to me. >> >> No objections at all, thanks! > Thanks for double-checking. Applied, then. Thanks! Bertrand