Thread: Add checkpoint and redo LSN to LogCheckpointEnd log message
Hi, It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN and REDO LSN [1]. It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. Attaching a small patch herewith. Thoughts? [1] 2021-12-23 14:58:54.694 UTC [1965649] LOG: checkpoint starting: shutdown immediate 2021-12-23 14:58:54.714 UTC [1965649] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.025 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; LSN=0/14D0AD0, REDO LSN=0/14D0AD0 Regards, Bharath Rupireddy.
Attachment
At Thu, 23 Dec 2021 20:35:54 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > Hi, > > It is useful (for debugging purposes) if the checkpoint end message > has the checkpoint LSN and REDO LSN [1]. It gives more context while > analyzing checkpoint-related issues. The pg_controldata gives the last > checkpoint LSN and REDO LSN, but having this info alongside the log > message helps analyze issues that happened previously, connect the > dots and identify the root cause. > > Attaching a small patch herewith. Thoughts? A big +1 from me. I thought about proposing the same in the past. > [1] > 2021-12-23 14:58:54.694 UTC [1965649] LOG: checkpoint starting: > shutdown immediate > 2021-12-23 14:58:54.714 UTC [1965649] LOG: checkpoint complete: wrote > 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; > write=0.001 s, sync=0.001 s, total=0.025 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; > LSN=0/14D0AD0, REDO LSN=0/14D0AD0 I thougt about something like the following, but your proposal may be clearer. > WAL range=[0/14D0340, 0/14D0AD0] regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Dec 24, 2021 at 11:21 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Thu, 23 Dec 2021 20:35:54 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > Hi, > > > > It is useful (for debugging purposes) if the checkpoint end message > > has the checkpoint LSN and REDO LSN [1]. It gives more context while > > analyzing checkpoint-related issues. The pg_controldata gives the last > > checkpoint LSN and REDO LSN, but having this info alongside the log > > message helps analyze issues that happened previously, connect the > > dots and identify the root cause. > > > > Attaching a small patch herewith. Thoughts? > > A big +1 from me. I thought about proposing the same in the past. Thanks for the review. I've added a CF entry to not lose track - https://commitfest.postgresql.org/36/3474/. > > [1] > > 2021-12-23 14:58:54.694 UTC [1965649] LOG: checkpoint starting: > > shutdown immediate > > 2021-12-23 14:58:54.714 UTC [1965649] LOG: checkpoint complete: wrote > > 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; > > write=0.001 s, sync=0.001 s, total=0.025 s; sync files=0, > > longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; > > LSN=0/14D0AD0, REDO LSN=0/14D0AD0 > > I thougt about something like the following, but your proposal may be > clearer. > > > WAL range=[0/14D0340, 0/14D0AD0] Yeah the proposed in the v1 is clear saying checkpoint/restartpoint LSN and REDO LSN. Regards, Bharath Rupireddy.
On Fri, Dec 24, 2021 at 5:42 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Dec 24, 2021 at 02:51:34PM +0900, Kyotaro Horiguchi wrote: > > I thougt about something like the following, but your proposal may be > > clearer. > > + "LSN=%X/%X, REDO LSN=%X/%X", > This could be rather confusing for the average user, even if I agree > that this is some information that only an advanced user could > understand. Could it be possible to define those fields in a more > deterministic way? For one, it is hard to understand the relationship > between both fields without looking at the code, particulary if both > share the same value. It is at least rather.. Well, mostly, easy to > guess what each other field means in this context, which is not the > case of what you are proposing here. One idea could be use also > "start point" for REDO, for example. How about "location=%X/%X, REDO start location=%X/%X"? The entire log message can look like below: 2021-12-24 12:20:19.140 UTC [1977834] LOG: checkpoint complete:location=%X/%X, REDO start location=%X/%X; wrote 7 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007 s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s; distance=293 kB, estimate=56584 kB Another variant: 2021-12-24 12:20:19.140 UTC [1977834] LOG: checkpoint completed at location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007 s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s; distance=293 kB, estimate=56584 kB 2021-12-24 12:20:19.140 UTC [1977834] LOG: restartpoint completed at location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007 s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s; distance=293 kB, estimate=56584 kB Thoughts? Regards, Bharath Rupireddy.
On Fri, Dec 24, 2021 at 5:54 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Dec 24, 2021 at 5:42 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Fri, Dec 24, 2021 at 02:51:34PM +0900, Kyotaro Horiguchi wrote: > > > I thougt about something like the following, but your proposal may be > > > clearer. > > > > + "LSN=%X/%X, REDO LSN=%X/%X", > > This could be rather confusing for the average user, even if I agree > > that this is some information that only an advanced user could > > understand. Could it be possible to define those fields in a more > > deterministic way? For one, it is hard to understand the relationship > > between both fields without looking at the code, particulary if both > > share the same value. It is at least rather.. Well, mostly, easy to > > guess what each other field means in this context, which is not the > > case of what you are proposing here. One idea could be use also > > "start point" for REDO, for example. > > How about "location=%X/%X, REDO start location=%X/%X"? The entire log > message can look like below: > > 2021-12-24 12:20:19.140 UTC [1977834] LOG: checkpoint > complete:location=%X/%X, REDO start location=%X/%X; wrote 7 buffers > (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, > sync=0.007 s, total=0.192 s; sync files=5, longest=0.005 s, > average=0.002 s; distance=293 kB, estimate=56584 kB > > Another variant: > 2021-12-24 12:20:19.140 UTC [1977834] LOG: checkpoint completed at > location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%); > 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007 > s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s; > distance=293 kB, estimate=56584 kB > 2021-12-24 12:20:19.140 UTC [1977834] LOG: restartpoint completed at > location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%); > 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007 > s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s; > distance=293 kB, estimate=56584 kB Here are the 2 patches. one(v2-1-XXX.patch) adding the info as: 2021-12-28 02:44:34.870 UTC [2384386] LOG: checkpoint complete: location=0/1B03040, REDO start location=0/1B03008; wrote 466 buffers (2.8%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.014 s, sync=0.038 s, total=0.072 s; sync files=21, longest=0.022 s, average=0.002 s; distance=6346 kB, estimate=6346 kB another(v2-2-XXX.patch) adding the info as: 2021-12-28 02:52:24.464 UTC [2394396] LOG: checkpoint completed at location=0/212FFC8 with REDO start location=0/212FF90: wrote 451 buffers (2.8%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.012 s, sync=0.032 s, total=0.071 s; sync files=6, longest=0.022 s, average=0.006 s; distance=6272 kB, estimate=6272 kB attaching v1-0001-XXX from the initial mail again just for the sake of completion: 2021-12-23 14:58:54.714 UTC [1965649] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.025 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB; LSN=0/14D0AD0, REDO LSN=0/14D0AD0 Thoughts? Regards, Bharath Rupireddy.
Attachment
At Tue, 28 Dec 2021 08:26:28 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Fri, Dec 24, 2021 at 5:54 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Fri, Dec 24, 2021 at 5:42 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Fri, Dec 24, 2021 at 02:51:34PM +0900, Kyotaro Horiguchi wrote: > > > > I thougt about something like the following, but your proposal may be > > > > clearer. > > > > > > + "LSN=%X/%X, REDO LSN=%X/%X", > > > This could be rather confusing for the average user, even if I agree > > > that this is some information that only an advanced user could > > > understand. Could it be possible to define those fields in a more > > > deterministic way? For one, it is hard to understand the relationship > > > between both fields without looking at the code, particulary if both > > > share the same value. It is at least rather.. Well, mostly, easy to > > > guess what each other field means in this context, which is not the > > > case of what you are proposing here. One idea could be use also > > > "start point" for REDO, for example. > > > > How about "location=%X/%X, REDO start location=%X/%X"? The entire log > > message can look like below: .. > Thoughts? It seems to me "LSN" or just "location" is more confusing or mysterious than "REDO LSN" for the average user. If we want to avoid being technically too detailed, we would use just "start LSN=%X/%X, end LSN=%X/%X". And it is equivalent to "WAL range=[%X/%X, %X/%X]".. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hi, On Tue, Dec 28, 2021 at 10:56 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > attaching v1-0001-XXX from the initial mail again just for the sake of > completion: Unfortunately this breaks the cfbot as it tries to apply this patch too: http://cfbot.cputube.org/patch_36_3474.log. For this kind of situation I think that the usual solution is to use a .txt extension to make sure that the cfbot won't try to apply it.
cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes: > For this kind of situation I think that the usual solution is to use a > .txt extension to make sure that the cfbot won't try to apply it. Yeah ... this has come up before. Is there a documented way to attach files that the cfbot will ignore? Two specific scenarios seem to be interesting: 1. You are attaching patch(es) plus some non-patch files 2. You are attaching some random files, and would like to not displace the cfbot's idea of the latest patchset. I don't know exactly how to do either of those. regards, tom lane
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Julien Rouhaud
Date:
On Wed, Jan 12, 2022 at 01:19:22AM -0500, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > For this kind of situation I think that the usual solution is to use a > > .txt extension to make sure that the cfbot won't try to apply it. > > Yeah ... this has come up before. Is there a documented way to > attach files that the cfbot will ignore? Not that I know of unfortunately. I think the apply part is done by https://github.com/macdice/cfbot/blob/master/cfbot_patchburner_chroot_ctl.sh#L93-L120, which seems reasonable. So basically any extension outside of those could be used, excluding of course anything clearly suspicious that would be rejected by many email providers. > Two specific scenarios seem to be interesting: > > 1. You are attaching patch(es) plus some non-patch files > > 2. You are attaching some random files, and would like to not > displace the cfbot's idea of the latest patchset. I'm assuming that someone wanting to send an additional patch to be applied on top of the OP patchset is part of 2?
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Tom Lane
Date:
Julien Rouhaud <rjuju123@gmail.com> writes: > On Wed, Jan 12, 2022 at 01:19:22AM -0500, Tom Lane wrote: >> 2. You are attaching some random files, and would like to not >> displace the cfbot's idea of the latest patchset. > I'm assuming that someone wanting to send an additional patch to be applied on > top of the OP patchset is part of 2? AFAIK, if you're submitting a patch then you have to attach a complete patchset, or the cfbot will be totally lost. Getting the bot to understand incremental patches would be useful for sure ... but it's outside the scope of what I'm asking for now, which is just clear documentation of what the bot can do already. regards, tom lane
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Julien Rouhaud
Date:
On Wed, Jan 12, 2022 at 2:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > AFAIK, if you're submitting a patch then you have to attach a complete > patchset, or the cfbot will be totally lost. Getting the bot to > understand incremental patches would be useful for sure ... but it's > outside the scope of what I'm asking for now, which is just clear > documentation of what the bot can do already. Right, but the use case I'm mentioning is a bit different: provide another patch while *not* triggering the cfbot. I've seen cases in the past where people want to share some code to the OP and it seems reasonable to allow that without risking the trigger the cfbot, at least not without the OP validating or adapting the changes.
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Thomas Munro
Date:
On Wed, Jan 12, 2022 at 7:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Wed, Jan 12, 2022 at 01:19:22AM -0500, Tom Lane wrote: > >> 2. You are attaching some random files, and would like to not > >> displace the cfbot's idea of the latest patchset. > > > I'm assuming that someone wanting to send an additional patch to be applied on > > top of the OP patchset is part of 2? > > AFAIK, if you're submitting a patch then you have to attach a complete > patchset, or the cfbot will be totally lost. Getting the bot to > understand incremental patches would be useful for sure ... but it's > outside the scope of what I'm asking for now, which is just clear > documentation of what the bot can do already. By way of documentation, I've just now tried to answer these question in the new FAQ at: https://wiki.postgresql.org/wiki/Cfbot
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes: > By way of documentation, I've just now tried to answer these question > in the new FAQ at: > https://wiki.postgresql.org/wiki/Cfbot Thanks! regards, tom lane
Re: cfbot wrangling (was Re: Add checkpoint and redo LSN to LogCheckpointEnd log message)
From
Julien Rouhaud
Date:
On Wed, Jan 12, 2022 at 2:52 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > By way of documentation, I've just now tried to answer these question > in the new FAQ at: > > https://wiki.postgresql.org/wiki/Cfbot Great! Thanks a lot!
On Wed, Jan 12, 2022 at 11:39 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > Hi, > > On Tue, Dec 28, 2021 at 10:56 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > attaching v1-0001-XXX from the initial mail again just for the sake of > > completion: > > Unfortunately this breaks the cfbot as it tries to apply this patch > too: http://cfbot.cputube.org/patch_36_3474.log. > > For this kind of situation I think that the usual solution is to use a > .txt extension to make sure that the cfbot won't try to apply it. Thanks. IMO, the following format of logging is better, so attaching the v2-0001-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch as .patch 2021-12-28 02:52:24.464 UTC [2394396] LOG: checkpoint completed at location=0/212FFC8 with REDO start location=0/212FF90: wrote 451 buffers (2.8%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.012 s, sync=0.032 s, total=0.071 s; sync files=6, longest=0.022 s, average=0.006 s; distance=6272 kB, estimate=6272 kB Others are attached as .txt files. Regards, Bharath Rupireddy.
Attachment
On Thu, Jan 13, 2022 at 11:50 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Thanks. IMO, the following format of logging is better, so attaching > the v2-0001-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch as > .patch > > 2021-12-28 02:52:24.464 UTC [2394396] LOG: checkpoint completed at > location=0/212FFC8 with REDO start location=0/212FF90: wrote 451 > buffers (2.8%); 0 WAL file(s) added, 0 removed, 1 recycled; > write=0.012 s, sync=0.032 s, total=0.071 s; sync files=6, > longest=0.022 s, average=0.006 s; distance=6272 kB, estimate=6272 kB One of the test cases was failing with the above style of the log message, changing "checkpoint complete" to "checkpoint completed at location" doesn't seem to be a better idea. It looks like the users or the log monitoring tools might be using the same text "checkpoint complete", therefore I don't want to break that. Here's the v3 patch that I think will work better. Please review. Regards, Bharath Rupireddy.
Attachment
On 1/3/22, 5:52 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote: > It seems to me "LSN" or just "location" is more confusing or > mysterious than "REDO LSN" for the average user. If we want to avoid > being technically too detailed, we would use just "start LSN=%X/%X, > end LSN=%X/%X". And it is equivalent to "WAL range=[%X/%X, %X/%X]".. My first instinct was that this should stay aligned with pg_controldata, but that would mean using "location=%X/%X, REDO location=%X/%X," which doesn't seem terribly descriptive. IIUC the "checkpoint location" is the LSN of the WAL record for the checkpoint, and the "checkpoint's REDO location" is the LSN where checkpoint creation began (i.e., what you must retain for crash recovery). My vote is for "start=%X/%X, end=%X/%X." Nathan
At Thu, 20 Jan 2022 00:36:32 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in > On 1/3/22, 5:52 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote: > > It seems to me "LSN" or just "location" is more confusing or > > mysterious than "REDO LSN" for the average user. If we want to avoid > > being technically too detailed, we would use just "start LSN=%X/%X, > > end LSN=%X/%X". And it is equivalent to "WAL range=[%X/%X, %X/%X]".. > > My first instinct was that this should stay aligned with > pg_controldata, but that would mean using "location=%X/%X, REDO > location=%X/%X," which doesn't seem terribly descriptive. IIUC the > "checkpoint location" is the LSN of the WAL record for the checkpoint, > and the "checkpoint's REDO location" is the LSN where checkpoint > creation began (i.e., what you must retain for crash recovery). My > vote is for "start=%X/%X, end=%X/%X." +1. Works for me. %X/%X itself expresses it is an LSN. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Jan 20, 2022 at 6:06 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > On 1/3/22, 5:52 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote: > > It seems to me "LSN" or just "location" is more confusing or > > mysterious than "REDO LSN" for the average user. If we want to avoid > > being technically too detailed, we would use just "start LSN=%X/%X, > > end LSN=%X/%X". And it is equivalent to "WAL range=[%X/%X, %X/%X]".. > > My first instinct was that this should stay aligned with > pg_controldata, but that would mean using "location=%X/%X, REDO > location=%X/%X," which doesn't seem terribly descriptive. IIUC the > "checkpoint location" is the LSN of the WAL record for the checkpoint, > and the "checkpoint's REDO location" is the LSN where checkpoint > creation began (i.e., what you must retain for crash recovery). My > vote is for "start=%X/%X, end=%X/%X." I'm still not clear how the REDO location can be treated as a start LSN? Can someone throw some light one what this checkpoint's REDO location is? Regards, Bharath Rupireddy.
On Thu, Jan 27, 2022 at 08:37:37PM +0530, Bharath Rupireddy wrote: > I'm still not clear how the REDO location can be treated as a start > LSN? Can someone throw some light one what this checkpoint's REDO > location is? It's the WAL insert location at the time the checkpoint began (i.e., where you need to begin replaying WAL from after a crash). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com/
On Thu, Jan 20, 2022 at 8:30 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Thu, 20 Jan 2022 00:36:32 +0000, "Bossart, Nathan" <bossartn@amazon.com> wrote in > > On 1/3/22, 5:52 PM, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote: > > > It seems to me "LSN" or just "location" is more confusing or > > > mysterious than "REDO LSN" for the average user. If we want to avoid > > > being technically too detailed, we would use just "start LSN=%X/%X, > > > end LSN=%X/%X". And it is equivalent to "WAL range=[%X/%X, %X/%X]".. > > > > My first instinct was that this should stay aligned with > > pg_controldata, but that would mean using "location=%X/%X, REDO > > location=%X/%X," which doesn't seem terribly descriptive. IIUC the > > "checkpoint location" is the LSN of the WAL record for the checkpoint, > > and the "checkpoint's REDO location" is the LSN where checkpoint > > creation began (i.e., what you must retain for crash recovery). My > > vote is for "start=%X/%X, end=%X/%X." > > +1. Works for me. %X/%X itself expresses it is an LSN. Thanks for the review. Here's the v4 patch, please have a look. Let's not attempt to change how pg_controldata (tool and core functions) emit the start and end LSN as checkpoint_lsn/redo_lsn and checkpoint location/checkpoint's REDO location. [1] 2022-01-28 03:06:10.213 UTC [2409486] LOG: checkpoint starting: immediate force wait 2022-01-28 03:06:10.257 UTC [2409486] LOG: checkpoint complete: start=0/14D9510, end=0/14D9548; wrote 4 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.008 s, total=0.044 s; sync files=3, longest=0.005 s, average=0.003 s; distance=0 kB, estimate=0 kB 2022-01-28 03:06:42.254 UTC [2409486] LOG: checkpoint starting: immediate force wait 2022-01-28 03:06:42.279 UTC [2409486] LOG: checkpoint complete: start=0/14DBDB8, end=0/14DBDF0; wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.004 s, total=0.025 s; sync files=2, longest=0.003 s, average=0.002 s; distance=10 kB, estimate=10 kB Regards, Bharath Rupireddy.
Attachment
On Fri, Jan 28, 2022 at 08:43:36AM +0530, Bharath Rupireddy wrote: > 2022-01-28 03:06:10.213 UTC [2409486] LOG: checkpoint starting: > immediate force wait > 2022-01-28 03:06:10.257 UTC [2409486] LOG: checkpoint complete: > start=0/14D9510, end=0/14D9548; wrote 4 buffers (0.0%); 0 WAL file(s) > added, 0 removed, 0 recycled; write=0.007 s, sync=0.008 s, total=0.044 > s; sync files=3, longest=0.005 s, average=0.003 s; distance=0 kB, > estimate=0 kB > > 2022-01-28 03:06:42.254 UTC [2409486] LOG: checkpoint starting: > immediate force wait > 2022-01-28 03:06:42.279 UTC [2409486] LOG: checkpoint complete: > start=0/14DBDB8, end=0/14DBDF0; wrote 2 buffers (0.0%); 0 WAL file(s) > added, 0 removed, 0 recycled; write=0.004 s, sync=0.004 s, total=0.025 > s; sync files=2, longest=0.003 s, average=0.002 s; distance=10 kB, > estimate=10 kB I know I voted for "start=%X/%X, end=%X/%X," but looking at this again, I wonder if it could be misleading. "start" is the redo location, and "end" is the location of the checkpoint record, but I could understand why someone might think that "start" is the location of the previous checkpoint record and "end" is the redo location of the new one. I think your original idea of "lsn=%X/%X, redo lsn=%X/%X" could be a good alternative. Іn any case, this patch is small and otherwise looks reasonable to me, so I am going to mark it as ready-for-committer. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com/
On Fri, Jan 28, 2022 at 11:16 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > I know I voted for "start=%X/%X, end=%X/%X," but looking at this again, I > wonder if it could be misleading. "start" is the redo location, and "end" > is the location of the checkpoint record, but I could understand why > someone might think that "start" is the location of the previous checkpoint > record and "end" is the redo location of the new one. I think your > original idea of "lsn=%X/%X, redo lsn=%X/%X" could be a good alternative. > > Іn any case, this patch is small and otherwise looks reasonable to me, so I > am going to mark it as ready-for-committer. Thanks for your review. In summary, we have these options to choose checkpoint LSN and last REDO LSN: 1) "start=%X/%X, end=%X/%X" (ControlFile->checkPointCopy.redo, ControlFile->checkPoint) 2) "lsn=%X/%X, redo lsn=%X/%X" 3) "location=%X/%X, REDO location=%X/%X" -- similar to what pg_controldata and pg_control_checkpoint shows currently. 4) "location=%X/%X, REDO start location=%X/%X" I will leave that decision to the committer. Regards, Bharath Rupireddy.
Greetings, * Bharath Rupireddy (bharath.rupireddyforpostgres@gmail.com) wrote: > On Fri, Jan 28, 2022 at 11:16 AM Nathan Bossart > <nathandbossart@gmail.com> wrote: > > I know I voted for "start=%X/%X, end=%X/%X," but looking at this again, I > > wonder if it could be misleading. "start" is the redo location, and "end" > > is the location of the checkpoint record, but I could understand why > > someone might think that "start" is the location of the previous checkpoint > > record and "end" is the redo location of the new one. I think your > > original idea of "lsn=%X/%X, redo lsn=%X/%X" could be a good alternative. > > > > Іn any case, this patch is small and otherwise looks reasonable to me, so I > > am going to mark it as ready-for-committer. > > Thanks for your review. In summary, we have these options to choose > checkpoint LSN and last REDO LSN: > > 1) "start=%X/%X, end=%X/%X" (ControlFile->checkPointCopy.redo, > ControlFile->checkPoint) > 2) "lsn=%X/%X, redo lsn=%X/%X" > 3) "location=%X/%X, REDO location=%X/%X" -- similar to what > pg_controldata and pg_control_checkpoint shows currently. > 4) "location=%X/%X, REDO start location=%X/%X" > > I will leave that decision to the committer. I'd also vote for #2. Regarding 3 and 4, I'd argue that those should have been changed when we changed a number of other things from the generic 'location' to be 'lsn' in system views and functions, and therefore we should go change those to also specify 'lsn' rather than just saying 'location'. Thanks, Stephen
Attachment
On Mon, Jan 31, 2022 at 8:41 PM Stephen Frost <sfrost@snowman.net> wrote: > > > Thanks for your review. In summary, we have these options to choose > > checkpoint LSN and last REDO LSN: > > > > 1) "start=%X/%X, end=%X/%X" (ControlFile->checkPointCopy.redo, > > ControlFile->checkPoint) > > 2) "lsn=%X/%X, redo lsn=%X/%X" > > 3) "location=%X/%X, REDO location=%X/%X" -- similar to what > > pg_controldata and pg_control_checkpoint shows currently. > > 4) "location=%X/%X, REDO start location=%X/%X" > > > > I will leave that decision to the committer. > > I'd also vote for #2. Regarding 3 and 4, I'd argue that those should > have been changed when we changed a number of other things from the > generic 'location' to be 'lsn' in system views and functions, and > therefore we should go change those to also specify 'lsn' rather than > just saying 'location'. Thanks. Here are 2 patches, 0001 for adding checkpoint lsn and redo lsn in the checkpoint completed message and 0002 for changing the "location" to LSN in pg_controdata's output. With the 0002, pg_control_checkpont, pg_controldata and checkpoint completed message will all be in sync with the checkpoint lsn and redo lsn. Regards, Bharath Rupireddy.
Attachment
On Mon, Jan 31, 2022 at 11:45:19PM +0530, Bharath Rupireddy wrote: > Thanks. Here are 2 patches, 0001 for adding checkpoint lsn and redo > lsn in the checkpoint completed message and 0002 for changing the > "location" to LSN in pg_controdata's output. With the 0002, > pg_control_checkpont, pg_controldata and checkpoint completed message > will all be in sync with the checkpoint lsn and redo lsn. I think the pg_controldata change needs some extra spaces for alignment, but otherwise these patches seem reasonable to me. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Feb 1, 2022 at 12:00 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Jan 31, 2022 at 11:45:19PM +0530, Bharath Rupireddy wrote: > > Thanks. Here are 2 patches, 0001 for adding checkpoint lsn and redo > > lsn in the checkpoint completed message and 0002 for changing the > > "location" to LSN in pg_controdata's output. With the 0002, > > pg_control_checkpont, pg_controldata and checkpoint completed message > > will all be in sync with the checkpoint lsn and redo lsn. > > I think the pg_controldata change needs some extra spaces for alignment, > but otherwise these patches seem reasonable to me. Thanks. My bad it was. Changed in v6. Regards, Bharath Rupireddy.
Attachment
On Tue, Feb 01, 2022 at 12:23:10AM +0530, Bharath Rupireddy wrote: > On Tue, Feb 1, 2022 at 12:00 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >> I think the pg_controldata change needs some extra spaces for alignment, >> but otherwise these patches seem reasonable to me. > > Thanks. My bad it was. Changed in v6. LGTM -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 2022/02/01 10:44, Nathan Bossart wrote: > On Tue, Feb 01, 2022 at 12:23:10AM +0530, Bharath Rupireddy wrote: >> On Tue, Feb 1, 2022 at 12:00 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >>> I think the pg_controldata change needs some extra spaces for alignment, >>> but otherwise these patches seem reasonable to me. >> >> Thanks. My bad it was. Changed in v6. - (errmsg("restartpoint complete: wrote %d buffers (%.1f%%); " + (errmsg("restartpoint complete: lsn=%X/%X, redo lsn=%X/%X; " + "wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " "distance=%d kB, estimate=%d kB", + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo), + LSN_FORMAT_ARGS(ControlFile->checkPoint), The order of arguments for LSN seems wrong. LSN_FORMAT_ARGS(ControlFile->checkPoint) should be specified ahead of LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)? Could you tell me why the information for LSN is reported earlierly in the log message? Since ordinally users would be moreinterested in the information about I/O by checkpoint, the information for LSN should be placed later? Sorry if thiswas already discussed. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Tue, Feb 1, 2022 at 9:10 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > The order of arguments for LSN seems wrong. LSN_FORMAT_ARGS(ControlFile->checkPoint) should be specified ahead of LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)? Thanks. Corrected. > Could you tell me why the information for LSN is reported earlierly in the log message? Since ordinally users would bemore interested in the information about I/O by checkpoint, the information for LSN should be placed later? Sorry if thiswas already discussed. It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. Regards, Bharath Rupireddy.
Attachment
On 2022/02/01 13:01, Bharath Rupireddy wrote: > On Tue, Feb 1, 2022 at 9:10 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> The order of arguments for LSN seems wrong. LSN_FORMAT_ARGS(ControlFile->checkPoint) should be specified ahead of LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)? > > Thanks. Corrected. Thanks! >> Could you tell me why the information for LSN is reported earlierly in the log message? Since ordinally users would bemore interested in the information about I/O by checkpoint, the information for LSN should be placed later? Sorry if thiswas already discussed. > > It is useful (for debugging purposes) if the checkpoint end message > has the checkpoint LSN(end) and REDO LSN(start). It gives more context > while analyzing checkpoint-related issues. The pg_controldata gives > the last checkpoint LSN and REDO LSN, but having this info alongside > the log message helps analyze issues that happened previously, connect > the dots and identify the root cause. My previous comment was confusing... Probably I understand why you tried to put this information in checkpoint log message.But I was suggesting to put that information at the end of log message instead of the beginning of it. Because ordinaryusers would be less interested in this LSN information than other ones like the number of buffers written. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Tue, Feb 1, 2022 at 9:49 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > My previous comment was confusing... Probably I understand why you tried to put this information in checkpoint log message.But I was suggesting to put that information at the end of log message instead of the beginning of it. Because ordinaryusers would be less interested in this LSN information than other ones like the number of buffers written. Actually, there's no strong reason to put LSN info at the beginning of the message except that LSN/REDO LSN next to the checkpoint/restartpoint complete would make the users understand the LSN and REDO LSN belong to the checkpoint/restartpoint. Since this wasn't a strong reason, I agree to keep it at the end. Modified in v8. [1] 2022-02-01 04:34:17.657 UTC [3597073] LOG: checkpoint complete: wrote 21 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.008 s, total=0.031 s; sync files=18, longest=0.006 s, average=0.001 s; distance=77 kB, estimate=77 kB; lsn=0/14D5AF0, redo lsn=0/14D5AB8 Regards, Bharath Rupireddy.
Attachment
At Tue, 1 Feb 2022 10:08:04 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Tue, Feb 1, 2022 at 9:49 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > My previous comment was confusing... Probably I understand why you tried to put this information in checkpoint log message.But I was suggesting to put that information at the end of log message instead of the beginning of it. Because ordinaryusers would be less interested in this LSN information than other ones like the number of buffers written. > > Actually, there's no strong reason to put LSN info at the beginning of > the message except that LSN/REDO LSN next to the > checkpoint/restartpoint complete would make the users understand the > LSN and REDO LSN belong to the checkpoint/restartpoint. Since this > wasn't a strong reason, I agree to keep it at the end. > > Modified in v8. > > [1] > 2022-02-01 04:34:17.657 UTC [3597073] LOG: checkpoint complete: wrote > 21 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; > write=0.004 s, sync=0.008 s, total=0.031 s; sync files=18, > longest=0.006 s, average=0.001 s; distance=77 kB, estimate=77 kB; > lsn=0/14D5AF0, redo lsn=0/14D5AB8 0001 looks good to me. I tend to agree to 0002. FWIW, I collected other user-facing usage of "location" as LSN. xlog.c:5965, 6128: (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", xlog.c:6718: (errmsg("control file contains invalid checkpoint location"))); xlog.c:6846: (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", xlog.c:6929: (errmsg("could not find redo location referenced by checkpoint record"), xlog.c:11298, 11300: (in backup-label) appendStringInfo(labelfile, "START WAL LOCATION: %X/%X (file %s)\n", appendStringInfo(labelfile, "CHECKPOINT LOCATION: %X/%X\n", (and corresponding reader-code) xlog,c:11791, 11793: (in backup history file) fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n", fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n", (and corresponding reader-code) repl_scanner.l:151: yyerror("invalid streaming start location"); pg_proc.dat: many function descriptions use "location" as LSN. pg_waldump.c:768,777,886,938,1029,1071,1083: printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); pg_log_error("could not parse end WAL location \"%s\"", pg_log_error("could not parse start WAL location \"%s\"", pg_log_error("start WAL location %X/%X is not inside file \"%s\"", pg_log_error("end WAL location %X/%X is not inside file \"%s\"", pg_log_error("no start WAL location given"); pg_basebackup.c:476, 615: (confusing with file/directory path..) pg_log_error("could not parse write-ahead log location \"%s\"", pg_log_error("could not parse write-ahead log location \"%s\"", pg_rewind.c:346: pg_log_info("servers diverged at WAL location %X/%X on timeline %u", pg_rewind/timeline.c:82: pg_log_error("Expected a write-ahead log switchpoint location."); regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Feb 1, 2022 at 11:58 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > Modified in v8. > > 0001 looks good to me. > > I tend to agree to 0002. Thanks. > FWIW, I collected other user-facing usage of "location" as LSN. > > xlog.c:11298, 11300: (in backup-label) > appendStringInfo(labelfile, "START WAL LOCATION: %X/%X (file %s)\n", > appendStringInfo(labelfile, "CHECKPOINT LOCATION: %X/%X\n", > (and corresponding reader-code) > > xlog,c:11791, 11793: (in backup history file) > fprintf(fp, "START WAL LOCATION: %X/%X (file %s)\n", > fprintf(fp, "STOP WAL LOCATION: %X/%X (file %s)\n", > (and corresponding reader-code) I tried to change the "location" to "lsn" in most of the user-facing messages/text. I refrained from changing the bakup_label file content (above) as it might break many applications/service layer code and it's not good for backward compatibility. Attaching the above changes 0003 (0001 and 0002 remain the same). If the committer doesn't agree on the text or wording in 0003, I would like the 0001 and 0002 to be taken here and I can start a new thread for discussing 0003 separately. Regards, Bharath Rupireddy.
Attachment
On 2022/02/01 22:03, Bharath Rupireddy wrote: > On Tue, Feb 1, 2022 at 11:58 AM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: >>> Modified in v8. >> >> 0001 looks good to me. I found that CreateRestartPoint() already reported the redo lsn as follows after emitting the restartpoint log message. Toavoid duplicated logging of the same information, we should update this code? ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", LSN_FORMAT_ARGS(lastCheckPoint.redo)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, LogCheckpointEnd() reports ControlFile->checkPointCopy.redo.They may be different, for example, when the current DB state is not DB_IN_ARCHIVE_RECOVERY.In this case, which lsn should we report as redo lsn? + "lsn=%X/%X, redo lsn=%X/%X", Originally you proposed to use upper cases for "lsn". But the latest patch uses the lower cases. Why? It seems better touse upper cases, i.e., LSN and REDO LSN because LSN is basically used in other errmsg(). > Attaching the above changes 0003 (0001 and 0002 remain the same). If > the committer doesn't agree on the text or wording in 0003, I would > like the 0001 and 0002 to be taken here and I can start a new thread > for discussing 0003 separately. Personally I'm ok with 001, but regarding 0002 and 0003 patches, I'm not sure if it's really worth replacing "location" with"lsn" there. BTW, the similar idea was proposed at [1] before, but seems "location" was left as it was. [1] https://postgr.es/m/20487.1494514594@sss.pgh.pa.us Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Greetings, * Fujii Masao (masao.fujii@oss.nttdata.com) wrote: > On 2022/02/01 22:03, Bharath Rupireddy wrote: > >On Tue, Feb 1, 2022 at 11:58 AM Kyotaro Horiguchi > ><horikyota.ntt@gmail.com> wrote: > >>>Modified in v8. > >> > >>0001 looks good to me. > > I found that CreateRestartPoint() already reported the redo lsn as follows after emitting the restartpoint log message.To avoid duplicated logging of the same information, we should update this code? > > ereport((log_checkpoints ? LOG : DEBUG2), > (errmsg("recovery restart point at %X/%X", > LSN_FORMAT_ARGS(lastCheckPoint.redo)), > xtime ? errdetail("Last completed transaction was at log time %s.", > timestamptz_to_str(xtime)) : 0)); > > This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, LogCheckpointEnd() reports ControlFile->checkPointCopy.redo.They may be different, for example, when the current DB state is not DB_IN_ARCHIVE_RECOVERY.In this case, which lsn should we report as redo lsn? > > + "lsn=%X/%X, redo lsn=%X/%X", > > Originally you proposed to use upper cases for "lsn". But the latest patch uses the lower cases. Why? It seems better touse upper cases, i.e., LSN and REDO LSN because LSN is basically used in other errmsg(). We do use 'lsn=' quite a bit in verify_nbtree.c already and lowercase is also what's in the various functions and views in the catalog in the database, of course. I don't see even one usage of "LSN=" in the tree today. We also use 'lsn %X/%X' in replorigindesc.c, xactdesc.c, xactdesc.c, tablesync.c, standby.c, parsexlog.c, then 'redo %X/%X' in xactdesc.c. xlog.c does have a number of "WAL location (LSN)", along with a bunch of other usages. logical.c uses both "LSN" and "lsn". worker.c uses "LSN". slot.c uses "restart_lsn". pg_rewind.c uses "WAL location" while pg_waldump.c uses, "lsn:", "WAL location", and "WAL record". Overall, we don't seem to be super consistent, but I'd say that 'lsn=' looks the best, to my eyes anyway, and isn't out of place in the code base. Lowercase seems to generally be more common overall. > >Attaching the above changes 0003 (0001 and 0002 remain the same). If > >the committer doesn't agree on the text or wording in 0003, I would > >like the 0001 and 0002 to be taken here and I can start a new thread > >for discussing 0003 separately. > > Personally I'm ok with 001, but regarding 0002 and 0003 patches, I'm not sure if it's really worth replacing "location"with "lsn" there. BTW, the similar idea was proposed at [1] before, but seems "location" was left as it was. > > [1] > https://postgr.es/m/20487.1494514594@sss.pgh.pa.us This discussion strikes me as sufficient reason to make the change, with the prior comment not really having all that much weight. When we're actually pretty consistent with one term, having random places where we are inconsistent leads people to be unsure about which way to go and then we end up having to have this discussion. Would be great to avoid having to have it again in the future. Thanks, Stephen
Attachment
On Tue, Feb 1, 2022 at 9:39 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > I found that CreateRestartPoint() already reported the redo lsn as follows after emitting the restartpoint log message.To avoid duplicated logging of the same information, we should update this code? > > ereport((log_checkpoints ? LOG : DEBUG2), > (errmsg("recovery restart point at %X/%X", > LSN_FORMAT_ARGS(lastCheckPoint.redo)), > xtime ? errdetail("Last completed transaction was at log time %s.", > timestamptz_to_str(xtime)) : 0)); > > This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, LogCheckpointEnd() reports ControlFile->checkPointCopy.redo.They may be different, for example, when the current DB state is not DB_IN_ARCHIVE_RECOVERY.In this case, which lsn should we report as redo lsn? Do we ever reach CreateRestartPoint when ControlFile->stat != DB_IN_ARCHIVE_RECOVERY? Assert(ControlFile->state == DB_IN_ARCHIVE_RECOVERY); in CreateRestartPoint doesn't fail any regression tests. Here's what can happen: lastCheckPoint.redo is 100 and ControlFile->checkPointCopy.redo is 105, so, "skipping restartpoint, already performed at %X/%X" LogCheckpointEnd isn't reached lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100 and ControlFile->state == DB_IN_ARCHIVE_RECOVERY, then the control file gets updated and LogCheckpointEnd prints the right redo lsn. lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100 and ControlFile->state != DB_IN_ARCHIVE_RECOVERY, the the control file doesn't get updated and LogCheckpointEnd just prints the control redo lsn. Looks like this case is rare given Assert(ControlFile->state == DB_IN_ARCHIVE_RECOVERY); doesn't fail any tests. I think we should just let LogCheckpointEnd print the redo lsn from the control file. We can just remove the above errmsg("recovery restart point at %X/%X" message altogether or just print it only in the rare scenario, something like below: if (ControlFile->state != DB_IN_ARCHIVE_RECOVERY) { ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("performed recovery restart point at %X/%X while the database state is %s", LSN_FORMAT_ARGS(lastCheckPoint.redo)), getDBState(ControlFile->state))); } And the last commit/abort records's timestamp will always get logged even before we reach here in the main redo loop (errmsg("last completed transaction was at log time %s"). Or another way is to just pass the redo lsn to LogCheckpointEnd and pass the lastCheckPoint.redo in if (ControlFile->state != DB_IN_ARCHIVE_RECOVERY) cases or when control file isn't updated but restart point happened. Thoughts? Regards, Bharath Rupireddy.
On 2022/02/02 23:46, Bharath Rupireddy wrote: > On Tue, Feb 1, 2022 at 9:39 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >> I found that CreateRestartPoint() already reported the redo lsn as follows after emitting the restartpoint log message.To avoid duplicated logging of the same information, we should update this code? >> >> ereport((log_checkpoints ? LOG : DEBUG2), >> (errmsg("recovery restart point at %X/%X", >> LSN_FORMAT_ARGS(lastCheckPoint.redo)), >> xtime ? errdetail("Last completed transaction was at log time %s.", >> timestamptz_to_str(xtime)) : 0)); >> >> This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, LogCheckpointEnd() reports ControlFile->checkPointCopy.redo.They may be different, for example, when the current DB state is not DB_IN_ARCHIVE_RECOVERY.In this case, which lsn should we report as redo lsn? > > Do we ever reach CreateRestartPoint when ControlFile->stat != > DB_IN_ARCHIVE_RECOVERY? Assert(ControlFile->state == > DB_IN_ARCHIVE_RECOVERY); in CreateRestartPoint doesn't fail any > regression tests. ISTM that CreateRestartPoint() can reach the condition ControlFile->state != DB_IN_ARCHIVE_RECOVERY. Please imagine the casewhere CreateRestartPoint() has already started and calls CheckPointGuts(). If the standby server is promoted while CreateRestartPoint()is flushing the data to disk at CheckPointGuts(), the state would be updated to DB_IN_PRODUCTION andCreateRestartPoint() can see the state != DB_IN_ARCHIVE_RECOVERY later. As far as I read the code, this case seems to be able to make the server unrecoverable. If this case happens, since pg_controlis not updated, pg_control still indicates the REDO LSN of last valid restartpoint. But CreateRestartPoint() seemsto delete old WAL files based on its "current" REDO LSN not pg_control's REDO LSN. That is, WAL files required for thecrash recovery starting from pg_control's REDO LSN would be removed. If this understanding is right, to address this issue, probably we need to make CreateRestartPoint() do nothing (return immediately)when the state != DB_IN_ARCHIVE_RECOVERY? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Thu, 3 Feb 2022 13:59:03 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/02/02 23:46, Bharath Rupireddy wrote: > > On Tue, Feb 1, 2022 at 9:39 PM Fujii Masao > > <masao.fujii@oss.nttdata.com> wrote: > >> I found that CreateRestartPoint() already reported the redo lsn as > >> follows after emitting the restartpoint log message. To avoid > >> duplicated logging of the same information, we should update this > >> code? > >> > >> ereport((log_checkpoints ? LOG : DEBUG2), > >> (errmsg("recovery restart point at %X/%X", > >> LSN_FORMAT_ARGS(lastCheckPoint.redo)), > >> xtime ? errdetail("Last completed transaction was > >> at log time %s.", > >> timestamptz_to_str(xtime)) > >> : 0)); > >> > >> This code reports lastCheckPoint.redo as redo lsn. OTOH, with the > >> patch, LogCheckpointEnd() reports > >> ControlFile->checkPointCopy.redo. They may be different, for example, > >> when the current DB state is not DB_IN_ARCHIVE_RECOVERY. In this case, > >> which lsn should we report as redo lsn? > > Do we ever reach CreateRestartPoint when ControlFile->stat != > > DB_IN_ARCHIVE_RECOVERY? Assert(ControlFile->state == > > DB_IN_ARCHIVE_RECOVERY); in CreateRestartPoint doesn't fail any > > regression tests. > > ISTM that CreateRestartPoint() can reach the condition > ControlFile->state != DB_IN_ARCHIVE_RECOVERY. Please imagine the case > where CreateRestartPoint() has already started and calls > CheckPointGuts(). If the standby server is promoted while > CreateRestartPoint() is flushing the data to disk at CheckPointGuts(), > the state would be updated to DB_IN_PRODUCTION and > CreateRestartPoint() can see the state != DB_IN_ARCHIVE_RECOVERY > later. By the way a comment there: > * this is a quick hack to make sure nothing really bad happens if somehow > * we get here after the end-of-recovery checkpoint. now looks a bit wrong since now it's normal that a restartpoint ends after promotion. > As far as I read the code, this case seems to be able to make the > server unrecoverable. If this case happens, since pg_control is not > updated, pg_control still indicates the REDO LSN of last valid > restartpoint. But CreateRestartPoint() seems to delete old WAL files > based on its "current" REDO LSN not pg_control's REDO LSN. That is, > WAL files required for the crash recovery starting from pg_control's > REDO LSN would be removed. Seems right. (I didn't confirm the behavior, though ) > If this understanding is right, to address this issue, probably we > need to make CreateRestartPoint() do nothing (return immediately) when > the state != DB_IN_ARCHIVE_RECOVERY? On way to take. In that case should we log something like "Restartpoint canceled" or something? By the way, restart point should start only while recoverying, and at the timeof the start both checkpoint.redo and checkpoint LSN are already past. We shouldn't update minRecovery point after promotion, but is there any reason for not updating the checkPoint and checkPointCopy? If we update them after promotion, the which-LSN-to-show problem would be gone. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2022/02/03 15:50, Kyotaro Horiguchi wrote: > On way to take. In that case should we log something like > "Restartpoint canceled" or something? +1 > By the way, restart point should start only while recoverying, and at > the timeof the start both checkpoint.redo and checkpoint LSN are > already past. We shouldn't update minRecovery point after promotion, > but is there any reason for not updating the checkPoint and > checkPointCopy? If we update them after promotion, the > which-LSN-to-show problem would be gone. I tried to find the reason by reading the past discussion, but have not found that yet. If we update checkpoint and REDO LSN at pg_control in that case, we also need to update min recovery point at pg_control?Otherwise the min recovery point at pg_control still indicates the old LSN that previous restart point set. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Fri, 4 Feb 2022 10:59:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/02/03 15:50, Kyotaro Horiguchi wrote: > > On way to take. In that case should we log something like > > "Restartpoint canceled" or something? > > +1 > > > > By the way, restart point should start only while recoverying, and at > > the timeof the start both checkpoint.redo and checkpoint LSN are > > already past. We shouldn't update minRecovery point after promotion, > > but is there any reason for not updating the checkPoint and > > checkPointCopy? If we update them after promotion, the > > which-LSN-to-show problem would be gone. > > I tried to find the reason by reading the past discussion, but have > not found that yet. > > If we update checkpoint and REDO LSN at pg_control in that case, we > also need to update min recovery point at pg_control? Otherwise the > min recovery point at pg_control still indicates the old LSN that > previous restart point set. I had an assuption that the reason I think it shouldn't update minRecoveryPoint is that it has been or is going to be reset to invalid LSN by promotion and the checkpoint should refrain from touching it. -- Kyotaro Horiguchi NTT Open Source Software Center
At Mon, 07 Feb 2022 10:16:34 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 4 Feb 2022 10:59:04 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > On 2022/02/03 15:50, Kyotaro Horiguchi wrote: > > > By the way, restart point should start only while recoverying, and at > > > the timeof the start both checkpoint.redo and checkpoint LSN are > > > already past. We shouldn't update minRecovery point after promotion, > > > but is there any reason for not updating the checkPoint and > > > checkPointCopy? If we update them after promotion, the > > > which-LSN-to-show problem would be gone. > > > > I tried to find the reason by reading the past discussion, but have > > not found that yet. > > > > If we update checkpoint and REDO LSN at pg_control in that case, we > > also need to update min recovery point at pg_control? Otherwise the > > min recovery point at pg_control still indicates the old LSN that > > previous restart point set. > > I had an assuption that the reason I think it shouldn't update > minRecoveryPoint is that it has been or is going to be reset to > invalid LSN by promotion and the checkpoint should refrain from > touching it. Hmm.. It doesn't seem to be the case. If a server crashes just after promotion and before requesting post-promtion checkpoint, minRecoveryPoint stays at a valid LSN. (Promoted at 0/7000028) Database cluster state: in production Latest checkpoint location: 0/6000060 Latest checkpoint's REDO location: 0/6000028 Latest checkpoint's REDO WAL file: 000000010000000000000006 Minimum recovery ending location: 0/7000090 Min recovery ending loc's timeline: 2 minRecoveryPoint/TLI are ignored in any case where a server in in-production state is started. In other words, the values are useless. There's no clear or written reason for unrecording the last ongoing restartpoint after promotion. Before fast-promotion was introduced, we shouldn't get there after end-of-recovery checkpoint (but somehow reached sometimes?) but it is quite normal nowadays. Or to the contrary, we're expecting it to happen and it is regarded as a normal checkponit. So we should do there nowadays are as the follows. - If any later checkpoint/restartpoint has been established, just skip remaining task then return false. (!chkpt_was_latest) (I'm not sure this can happen, though.) - we update control file only when archive recovery is still ongoing. - Otherwise reset minRecoveryPoint then continue. Do you have any thoughts or opinions? regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 958220c495..ab8a4d9a1b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9658,6 +9658,7 @@ CreateRestartPoint(int flags) XLogRecPtr endptr; XLogSegNo _logSegNo; TimestampTz xtime; + bool chkpt_was_latest = false; /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); @@ -9752,44 +9753,69 @@ CreateRestartPoint(int flags) PriorRedoPtr = ControlFile->checkPointCopy.redo; /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Update pg_control, using current time if no later checkpoints have been + * performed. */ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->checkPointCopy.redo < lastCheckPoint.redo) { + chkpt_was_latest = true; ControlFile->checkPoint = lastCheckPointRecPtr; ControlFile->checkPointCopy = lastCheckPoint; /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened + * already while writing out dirty buffers, but not necessarily - + * e.g. because no buffers were dirtied. We do this because a + * non-exclusive base backup uses minRecoveryPoint to determine which + * WAL files must be included in the backup, and the file (or files) + * containing the checkpoint record must be included, at a + * minimum. Note that for an ordinary restart of recovery there's no + * value in having the minimum recovery point any earlier than this * anyway, because redo will begin just after the checkpoint record. */ - if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->minRecoveryPoint = lastCheckPointEndPtr; - ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; + if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + { + ControlFile->minRecoveryPoint = lastCheckPointEndPtr; + ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; - /* update local copy */ - minRecoveryPoint = ControlFile->minRecoveryPoint; - minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + /* update local copy */ + minRecoveryPoint = ControlFile->minRecoveryPoint; + minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + } + if (flags & CHECKPOINT_IS_SHUTDOWN) + ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; + } + else + { + /* + * We have exited from archive-recovery mode after starting. Crash + * recovery ever after should always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; } - if (flags & CHECKPOINT_IS_SHUTDOWN) - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; UpdateControlFile(); } LWLockRelease(ControlFileLock); + /* + * Skip all post-checkpoint work if others have done that with later + * checkpoints. + */ + if (!chkpt_was_latest) + { + ereport((log_checkpoints ? LOG : DEBUG2), + (errmsg("post-restartpoint cleanup is skpped at %X/%X, because later restartpoints have been already performed", + LSN_FORMAT_ARGS(lastCheckPoint.redo)))); + + /* this checkpoint has not bee established */ + return false; + } + /* * Update the average distance between checkpoints/restartpoints if the * prior checkpoint exists.
On 2022/02/07 12:02, Kyotaro Horiguchi wrote: > - If any later checkpoint/restartpoint has been established, just skip > remaining task then return false. (!chkpt_was_latest) > (I'm not sure this can happen, though.) > > - we update control file only when archive recovery is still ongoing. This comment seems to conflict with what your PoC patch does. Because with the patch, ControlFile->checkPoint and ControlFile->checkPointCopyseem to be updated even when ControlFile->state != DB_IN_ARCHIVE_RECOVERY. I agree with what your PoC patch does for now. When we're not in archive recovery state, checkpoint and REDO locations inpg_control should be updated but min recovery point should be reset to invalid one (which instruments that subsequent crashrecovery should replay all available WAL files). > - Otherwise reset minRecoveryPoint then continue. > > Do you have any thoughts or opinions? Regarding chkpt_was_latest, whether the state is DB_IN_ARCHIVE_RECOVERY or not, if checkpoint and redo locations in pg_controlare updated, IMO we don't need to skip the "remaining tasks". Since those locations are updated and subsequentcrash recovery will start from that redo location, for example, ISTM that we can safely delete old WAL files priorto the redo location as the "remaining tasks". Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Mon, 7 Feb 2022 13:51:31 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/02/07 12:02, Kyotaro Horiguchi wrote: > > - If any later checkpoint/restartpoint has been established, just skip > > remaining task then return false. (!chkpt_was_latest) > > (I'm not sure this can happen, though.) > > - we update control file only when archive recovery is still ongoing. > > This comment seems to conflict with what your PoC patch does. Because > with the patch, ControlFile->checkPoint and > ControlFile->checkPointCopy seem to be updated even when > ControlFile->state != DB_IN_ARCHIVE_RECOVERY. Ah, yeah, by "update" I meant that "move forward". Sorry for confusing wording. > I agree with what your PoC patch does for now. When we're not in > archive recovery state, checkpoint and REDO locations in pg_control > should be updated but min recovery point should be reset to invalid > one (which instruments that subsequent crash recovery should replay > all available WAL files). Yes. All buffers before the last recovery point's end have been flushed out so the recovery point is valid as a checkpoint. On the other hand minRecoveryPoint is no longer needed and actually is ignored at the next crash recovery. We can leave it alone but it is consistent that it is cleared. > > - Otherwise reset minRecoveryPoint then continue. > > Do you have any thoughts or opinions? > > Regarding chkpt_was_latest, whether the state is > DB_IN_ARCHIVE_RECOVERY or not, if checkpoint and redo locations in > pg_control are updated, IMO we don't need to skip the "remaining > tasks". Since those locations are updated and subsequent crash > recovery will start from that redo location, for example, ISTM that we > can safely delete old WAL files prior to the redo location as the > "remaining tasks". Thought? If I read you correctly, the PoC works that way. It updates pg_control if the restart point is latest then performs the remaining cleanup tasks in that case. Recovery state doesn't affect this process. I reexamined about the possibility of concurrent checkpoints. Both CreateCheckPoint and CreateRestartPoint are called from checkpointer loop, shutdown handler of checkpointer and standalone process. So I can't see a possibility of concurrent checkpoints. In the past we had a time when startup process called CreateCheckPoint directly in the crash recovery case where checkpoint is not running but since 7ff23c6d27 checkpoint is started before startup process starts. So I conclude that that cannot happen. So the attached takes away the path for the case where the restart point is overtaken by a concurrent checkpoint. Thus.. the attached removes the ambiguity of of the proposed patch about the LSNs in the restartpoint-ending log message. Thoughts? -- Kyotaro Horiguchi NTT Open Source Software Center From 43888869846b6de00fbddb215300a8ff774bbc04 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Tue, 8 Feb 2022 16:42:53 +0900 Subject: [PATCH v1] Get rid of unused path to handle concurrent checkpoints CreateRestartPoint considered the case a concurrent checkpoint is running. But after 7ff23c6d27 we no longer launch multiple checkpoints simultaneously. That code path, if it is passed, may leave unrecoverable database by removing WAL segments that are required by the last established restartpoint. --- src/backend/access/transam/xlog.c | 53 +++++++++++++++++++------------ 1 file changed, 32 insertions(+), 21 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 958220c495..01a345e8bd 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9752,29 +9752,30 @@ CreateRestartPoint(int flags) PriorRedoPtr = ControlFile->checkPointCopy.redo; /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Update pg_control, using current time. */ LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + + /* We mustn't have a concurrent checkpoint that advances checkpoint LSN */ + Assert(lastCheckPoint.redo > ControlFile->checkPointCopy.redo); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. + */ + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9786,8 +9787,18 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* + * We have exited from archive-recovery mode after this restartpoint + * started. Crash recovery ever after should always recover to the end + * of WAL. + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + } + UpdateControlFile(); LWLockRelease(ControlFileLock); /* -- 2.27.0
Mmm.. checkpoint and checkpointer are quite confusing in this context.. At Tue, 08 Feb 2022 16:58:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Mon, 7 Feb 2022 13:51:31 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > > > > On 2022/02/07 12:02, Kyotaro Horiguchi wrote: > > > - If any later checkpoint/restartpoint has been established, just skip > > > remaining task then return false. (!chkpt_was_latest) > > > (I'm not sure this can happen, though.) > > > - we update control file only when archive recovery is still ongoing. > > > > This comment seems to conflict with what your PoC patch does. Because > > with the patch, ControlFile->checkPoint and > > ControlFile->checkPointCopy seem to be updated even when > > ControlFile->state != DB_IN_ARCHIVE_RECOVERY. > > Ah, yeah, by "update" I meant that "move forward". Sorry for confusing > wording. Please ignore the "that". > > I agree with what your PoC patch does for now. When we're not in > > archive recovery state, checkpoint and REDO locations in pg_control > > should be updated but min recovery point should be reset to invalid > > one (which instruments that subsequent crash recovery should replay > > all available WAL files). > > Yes. All buffers before the last recovery point's end have been > flushed out so the recovery point is valid as a checkpoint. On the > other hand minRecoveryPoint is no longer needed and actually is > ignored at the next crash recovery. We can leave it alone but it is > consistent that it is cleared. > > > > - Otherwise reset minRecoveryPoint then continue. > > > Do you have any thoughts or opinions? > > > > Regarding chkpt_was_latest, whether the state is > > DB_IN_ARCHIVE_RECOVERY or not, if checkpoint and redo locations in > > pg_control are updated, IMO we don't need to skip the "remaining > > tasks". Since those locations are updated and subsequent crash > > recovery will start from that redo location, for example, ISTM that we > > can safely delete old WAL files prior to the redo location as the > > "remaining tasks". Thought? > > If I read you correctly, the PoC works that way. It updates pg_control > if the restart point is latest then performs the remaining cleanup > tasks in that case. Recovery state doesn't affect this process. > > I reexamined about the possibility of concurrent checkpoints. > > Both CreateCheckPoint and CreateRestartPoint are called from > checkpointer loop, shutdown handler of checkpointer and standalone > process. So I can't see a possibility of concurrent checkpoints. > > In the past we had a time when startup process called CreateCheckPoint - directly in the crash recovery case where checkpoint is not running - but since 7ff23c6d27 checkpoint is started before startup process + directly in the crash recovery case where checkpointer is not running + but since 7ff23c6d27 checkpointer is launched before startup process > starts. So I conclude that that cannot happen. > > So the attached takes away the path for the case where the restart > point is overtaken by a concurrent checkpoint. > > Thus.. the attached removes the ambiguity of of the proposed patch > about the LSNs in the restartpoint-ending log message. Thoughts? -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Feb 8, 2022 at 2:10 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > Thus.. the attached removes the ambiguity of of the proposed patch > > about the LSNs in the restartpoint-ending log message. > > Thoughts? Thanks for the patch. I have few comments on the v1-0001-Get-rid-of-unused-path-to-handle-concurrent-check.patch 1) Can we have this Assert right after "skipping restartpoint, already performed at %X/%X" error message block? Does it make any difference? My point is that if at all, we were to assert this, why can't we do it before CheckPointGuts? + /* We mustn't have a concurrent checkpoint that advances checkpoint LSN */ + Assert(lastCheckPoint.redo > ControlFile->checkPointCopy.redo); + 2) Related to the above Assert, do we really need an assertion or a FATAL error? 3) Let's be consistent with "crash recovery" - replace "archive-recovery" with "archive recovery"? + * We have exited from archive-recovery mode after this restartpoint + * started. Crash recovery ever after should always recover to the end 4) Isn't it enough to say "Crash recovery should always recover to the end of WAL."? + * started. Crash recovery ever after should always recover to the end 5) Is there a reliable test case covering this code? Please point me if the test case is shared upthread somewhere. 6) So, with this patch, the v8 patch-set posted at [1] doesn't need any changes IIUC. If that's the case, please feel free to post all the patches together such that they get tested in cfbot. [1] - https://www.postgresql.org/message-id/CALj2ACUtZhTb%3D2ENkF3BQ3wi137uaGi__qzvXC-qFYC0XwjALw%40mail.gmail.com Regards, Bharath Rupireddy.
Hi, Bharath. At Tue, 8 Feb 2022 14:33:01 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Tue, Feb 8, 2022 at 2:10 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > Thus.. the attached removes the ambiguity of of the proposed patch > > > about the LSNs in the restartpoint-ending log message. > > > > Thoughts? > > Thanks for the patch. I have few comments on the > v1-0001-Get-rid-of-unused-path-to-handle-concurrent-check.patch Thanks for checking this! > 1) Can we have this Assert right after "skipping restartpoint, already > performed at %X/%X" error message block? Does it make any difference? > My point is that if at all, we were to assert this, why can't we do it > before CheckPointGuts? > + /* We mustn't have a concurrent checkpoint that advances checkpoint LSN */ > + Assert(lastCheckPoint.redo > ControlFile->checkPointCopy.redo); No. The assertion checks if something wrong has happend while CheckPointGuts - which may take a long time - is running. If we need to do that, it should be after CheckPointGuts. (However, I removed it finally. See below.) > 2) Related to the above Assert, do we really need an assertion or a FATAL error? It's just to make sure in case where that happens by any chance in future. But on second thought, as I mentioned, CreateRestartPoint is called from standalone process or from checkpointer. I added that assertion at the beginning of CreateRestartPoint. I think that assertion is logically equal to the old assertion. I remember that I felt uncomfortable with the lock-less behavior on ControlFile, which makes the code a bit complex to read. So I moved "PriorRedoPtr = " line to within the lock section just below. Addition to that, I feel being confused by the parallel-use of lastCheckPoint.redo and RedoRecPtr So I replaced lastCheckPoint.redo after assiging the former to the latter with RedoRecPtr. > 3) Let's be consistent with "crash recovery" - replace > "archive-recovery" with "archive recovery"? > + * We have exited from archive-recovery mode after this restartpoint > + * started. Crash recovery ever after should always recover to the end That's sensible. I found several existing use of archive-recovery in xlog.c and a few other files but the fix for them is separated as another patch (0005). > 4) Isn't it enough to say "Crash recovery should always recover to the > end of WAL."? > + * started. Crash recovery ever after should always recover to the end I think we need to explicitly say something like "we have exited archive recovery while *this* restartpoint is running". I simplified the sentence as the follows. + * Archive recovery have ended. Crash recovery ever after should + * always recover to the end of WAL. > 5) Is there a reliable test case covering this code? Please point me > if the test case is shared upthread somewhere. Nothing. Looking from the opposite side, the existing code for the competing restartpoint/checkpoint case should have not been exercised at least for these several major versions. Instead, I added an assertion at the beginning of CreateRestartPoint that asserts that "this should be called only under standalone process or from checkpointer.". If that assert doesn't fire while the whole test, it should be the proof of the premise for this patch is correct. > 6) So, with this patch, the v8 patch-set posted at [1] doesn't need > any changes IIUC. If that's the case, please feel free to post all the > patches together such that they get tested in cfbot. The two are different fixes so I don't think they are ought to be merged together. > [1] - https://www.postgresql.org/message-id/CALj2ACUtZhTb%3D2ENkF3BQ3wi137uaGi__qzvXC-qFYC0XwjALw%40mail.gmail.com As the old 0001, though I said it'fine :p) I added a comment that reading ControlFile->checkPoint* is safe here. The old 0002 (attached 0003) looks file to me. The old 0003 (attached 0004): +++ b/src/backend/access/rmgrdesc/xlogdesc.c - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " It is shown in the context of a checkpoint record, so I think it is not needed or rather lengthning the dump line uselessly. +++ b/src/backend/access/transam/xlog.c - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", +++ b/src/backend/replication/walsender.c - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", "WAL" is upper-cased. So it seems rather strange that the "lsn" is lower-cased. In the first place the message doesn't look like a user-facing error message and I feel we don't need position or lsn there.. +++ b/src/bin/pg_rewind/pg_rewind.c - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", I feel that we don't need "WAL" there. +++ b/src/bin/pg_waldump/pg_waldump.c - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); Mmm.. "WAL LSN RECPTR" looks strange to me. In the first place I don't think "RECPTR" is a user-facing term. Doesn't something like the follows work? + printf(_(" -e, --end=WAL-LSN stop reading at WAL-LSN\n")); In some changes in this patch shorten the main message text of fprintf-ish functions. That makes the succeeding parameters can be inlined. 0001: The fix of CreateRestartPoint 0002: Add LSNs to checkpoint logs (the main patch) 0003: Change "location" to "LSN" of pg_controldata 0004: The same as 0003 for other uses of "location". 0005: Unhyphnate "-recovery" terms. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 9359549b948e6d3a61552985127a86a0a37e83d7 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Tue, 8 Feb 2022 16:42:53 +0900 Subject: [PATCH v9 1/4] Get rid of unused path to handle concurrent checkpoints CreateRestartPoint considered the case a concurrent checkpoint is running. But 7ff23c6d27 eliminates the possibility that multiple checkpoints run simultaneously. That code path, if it were passed, might leave unrecoverable database by removing WAL segments that are required by the last established restartpoint. In passing, the code in the function gets tightened-up and tidied-up in some points so that it gets easier to read. --- src/backend/access/transam/xlog.c | 64 ++++++++++++++++----------- src/backend/postmaster/checkpointer.c | 1 - 2 files changed, 38 insertions(+), 27 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 958220c495..a6c6e67f80 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9659,6 +9659,9 @@ CreateRestartPoint(int flags) XLogSegNo _logSegNo; TimestampTz xtime; + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -9724,7 +9727,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9743,7 +9746,12 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* + * Update pg_control, using current time. + */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9751,30 +9759,23 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9786,15 +9787,26 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* + * Archive recovery have ended. Crash recovery ever after should always + * recover to the end of WAL. + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + } + UpdateControlFile(); LWLockRelease(ControlFileLock); /* * Update the average distance between checkpoints/restartpoints if the * prior checkpoint exists. */ - if (PriorRedoPtr != InvalidXLogRecPtr) + + /* the second term is just in case */ + if (PriorRedoPtr != InvalidXLogRecPtr || RedoRecPtr > PriorRedoPtr) UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); /* @@ -9864,7 +9876,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index 23f691cd47..8bd4c47b6b 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -386,7 +386,6 @@ CheckpointerMain(void) /* Check if we should perform a checkpoint or a restartpoint. */ do_restartpoint = RecoveryInProgress(); - /* * Atomically fetch the request flags to figure out what kind of a * checkpoint we should perform, and increase the started-counter -- 2.27.0 From 0a8f18e4c13d67d246e39781fadc1717b4f8dcf4 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 04:34:54 +0000 Subject: [PATCH v9 2/4] Add checkpoint and redo LSN to LogCheckpointEnd log message It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. --- src/backend/access/transam/xlog.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index a6c6e67f80..60da34433b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8923,7 +8923,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -8936,14 +8937,18 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -8956,7 +8961,10 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0 From 43d2127235f6d7f58f1daeca9f78afb86bd266f3 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v9 3/4] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 86fd6f5546..7ab4d4429d 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From bb033ba1747a92865afebeff2d48c2ba60c6ab05 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 12:36:34 +0000 Subject: [PATCH v9 4/4] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 12 ++++++------ src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 12 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 60da34433b..acc5ccf174 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1858,7 +1858,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -5962,7 +5962,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -6125,7 +6125,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -6715,7 +6715,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { @@ -6843,7 +6843,7 @@ StartupXLOG(void) recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -6926,7 +6926,7 @@ StartupXLOG(void) if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index d992bcc2e3..f1a435bc22 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -148,7 +148,7 @@ MANIFEST_CHECKSUMS { return K_MANIFEST_CHECKSUMS; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 655760fee3..0c9d86b72c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -812,7 +812,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index c40925c1f0..99b41fc2a6 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -473,7 +473,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -612,7 +612,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2220,7 +2220,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index ccb215c398..e8c1631745 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..7e4a1c2c30 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -343,7 +343,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index a6251e1a96..f792e6351d 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -765,7 +765,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -774,7 +774,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -883,7 +883,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -935,7 +935,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1026,7 +1026,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1068,7 +1068,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1080,7 +1080,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 7024dbe10a..c302544486 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6299,24 +6299,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6332,11 +6332,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11509,7 +11509,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0 From 7af68e6bbebc52226fbb954e9cff78fca4253097 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Wed, 9 Feb 2022 11:33:47 +0900 Subject: [PATCH v9 5/5] Get rid of uses of some hyphenated words There are use of both "archive-recovery" and "archive recovery" in the tree. Unify the similar uses to unhyphenated. The use of "point-in-time-recovery" is left alone as it is used as the explanation for the acronym "PITR". --- doc/src/sgml/ref/pg_ctl-ref.sgml | 2 +- doc/src/sgml/ref/postgres-ref.sgml | 2 +- src/backend/access/transam/xlog.c | 2 +- src/backend/commands/trigger.c | 2 +- src/backend/utils/cache/relcache.c | 2 +- src/test/regress/parallel_schedule | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml index 3946fa52ea..7cbe5c3048 100644 --- a/doc/src/sgml/ref/pg_ctl-ref.sgml +++ b/doc/src/sgml/ref/pg_ctl-ref.sgml @@ -194,7 +194,7 @@ PostgreSQL documentation rolled back and clients are forcibly disconnected, then the server is shut down. <quote>Immediate</quote> mode will abort all server processes immediately, without a clean shutdown. This choice - will lead to a crash-recovery cycle during the next server start. + will lead to a crash recovery cycle during the next server start. </para> <para> diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml index 55a3f6c69d..fc22fbc4fe 100644 --- a/doc/src/sgml/ref/postgres-ref.sgml +++ b/doc/src/sgml/ref/postgres-ref.sgml @@ -719,7 +719,7 @@ PostgreSQL documentation is also unwise to send <literal>SIGKILL</literal> to a server process — the main <command>postgres</command> process will interpret this as a crash and will force all the sibling processes - to quit as part of its standard crash-recovery procedure. + to quit as part of its standard crash recovery procedure. </para> </refsect1> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index acc5ccf174..2d5ff5d751 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5691,7 +5691,7 @@ validateRecoveryParameters(void) } /* - * Exit archive-recovery state + * Exit archive recovery state */ static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog, TimeLineID newTLI) diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index 1a9c1ac290..d454aa9015 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -1887,7 +1887,7 @@ RelationBuildTriggers(Relation relation) /* * Note: since we scan the triggers using TriggerRelidNameIndexId, we will * be reading the triggers in name order, except possibly during - * emergency-recovery operations (ie, IgnoreSystemIndexes). This in turn + * emergency recovery operations (ie, IgnoreSystemIndexes). This in turn * ensures that triggers will be fired in name order. */ ScanKeyInit(&skey, diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 2707fed12f..c14dd91825 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -6560,7 +6560,7 @@ RelationCacheInitFilePostInvalidate(void) * Remove the init files during postmaster startup. * * We used to keep the init files across restarts, but that is unsafe in PITR - * scenarios, and even in simple crash-recovery cases there are windows for + * scenarios, and even in simple crash recovery cases there are windows for * the init files to become out-of-sync with the database. So now we just * remove them during startup and expect the first backend launch to rebuild * them. Of course, this has to happen in each database of the cluster. diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 6d8f524ae9..8251744bbf 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -13,7 +13,7 @@ test: test_setup # run tablespace by itself, and early, because it forces a checkpoint; # we'd prefer not to have checkpoints later in the tests because that -# interferes with crash-recovery testing. +# interferes with crash recovery testing. test: tablespace # ---------- -- 2.27.0
On 2022/02/09 11:52, Kyotaro Horiguchi wrote: > 0001: The fix of CreateRestartPoint This patch might be ok for the master branch. But since concurrent checkpoint and restartpoint can happen in v14 or before,we would need another patch based on that assumption, for the backport. How about fixing the bug all the branchesat first, then apply this patch in the master to improve the code? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Fri, 11 Feb 2022 01:00:03 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/02/09 11:52, Kyotaro Horiguchi wrote: > > 0001: The fix of CreateRestartPoint > > This patch might be ok for the master branch. But since concurrent > checkpoint and restartpoint can happen in v14 or before, we would need > another patch based on that assumption, for the backport. How about > fixing the bug all the branches at first, then apply this patch in the > master to improve the code? For backbranches, the attached for pg14 does part of the full patch. Of the following, I think we should do (a) and (b) to make future backpatchings easier. a) Use RedoRecPtr and PriorRedoPtr after they are assigned. b) Move assignment to PriorRedoPtr into the ControlFileLock section. c) Skip udpate of minRecoveryPoint only when the checkpoint gets old. d) Skip call to UpdateCheckPointDistanceEstimate() when RedoRecPtr <= PriorRedoPtr. # Mmm. The v9-0001 contains a silly mistake here.. Still I'm not sure whether that case really happens and how checkpoint behaves *after* that happenes, but at least it protects database from the possible unrecoverable state due to the known issue here.. It doesn't apply even on pg13 (due to LSN_FORMAT_ARGS). I will make the per-version patches if you are fine with this. regards. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From df516b487617c570332ea076b91773b810629a11 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Mon, 14 Feb 2022 13:04:33 +0900 Subject: [PATCH v2] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Since all buffers have flushed out, we may safely regard that restart point established regardless of the recovery state. Thus we may and should update checkpoint LSNs of checkpoint file in that case. Still we update minRecoveryPoint only during archive recovery but explicitly clear if we have exited recovery. Addition to that fix, this commit makes some cosmetic changes that consist with the changes we are going to make on the master branch. --- src/backend/access/transam/xlog.c | 81 ++++++++++++++++++++----------- 1 file changed, 53 insertions(+), 28 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 6208e123e5..28c3c4b7cf 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9653,7 +9653,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9672,7 +9672,10 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9681,52 +9684,74 @@ CreateRestartPoint(int flags) PriorRedoPtr = ControlFile->checkPointCopy.redo; /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Update pg_control, using current time if no later checkpoints have been + * performed. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (PriorRedoPtr < RedoRecPtr) { ControlFile->checkPoint = lastCheckPointRecPtr; ControlFile->checkPointCopy = lastCheckPoint; ControlFile->time = (pg_time_t) time(NULL); /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened + * already while writing out dirty buffers, but not necessarily - + * e.g. because no buffers were dirtied. We do this because a + * non-exclusive base backup uses minRecoveryPoint to determine which + * WAL files must be included in the backup, and the file (or files) + * containing the checkpoint record must be included, at a + * minimum. Note that for an ordinary restart of recovery there's no + * value in having the minimum recovery point any earlier than this * anyway, because redo will begin just after the checkpoint record. + * This is a quick hack to make sure nothing really bad happens if + * somehow we get here after the end-of-recovery checkpoint. */ - if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->minRecoveryPoint = lastCheckPointEndPtr; - ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; + if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) + { + ControlFile->minRecoveryPoint = lastCheckPointEndPtr; + ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID; - /* update local copy */ - minRecoveryPoint = ControlFile->minRecoveryPoint; - minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + /* update local copy */ + minRecoveryPoint = ControlFile->minRecoveryPoint; + minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; + } + if (flags & CHECKPOINT_IS_SHUTDOWN) + ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; } - if (flags & CHECKPOINT_IS_SHUTDOWN) - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; + else + { + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + } + UpdateControlFile(); } LWLockRelease(ControlFileLock); /* * Update the average distance between checkpoints/restartpoints if the - * prior checkpoint exists. + * prior checkpoint exists and we have advanced REDO LSN. */ - if (PriorRedoPtr != InvalidXLogRecPtr) + if (PriorRedoPtr != InvalidXLogRecPtr && RedoRecPtr > PriorRedoPtr) UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); + /* + * Now we can safely regard this restart point as established. + * + * Since all buffers have been flushed so we regard this restart point as + * established. We could omit some of the following steps in the case + * where we have omitted control file updates, but we don't bother avoid + * them from performing since that case rarely happens and they don't harm + * even if they take place. + */ + /* * Delete old log files, those no longer needed for last restartpoint to * prevent the disk holding the xlog from growing full. @@ -9804,7 +9829,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0
At Mon, 14 Feb 2022 14:40:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > It doesn't apply even on pg13 (due to LSN_FORMAT_ARGS). I will make > the per-version patches if you are fine with this. Oops! I forgot to rename the patch to avoid confusion on CF-bots. I'll resend new version soon to avoid the confusion.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Mon, 14 Feb 2022 14:42:18 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > I'll resend new version soon to avoid the confusion.. In this version , 0001 gets one fix and two comment updates. - * Archive recovery have ended. Crash recovery ever after should always + * Archive recovery has ended. Crash recovery ever after should always - /* the second term is just in case */ - if (PriorRedoPtr != InvalidXLogRecPtr || RedoRecPtr > PriorRedoPtr) + /* + * Update the average distance between checkpoints/restartpoints if the + * prior checkpoint exists. The second term is just in case. + */ + if (PriorRedoPtr != InvalidXLogRecPtr && RedoRecPtr > PriorRedoPtr) UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 4f96baaece1cff35f89893d8e7aa1fdf57435f53 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Tue, 8 Feb 2022 16:42:53 +0900 Subject: [PATCH v10 1/5] Get rid of unused path to handle concurrent checkpoints CreateRestartPoint considered the case a concurrent checkpoint is running. But 7ff23c6d27 eliminates the possibility that multiple checkpoints run simultaneously. That code path, if it were passed, might leave unrecoverable database by removing WAL segments that are required by the last established restartpoint. In passing, the code in the function gets tightened-up and tidied-up in some points so that it gets easier to read. --- src/backend/access/transam/xlog.c | 67 ++++++++++++++++----------- src/backend/postmaster/checkpointer.c | 1 - 2 files changed, 41 insertions(+), 27 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 958220c495..8c2882b49f 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9659,6 +9659,9 @@ CreateRestartPoint(int flags) XLogSegNo _logSegNo; TimestampTz xtime; + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -9724,7 +9727,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9743,7 +9746,12 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* + * Update pg_control, using current time. + */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9751,30 +9759,23 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9786,15 +9787,29 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* + * Archive recovery has ended. Crash recovery ever after should always + * recover to the end of WAL. + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + } + UpdateControlFile(); LWLockRelease(ControlFileLock); /* * Update the average distance between checkpoints/restartpoints if the * prior checkpoint exists. */ - if (PriorRedoPtr != InvalidXLogRecPtr) + + /* + * Update the average distance between checkpoints/restartpoints if the + * prior checkpoint exists. The second term is just in case. + */ + if (PriorRedoPtr != InvalidXLogRecPtr && RedoRecPtr > PriorRedoPtr) UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); /* @@ -9864,7 +9879,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index 23f691cd47..8bd4c47b6b 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -386,7 +386,6 @@ CheckpointerMain(void) /* Check if we should perform a checkpoint or a restartpoint. */ do_restartpoint = RecoveryInProgress(); - /* * Atomically fetch the request flags to figure out what kind of a * checkpoint we should perform, and increase the started-counter -- 2.27.0 From 5a78c6a41996d0bc5c2fb21724caf789a0a83d46 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 04:34:54 +0000 Subject: [PATCH v10 2/5] Add checkpoint and redo LSN to LogCheckpointEnd log message It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. --- src/backend/access/transam/xlog.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 8c2882b49f..ba1b96bd1d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8923,7 +8923,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -8936,14 +8937,18 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -8956,7 +8961,10 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0 From a3b441ee3b722fb3019c628cdb238c54cfb961c0 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v10 3/5] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 5e0655c2a9..4ee20309cd 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From 42da33c3f25520af957dd62a7a8680ad9a7da43c Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 12:36:34 +0000 Subject: [PATCH v10 4/5] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 12 ++++++------ src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 12 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ba1b96bd1d..153b6eb94d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1858,7 +1858,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -5962,7 +5962,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -6125,7 +6125,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -6715,7 +6715,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { @@ -6843,7 +6843,7 @@ StartupXLOG(void) recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -6926,7 +6926,7 @@ StartupXLOG(void) if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 4b64c0d768..a0f1c44cc7 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -138,7 +138,7 @@ WAIT { return K_WAIT; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 655760fee3..0c9d86b72c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -812,7 +812,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 0003b59615..ce02ec4d02 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -473,7 +473,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -617,7 +617,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2260,7 +2260,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index ccb215c398..e8c1631745 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index efb82a4034..7e4a1c2c30 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -343,7 +343,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index a6251e1a96..f792e6351d 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -765,7 +765,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -774,7 +774,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -883,7 +883,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -935,7 +935,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1026,7 +1026,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1068,7 +1068,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1080,7 +1080,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 62f36daa98..4926533e95 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6305,24 +6305,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6338,11 +6338,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11521,7 +11521,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0 From 1efc5e05204c874529d0d9c74b6978c9f7ca14fd Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Wed, 9 Feb 2022 11:33:47 +0900 Subject: [PATCH v10 5/5] Get rid of uses of some hyphenated words There are use of both "archive-recovery" and "archive recovery" in the tree. Unify the similar uses to unhyphenated. The use of "point-in-time-recovery" is left alone as it is used as the explanation for the acronym "PITR". --- doc/src/sgml/ref/pg_ctl-ref.sgml | 2 +- doc/src/sgml/ref/postgres-ref.sgml | 2 +- src/backend/access/transam/xlog.c | 2 +- src/backend/commands/trigger.c | 2 +- src/backend/utils/cache/relcache.c | 2 +- src/test/regress/parallel_schedule | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml index 3946fa52ea..7cbe5c3048 100644 --- a/doc/src/sgml/ref/pg_ctl-ref.sgml +++ b/doc/src/sgml/ref/pg_ctl-ref.sgml @@ -194,7 +194,7 @@ PostgreSQL documentation rolled back and clients are forcibly disconnected, then the server is shut down. <quote>Immediate</quote> mode will abort all server processes immediately, without a clean shutdown. This choice - will lead to a crash-recovery cycle during the next server start. + will lead to a crash recovery cycle during the next server start. </para> <para> diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml index 55a3f6c69d..fc22fbc4fe 100644 --- a/doc/src/sgml/ref/postgres-ref.sgml +++ b/doc/src/sgml/ref/postgres-ref.sgml @@ -719,7 +719,7 @@ PostgreSQL documentation is also unwise to send <literal>SIGKILL</literal> to a server process — the main <command>postgres</command> process will interpret this as a crash and will force all the sibling processes - to quit as part of its standard crash-recovery procedure. + to quit as part of its standard crash recovery procedure. </para> </refsect1> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 153b6eb94d..da2dd038fb 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5691,7 +5691,7 @@ validateRecoveryParameters(void) } /* - * Exit archive-recovery state + * Exit archive recovery state */ static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog, TimeLineID newTLI) diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index 1a9c1ac290..d454aa9015 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -1887,7 +1887,7 @@ RelationBuildTriggers(Relation relation) /* * Note: since we scan the triggers using TriggerRelidNameIndexId, we will * be reading the triggers in name order, except possibly during - * emergency-recovery operations (ie, IgnoreSystemIndexes). This in turn + * emergency recovery operations (ie, IgnoreSystemIndexes). This in turn * ensures that triggers will be fired in name order. */ ScanKeyInit(&skey, diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 2707fed12f..c14dd91825 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -6560,7 +6560,7 @@ RelationCacheInitFilePostInvalidate(void) * Remove the init files during postmaster startup. * * We used to keep the init files across restarts, but that is unsafe in PITR - * scenarios, and even in simple crash-recovery cases there are windows for + * scenarios, and even in simple crash recovery cases there are windows for * the init files to become out-of-sync with the database. So now we just * remove them during startup and expect the first backend launch to rebuild * them. Of course, this has to happen in each database of the cluster. diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 6d8f524ae9..8251744bbf 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -13,7 +13,7 @@ test: test_setup # run tablespace by itself, and early, because it forces a checkpoint; # we'd prefer not to have checkpoints later in the tests because that -# interferes with crash-recovery testing. +# interferes with crash recovery testing. test: tablespace # ---------- -- 2.27.0
On 2022/02/14 14:40, Kyotaro Horiguchi wrote: > For backbranches, the attached for pg14 does part of the full patch. Thanks for updating the patch! > Of the following, I think we should do (a) and (b) to make future > backpatchings easier. > > a) Use RedoRecPtr and PriorRedoPtr after they are assigned. > > b) Move assignment to PriorRedoPtr into the ControlFileLock section. I failed to understand how (a) and (b) can make the backpatching easier. How easy to backpatch seems the same whether weapply (a) and (b) or not... > c) Skip udpate of minRecoveryPoint only when the checkpoint gets old. Yes. > d) Skip call to UpdateCheckPointDistanceEstimate() when RedoRecPtr <= > PriorRedoPtr. But "RedoRecPtr <= PriorRedoPtr" will never happen, will it? Because a restartpoint is skipped at the beginning of CreateRestartPoint()in that case. If this understanding is right, the check of "RedoRecPtr <= PriorRedoPtr" is not necessarybefore calling UpdateCheckPointDistanceEstimate(). + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; Don't we need to update LocalMinRecoveryPoint and LocalMinRecoveryPointTLI after this? Maybe it's not necessary, but ISTMthat it's safer and better to always update them whether the state is DB_IN_ARCHIVE_RECOVERY or not. if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; Same as above. IMO it's safer and better to always update the state (whether the state is DB_IN_ARCHIVE_RECOVERY or not)if CHECKPOINT_IS_SHUTDOWN flag is passed. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Tue, 22 Feb 2022 01:59:45 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > Of the following, I think we should do (a) and (b) to make future > > backpatchings easier. > > a) Use RedoRecPtr and PriorRedoPtr after they are assigned. > > b) Move assignment to PriorRedoPtr into the ControlFileLock section. > > I failed to understand how (a) and (b) can make the backpatching > easier. How easy to backpatch seems the same whether we apply (a) and > (b) or not... That premises that the patch applied to master contains (a) and (b). So if it doesn't, those are not need by older branches. > > c) Skip udpate of minRecoveryPoint only when the checkpoint gets old. > > Yes. > > > > d) Skip call to UpdateCheckPointDistanceEstimate() when RedoRecPtr <= > > PriorRedoPtr. > > But "RedoRecPtr <= PriorRedoPtr" will never happen, will it? Because a I didn't believe that it happens. (So, it came from my convervativeness, or laziness, or both:p) The code dates from 2009 and StartupXLOG makes a concurrent checkpoint with bgwriter. But as of at least 9.5, StartupXLOG doesn't directly call CreateCheckPoint. So I think that won't happen. So, in short, I agree to remove it or turn it into Assert(). > restartpoint is skipped at the beginning of CreateRestartPoint() in > that case. If this understanding is right, the check of "RedoRecPtr <= > PriorRedoPtr" is not necessary before calling > UpdateCheckPointDistanceEstimate(). > > > + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; > + ControlFile->minRecoveryPointTLI = 0; > > Don't we need to update LocalMinRecoveryPoint and > LocalMinRecoveryPointTLI after this? Maybe it's not necessary, but > ISTM that it's safer and better to always update them whether the > state is DB_IN_ARCHIVE_RECOVERY or not. Agree that it's safer and tidy. > if (flags & CHECKPOINT_IS_SHUTDOWN) > ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; > > Same as above. IMO it's safer and better to always update the state > (whether the state is DB_IN_ARCHIVE_RECOVERY or not) if > CHECKPOINT_IS_SHUTDOWN flag is passed. That means we may exit recovery mode after ShutdownXLOG called CreateRestartPoint. I don't think that may happen. So I'd rather add Assert ((flags&CHECKPOINT_IS_SHUTDOWN) == 0) there instaed. I'll post the new version later. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Tue, 22 Feb 2022 17:44:01 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Tue, 22 Feb 2022 01:59:45 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > > > Of the following, I think we should do (a) and (b) to make future > > > backpatchings easier. > > > a) Use RedoRecPtr and PriorRedoPtr after they are assigned. > > > b) Move assignment to PriorRedoPtr into the ControlFileLock section. > > > > I failed to understand how (a) and (b) can make the backpatching > > easier. How easy to backpatch seems the same whether we apply (a) and > > (b) or not... > > That premises that the patch applied to master contains (a) and (b). > So if it doesn't, those are not need by older branches. I was once going to remove them. But according the discussion below, the patch for back-patching is now quite close to that for the master branch. So I left them alone. > > > d) Skip call to UpdateCheckPointDistanceEstimate() when RedoRecPtr <= > > > PriorRedoPtr. > > > > But "RedoRecPtr <= PriorRedoPtr" will never happen, will it? Because a > > I didn't believe that it happens. (So, it came from my > convervativeness, or laziness, or both:p) The code dates from 2009 and > StartupXLOG makes a concurrent checkpoint with bgwriter. But as of at > least 9.5, StartupXLOG doesn't directly call CreateCheckPoint. So I > think that won't happen. > > So, in short, I agree to remove it or turn it into Assert(). It was a bit out of point. If we assume RedoRecPtr is always larger than PriorRedoPtr and then we don't need to check that there, we should also remove the "if (PriorRedoPtr < RedoRecPtr)" branch just above, which means the patch for back-branches gets very close to that for the master. Do we make such a large change on back branches? Anyways this version once takes that way. > > if (flags & CHECKPOINT_IS_SHUTDOWN) > > ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; > > > > Same as above. IMO it's safer and better to always update the state > > (whether the state is DB_IN_ARCHIVE_RECOVERY or not) if > > CHECKPOINT_IS_SHUTDOWN flag is passed. > > That means we may exit recovery mode after ShutdownXLOG called > CreateRestartPoint. I don't think that may happen. So I'd rather add > Assert ((flags&CHECKPOINT_IS_SHUTDOWN) == 0) there instaed. So this version for v14 gets updated in the following points. Completely removed the code path for the case some other process runs simultaneous checkpoint. Removed the condition (RedoRecPtr > PriorRedoPtr) for UpdateCheckPointDistanceEstimate() call. Added an assertion to the recoery-end path. # Honestly I feel this is a bit too much for back-patching, though. While making patches for v12, I see a test failure of pg_rewind for uncertain reason. I'm investigating that but I post this for discussion. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From e983f3d4c2dbeea742aed0ef1e209e7821f6687f Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Mon, 14 Feb 2022 13:04:33 +0900 Subject: [PATCH v2] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 73 ++++++++++++++++++++----------- 1 file changed, 47 insertions(+), 26 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 6208e123e5..ff4a90eacc 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9587,6 +9587,9 @@ CreateRestartPoint(int flags) XLogSegNo _logSegNo; TimestampTz xtime; + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -9653,7 +9656,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9672,7 +9675,10 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9680,31 +9686,29 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. This is a quick hack to make sure nothing really bad + * happens if somehow we get here after the end-of-recovery checkpoint. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - ControlFile->time = (pg_time_t) time(NULL); - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9716,8 +9720,25 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + minRecoveryPoint = InvalidXLogRecPtr; + minRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -9804,7 +9825,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0 From 13329169b996509a3a853afb9c283c3b27e0eab7 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 25 Feb 2022 14:46:41 +0900 Subject: [PATCH v2] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 73 ++++++++++++++++++++----------- 1 file changed, 47 insertions(+), 26 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3d76fad128..3670ff81e7 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9376,6 +9376,9 @@ CreateRestartPoint(int flags) */ LWLockAcquire(CheckpointLock, LW_EXCLUSIVE); + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -9445,7 +9448,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9461,7 +9464,10 @@ CreateRestartPoint(int flags) if (log_checkpoints) LogCheckpointStart(flags, true); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9469,31 +9475,29 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. This is a quick hack to make sure nothing really bad + * happens if somehow we get here after the end-of-recovery checkpoint. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - ControlFile->time = (pg_time_t) time(NULL); - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9505,8 +9509,25 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + minRecoveryPoint = InvalidXLogRecPtr; + minRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -9590,7 +9611,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - (uint32) (lastCheckPoint.redo >> 32), (uint32) lastCheckPoint.redo), + (uint32) (RedoRecPtr >> 32), (uint32) RedoRecPtr), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0
At Fri, 25 Feb 2022 15:31:12 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > While making patches for v12, I see a test failure of pg_rewind for > uncertain reason. I'm investigating that but I post this for > discussion. Hmm. Too stupid. Somehow I overly removed the latchet condition for minRecoveryPoint. So the same patch worked for v12. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 25 Feb 2022 16:06:31 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 25 Feb 2022 15:31:12 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > While making patches for v12, I see a test failure of pg_rewind for > > uncertain reason. I'm investigating that but I post this for > > discussion. > > Hmm. Too stupid. Somehow I overly removed the latchet condition for > minRecoveryPoint. So the same patch worked for v12. So, this is the patches for pg12-10. 11 can share the same patch with 12. 10 has differences in two points. 10 has ControlFile->prevCheckPoint. The DETAILS of the "recovery restart point at" message is not capitalized. But I suppose it is so close to EOL so that we don't want to "fix" it risking existing usecases. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From c89e2b509723b68897f2af49a154af2a69f0747b Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 25 Feb 2022 15:04:00 +0900 Subject: [PATCH v3] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 71 +++++++++++++++++++------------ 1 file changed, 44 insertions(+), 27 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 885558f291..2b2568c475 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9334,7 +9334,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9350,7 +9350,10 @@ CreateRestartPoint(int flags) if (log_checkpoints) LogCheckpointStart(flags, true); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -9358,31 +9361,28 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); + /* - * Update pg_control, using current time. Check that it still shows - * IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - ControlFile->time = (pg_time_t) time(NULL); - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9393,9 +9393,26 @@ CreateRestartPoint(int flags) minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; } if (flags & CHECKPOINT_IS_SHUTDOWN) - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); + ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + minRecoveryPoint = InvalidXLogRecPtr; + minRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -9470,7 +9487,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - (uint32) (lastCheckPoint.redo >> 32), (uint32) lastCheckPoint.redo), + (uint32) (RedoRecPtr >> 32), (uint32) RedoRecPtr), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0 From 7dd174d165b3639b573bfc47c2e8b2fba61395c5 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 25 Feb 2022 16:35:16 +0900 Subject: [PATCH v3] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 73 +++++++++++++++++++------------ 1 file changed, 45 insertions(+), 28 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index c64febdb53..9fb66ad7d5 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -9434,7 +9434,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -9450,7 +9450,10 @@ CreateRestartPoint(int flags) if (log_checkpoints) LogCheckpointStart(flags, true); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo pointer, used later to determine @@ -9458,32 +9461,29 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->prevCheckPoint = ControlFile->checkPoint; + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); + /* - * Update pg_control, using current time. Check that it still shows - * IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still running. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->prevCheckPoint = ControlFile->checkPoint; - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - ControlFile->time = (pg_time_t) time(NULL); - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -9494,9 +9494,26 @@ CreateRestartPoint(int flags) minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; } if (flags & CHECKPOINT_IS_SHUTDOWN) - ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); + ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + minRecoveryPoint = InvalidXLogRecPtr; + minRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -9579,7 +9596,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - (uint32) (lastCheckPoint.redo >> 32), (uint32) lastCheckPoint.redo), + (uint32) (RedoRecPtr >> 32), (uint32) RedoRecPtr), xtime ? errdetail("last completed transaction was at log time %s", timestamptz_to_str(xtime)) : 0)); -- 2.27.0
At Fri, 25 Feb 2022 16:47:01 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > So, this is the patches for pg12-10. 11 can share the same patch with > 12. 10 has differences in two points. > > 10 has ControlFile->prevCheckPoint. > > The DETAILS of the "recovery restart point at" message is not > capitalized. But I suppose it is so close to EOL so that we don't > want to "fix" it risking existing usecases. Ugh! Wait for a moment. Something's wrong. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 25 Feb 2022 16:52:30 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > Ugh! Wait for a moment. Something's wrong. Sorry, what is wrong was my working directory. It was broken by my bogus operation. All the files apply corresponding versions correctly. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Mon, 14 Feb 2022 14:52:15 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > In this version , 0001 gets one fix and two comment updates. While the disucssion on back-patching of 0001 proceeding on another branch, the main patch iself gets looks like as if rotten on CF-App. So I rebased v10 on the current master. 0001 is replaced by an adjusted patch based on the latest "control file update fix" patch. If someone wants to voice on the message-fix patches (0002-0004), be our guest. 0005 also wants opinions. 0001: Fix possible incorrect controlfile update that leads to unrecoverable database. 0002: Add REDO/Checkpiont LSNs to checkpoinkt-end log message. (The main patch in this thread) 0003: Replace (WAL-)location to LSN in pg_controldata. 0004: Replace (WAL-)location to LSN in user-facing texts. (This doesn't reflect my recent comments.) 0005: Unhyphenate the word archive-recovery and similars. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 0bf164c66ba2c27856c014ed1d452e5e07678541 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 4 Mar 2022 13:18:30 +0900 Subject: [PATCH v11 1/5] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 72 ++++++++++++++++++++----------- 1 file changed, 47 insertions(+), 25 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0d2bd7a357..3987aa81de 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6899,6 +6899,9 @@ CreateRestartPoint(int flags) XLogSegNo _logSegNo; TimestampTz xtime; + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -6964,7 +6967,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -6983,7 +6986,10 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -6991,30 +6997,29 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. This is a quick hack to make sure nothing really bad + * happens if somehow we get here after the end-of-recovery checkpoint. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -7026,8 +7031,25 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + LocalMinRecoveryPoint = InvalidXLogRecPtr; + LocalMinRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -7104,7 +7126,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0 From 1dc663152ba00903f1bd25e7b28540fd97c41bd7 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:22:41 +0900 Subject: [PATCH v11 2/5] Add checkpoint and redo LSN to LogCheckpointEnd log message It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. --- src/backend/access/transam/xlog.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3987aa81de..2e1123fd42 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6120,7 +6120,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6133,14 +6134,18 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6153,7 +6158,10 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0 From 5fb445874e83a111cd41364d9bb3ed39e8002c10 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v11 3/5] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 5e0655c2a9..4ee20309cd 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From 7cf3f07a1821d9e5d8dc952f045f79869a38e108 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:49:16 +0900 Subject: [PATCH v11 4/5] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 4 ++-- src/backend/access/transam/xlogrecovery.c | 8 ++++---- src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 13 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2e1123fd42..e566c27942 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1470,7 +1470,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -4915,7 +4915,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index f9f212680b..dc822da402 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -530,7 +530,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -611,7 +611,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", @@ -2405,7 +2405,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -2568,7 +2568,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 4b64c0d768..a0f1c44cc7 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -138,7 +138,7 @@ WAIT { return K_WAIT; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 5a718b1fe9..8dc420e9a5 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -813,7 +813,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index c1ed7aeeee..f27d471baf 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -487,7 +487,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -642,7 +642,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2310,7 +2310,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index ce661a9ce4..c0503a7cdb 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..61e5c38629 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -344,7 +344,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index 2340dc247b..ae3573e537 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -762,7 +762,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -771,7 +771,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -880,7 +880,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -932,7 +932,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1023,7 +1023,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1065,7 +1065,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1077,7 +1077,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index d8e8715ed1..3e83f68daf 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6299,24 +6299,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6332,11 +6332,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11515,7 +11515,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0 From 73d64acef8fa6d64203b700389b0ef83e6904ec7 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 4 Mar 2022 13:53:38 +0900 Subject: [PATCH v11 5/5] Get rid of uses of some hyphenated words There are use of both "archive-recovery" and "archive recovery" in the tree. Unify the similar uses to unhyphenated. The use of "point-in-time-recovery" is left alone as it is used as the explanation for the acronym "PITR". --- doc/src/sgml/ref/pg_ctl-ref.sgml | 2 +- src/backend/commands/trigger.c | 2 +- src/backend/utils/cache/relcache.c | 2 +- src/test/regress/parallel_schedule | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml index 3946fa52ea..7cbe5c3048 100644 --- a/doc/src/sgml/ref/pg_ctl-ref.sgml +++ b/doc/src/sgml/ref/pg_ctl-ref.sgml @@ -194,7 +194,7 @@ PostgreSQL documentation rolled back and clients are forcibly disconnected, then the server is shut down. <quote>Immediate</quote> mode will abort all server processes immediately, without a clean shutdown. This choice - will lead to a crash-recovery cycle during the next server start. + will lead to a crash recovery cycle during the next server start. </para> <para> diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index 1a9c1ac290..d454aa9015 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -1887,7 +1887,7 @@ RelationBuildTriggers(Relation relation) /* * Note: since we scan the triggers using TriggerRelidNameIndexId, we will * be reading the triggers in name order, except possibly during - * emergency-recovery operations (ie, IgnoreSystemIndexes). This in turn + * emergency recovery operations (ie, IgnoreSystemIndexes). This in turn * ensures that triggers will be fired in name order. */ ScanKeyInit(&skey, diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index fccffce572..2419cf5285 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -6598,7 +6598,7 @@ RelationCacheInitFilePostInvalidate(void) * Remove the init files during postmaster startup. * * We used to keep the init files across restarts, but that is unsafe in PITR - * scenarios, and even in simple crash-recovery cases there are windows for + * scenarios, and even in simple crash recovery cases there are windows for * the init files to become out-of-sync with the database. So now we just * remove them during startup and expect the first backend launch to rebuild * them. Of course, this has to happen in each database of the cluster. diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 6d8f524ae9..8251744bbf 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -13,7 +13,7 @@ test: test_setup # run tablespace by itself, and early, because it forces a checkpoint; # we'd prefer not to have checkpoints later in the tests because that -# interferes with crash-recovery testing. +# interferes with crash recovery testing. test: tablespace # ---------- -- 2.27.0
On Fri, Mar 4, 2022 at 10:40 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Mon, 14 Feb 2022 14:52:15 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > > In this version , 0001 gets one fix and two comment updates. > > While the disucssion on back-patching of 0001 proceeding on another > branch, the main patch iself gets looks like as if rotten on > CF-App. So I rebased v10 on the current master. 0001 is replaced by > an adjusted patch based on the latest "control file update fix" patch. > > 0001: Fix possible incorrect controlfile update that leads to > unrecoverable database. 0001 - I don't think you need to do this as UpdateControlFile (update_controlfile) will anyway update it, no? + /* Update control file using current time */ + ControlFile->time = (pg_time_t) time(NULL); > 0002: Add REDO/Checkpiont LSNs to checkpoinkt-end log message. > (The main patch in this thread) 0002 - If at all the intention is to say that no ControlFileLock is required while reading ControlFile->checkPoint and ControlFile->checkPointCopy.redo, let's say it, no? How about something like "No ControlFileLock is required while reading ControlFile->checkPoint and ControlFile->checkPointCopy.redo as there can't be any other process updating them concurrently."? + /* we are the only updator of these variables */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); > 0003: Replace (WAL-)location to LSN in pg_controldata. > > 0004: Replace (WAL-)location to LSN in user-facing texts. > (This doesn't reflect my recent comments.) If you don't mind, can you please put the comments here? > 0005: Unhyphenate the word archive-recovery and similars. 0005 - How about replacing "crash-recovery" to "crash recovery" in postgres-ref.sgml too? Regards, Bharath Rupireddy.
At Tue, 15 Mar 2022 12:19:47 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Fri, Mar 4, 2022 at 10:40 AM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > 0001 - I don't think you need to do this as UpdateControlFile > (update_controlfile) will anyway update it, no? > + /* Update control file using current time */ > + ControlFile->time = (pg_time_t) time(NULL); Ugh.. Yes. It is a copy-pasto from older versions. They may have the same copy-pasto.. > > 0002: Add REDO/Checkpiont LSNs to checkpoinkt-end log message. > > (The main patch in this thread) > > 0002 - If at all the intention is to say that no ControlFileLock is > required while reading ControlFile->checkPoint and > ControlFile->checkPointCopy.redo, let's say it, no? How about > something like "No ControlFileLock is required while reading > ControlFile->checkPoint and ControlFile->checkPointCopy.redo as there > can't be any other process updating them concurrently."? > > + /* we are the only updator of these variables */ > + LSN_FORMAT_ARGS(ControlFile->checkPoint), > + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); I thought the comment explains that. But it would be better to be more specific. It is changed as the follows. > * ControlFileLock is not required as we are the only > * updator of these variables. > > 0003: Replace (WAL-)location to LSN in pg_controldata. > > > > 0004: Replace (WAL-)location to LSN in user-facing texts. > > (This doesn't reflect my recent comments.) > > If you don't mind, can you please put the comments here? Okay. It's the following message. https://www.postgresql.org/message-id/20220209.115204.1794224638476710282.horikyota.ntt@gmail.com > The old 0003 (attached 0004): > > > > +++ b/src/backend/access/rmgrdesc/xlogdesc.c > - appendStringInfo(buf, "redo %X/%X; " > + appendStringInfo(buf, "redo lsn %X/%X; " > > > > It is shown in the context of a checkpoint record, so I think it is > not needed or rather lengthning the dump line uselessly. > > > > +++ b/src/backend/access/transam/xlog.c > - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", > + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", > > > > +++ b/src/backend/replication/walsender.c > - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", > + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", > > > > "WAL" is upper-cased. So it seems rather strange that the "lsn" is > lower-cased. In the first place the message doesn't look like a > user-facing error message and I feel we don't need position or lsn > there.. > > > > +++ b/src/bin/pg_rewind/pg_rewind.c > - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", > + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", > > > > I feel that we don't need "WAL" there. > > > > +++ b/src/bin/pg_waldump/pg_waldump.c > - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); > + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); > > > > Mmm.. "WAL LSN RECPTR" looks strange to me. In the first place I > don't think "RECPTR" is a user-facing term. Doesn't something like the > follows work? > > > > + printf(_(" -e, --end=WAL-LSN stop reading at WAL-LSN\n")); > > > > In some changes in this patch shorten the main message text of > fprintf-ish functions. That makes the succeeding parameters can be > inlined. > > 0005: Unhyphenate the word archive-recovery and similars. > > 0005 - How about replacing "crash-recovery" to "crash recovery" in > postgres-ref.sgml too? Oh, that's a left-over. Fixed. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center From 37f7433dce1ea9e41aa54f2c18c0bf3d2aa3c814 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 4 Mar 2022 13:18:30 +0900 Subject: [PATCH v12 1/5] Correctly update contfol file at the end of archive recovery CreateRestartPoint runs WAL file cleanup basing on the checkpoint just have finished in the function. If the database has exited DB_IN_ARCHIVE_RECOVERY state when the function is going to update control file, the function refrains from updating the file at all then proceeds to WAL cleanup having the latest REDO LSN, which is now inconsistent with the control file. As the result, the succeeding cleanup procedure overly removes WAL files against the control file and leaves unrecoverable database until the next checkpoint finishes. Along with that fix, we remove a dead code path for the case some other process ran a simultaneous checkpoint. It seems like just a preventive measure but it's no longer useful because we are sure that checkpoint is performed only by checkpointer except single process mode. --- src/backend/access/transam/xlog.c | 69 ++++++++++++++++++++----------- 1 file changed, 44 insertions(+), 25 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0d2bd7a357..bd962763cc 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6899,6 +6899,9 @@ CreateRestartPoint(int flags) XLogSegNo _logSegNo; TimestampTz xtime; + /* we don't assume concurrent checkpoint/restartpoint to run */ + Assert (!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER); + /* Get a local copy of the last safe checkpoint record. */ SpinLockAcquire(&XLogCtl->info_lck); lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr; @@ -6964,7 +6967,7 @@ CreateRestartPoint(int flags) /* Also update the info_lck-protected copy */ SpinLockAcquire(&XLogCtl->info_lck); - XLogCtl->RedoRecPtr = lastCheckPoint.redo; + XLogCtl->RedoRecPtr = RedoRecPtr; SpinLockRelease(&XLogCtl->info_lck); /* @@ -6983,7 +6986,10 @@ CreateRestartPoint(int flags) /* Update the process title */ update_checkpoint_display(flags, true, false); - CheckPointGuts(lastCheckPoint.redo, flags); + CheckPointGuts(RedoRecPtr, flags); + + /* Update pg_control */ + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); /* * Remember the prior checkpoint's redo ptr for @@ -6991,30 +6997,26 @@ CreateRestartPoint(int flags) */ PriorRedoPtr = ControlFile->checkPointCopy.redo; + Assert (PriorRedoPtr < RedoRecPtr); + + ControlFile->checkPoint = lastCheckPointRecPtr; + ControlFile->checkPointCopy = lastCheckPoint; + /* - * Update pg_control, using current time. Check that it still shows - * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing; - * this is a quick hack to make sure nothing really bad happens if somehow - * we get here after the end-of-recovery checkpoint. + * Ensure minRecoveryPoint is past the checkpoint record while archive + * recovery is still ongoing. Normally, this will have happened already + * while writing out dirty buffers, but not necessarily - e.g. because no + * buffers were dirtied. We do this because a non-exclusive base backup + * uses minRecoveryPoint to determine which WAL files must be included in + * the backup, and the file (or files) containing the checkpoint record + * must be included, at a minimum. Note that for an ordinary restart of + * recovery there's no value in having the minimum recovery point any + * earlier than this anyway, because redo will begin just after the + * checkpoint record. This is a quick hack to make sure nothing really bad + * happens if somehow we get here after the end-of-recovery checkpoint. */ - LWLockAcquire(ControlFileLock, LW_EXCLUSIVE); - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) { - ControlFile->checkPoint = lastCheckPointRecPtr; - ControlFile->checkPointCopy = lastCheckPoint; - - /* - * Ensure minRecoveryPoint is past the checkpoint record. Normally, - * this will have happened already while writing out dirty buffers, - * but not necessarily - e.g. because no buffers were dirtied. We do - * this because a non-exclusive base backup uses minRecoveryPoint to - * determine which WAL files must be included in the backup, and the - * file (or files) containing the checkpoint record must be included, - * at a minimum. Note that for an ordinary restart of recovery there's - * no value in having the minimum recovery point any earlier than this - * anyway, because redo will begin just after the checkpoint record. - */ if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr) { ControlFile->minRecoveryPoint = lastCheckPointEndPtr; @@ -7026,8 +7028,25 @@ CreateRestartPoint(int flags) } if (flags & CHECKPOINT_IS_SHUTDOWN) ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY; - UpdateControlFile(); } + else + { + /* recovery mode is not supposed to end during shutdown restartpoint */ + Assert((flags & CHECKPOINT_IS_SHUTDOWN) == 0); + + /* + * Aarchive recovery has ended. Crash recovery ever after should + * always recover to the end of WAL + */ + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; + ControlFile->minRecoveryPointTLI = 0; + + /* also update local copy */ + LocalMinRecoveryPoint = InvalidXLogRecPtr; + LocalMinRecoveryPointTLI = 0; + } + + UpdateControlFile(); LWLockRelease(ControlFileLock); /* @@ -7104,7 +7123,7 @@ CreateRestartPoint(int flags) xtime = GetLatestXTime(); ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("recovery restart point at %X/%X", - LSN_FORMAT_ARGS(lastCheckPoint.redo)), + LSN_FORMAT_ARGS(RedoRecPtr)), xtime ? errdetail("Last completed transaction was at log time %s.", timestamptz_to_str(xtime)) : 0)); -- 2.27.0 From dc1db06e3638f6c200c86bf09fded562cef6701a Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:22:41 +0900 Subject: [PATCH v12 2/5] Add checkpoint and redo LSN to LogCheckpointEnd log message It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. --- src/backend/access/transam/xlog.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index bd962763cc..ce3815ea99 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6120,7 +6120,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6133,14 +6134,21 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6153,7 +6161,13 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0 From 1941cdf0d48194044ef8f30d28f532beb51526dc Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v12 3/5] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 5e0655c2a9..4ee20309cd 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From 985fc119636a12cf7293ed058ef8c1c10bc47cb2 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:49:16 +0900 Subject: [PATCH v12 4/5] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 4 ++-- src/backend/access/transam/xlogrecovery.c | 8 ++++---- src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 13 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ce3815ea99..bdf03f07ab 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1470,7 +1470,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -4915,7 +4915,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index f9f212680b..dc822da402 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -530,7 +530,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -611,7 +611,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", @@ -2405,7 +2405,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -2568,7 +2568,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 4b64c0d768..a0f1c44cc7 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -138,7 +138,7 @@ WAIT { return K_WAIT; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 2d0292a092..95a043525f 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -813,7 +813,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index d265ee3b3c..44d3ae1322 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -488,7 +488,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -643,7 +643,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2363,7 +2363,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index 8a4c2b8964..30d43ca598 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..61e5c38629 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -344,7 +344,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index f128050b4e..5ed1942e24 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -764,7 +764,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -773,7 +773,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -882,7 +882,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -934,7 +934,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1025,7 +1025,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1067,7 +1067,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1079,7 +1079,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index d8e8715ed1..3e83f68daf 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6299,24 +6299,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6332,11 +6332,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11515,7 +11515,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0 From 535816060ef99816814198921017ab622a795d25 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 4 Mar 2022 13:53:38 +0900 Subject: [PATCH v12 5/5] Get rid of uses of some hyphenated words There are use of both "archive-recovery" and "archive recovery" in the tree. Unify the similar uses to unhyphenated. The use of "point-in-time-recovery" is left alone as it is used as the explanation for the acronym "PITR". --- doc/src/sgml/ref/pg_ctl-ref.sgml | 2 +- doc/src/sgml/ref/postgres-ref.sgml | 2 +- src/backend/commands/trigger.c | 2 +- src/backend/utils/cache/relcache.c | 2 +- src/test/regress/parallel_schedule | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml index 3946fa52ea..7cbe5c3048 100644 --- a/doc/src/sgml/ref/pg_ctl-ref.sgml +++ b/doc/src/sgml/ref/pg_ctl-ref.sgml @@ -194,7 +194,7 @@ PostgreSQL documentation rolled back and clients are forcibly disconnected, then the server is shut down. <quote>Immediate</quote> mode will abort all server processes immediately, without a clean shutdown. This choice - will lead to a crash-recovery cycle during the next server start. + will lead to a crash recovery cycle during the next server start. </para> <para> diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml index 55a3f6c69d..fc22fbc4fe 100644 --- a/doc/src/sgml/ref/postgres-ref.sgml +++ b/doc/src/sgml/ref/postgres-ref.sgml @@ -719,7 +719,7 @@ PostgreSQL documentation is also unwise to send <literal>SIGKILL</literal> to a server process — the main <command>postgres</command> process will interpret this as a crash and will force all the sibling processes - to quit as part of its standard crash-recovery procedure. + to quit as part of its standard crash recovery procedure. </para> </refsect1> diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index e08bd9a370..ce170b4c6e 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -1887,7 +1887,7 @@ RelationBuildTriggers(Relation relation) /* * Note: since we scan the triggers using TriggerRelidNameIndexId, we will * be reading the triggers in name order, except possibly during - * emergency-recovery operations (ie, IgnoreSystemIndexes). This in turn + * emergency recovery operations (ie, IgnoreSystemIndexes). This in turn * ensures that triggers will be fired in name order. */ ScanKeyInit(&skey, diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index fccffce572..2419cf5285 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -6598,7 +6598,7 @@ RelationCacheInitFilePostInvalidate(void) * Remove the init files during postmaster startup. * * We used to keep the init files across restarts, but that is unsafe in PITR - * scenarios, and even in simple crash-recovery cases there are windows for + * scenarios, and even in simple crash recovery cases there are windows for * the init files to become out-of-sync with the database. So now we just * remove them during startup and expect the first backend launch to rebuild * them. Of course, this has to happen in each database of the cluster. diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 6d8f524ae9..8251744bbf 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -13,7 +13,7 @@ test: test_setup # run tablespace by itself, and early, because it forces a checkpoint; # we'd prefer not to have checkpoints later in the tests because that -# interferes with crash-recovery testing. +# interferes with crash recovery testing. test: tablespace # ---------- -- 2.27.0
On Tue, Mar 15, 2022 at 05:23:40PM +0900, Kyotaro Horiguchi wrote: > At Tue, 15 Mar 2022 12:19:47 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in >> On Fri, Mar 4, 2022 at 10:40 AM Kyotaro Horiguchi >> <horikyota.ntt@gmail.com> wrote: >> 0001 - I don't think you need to do this as UpdateControlFile >> (update_controlfile) will anyway update it, no? >> + /* Update control file using current time */ >> + ControlFile->time = (pg_time_t) time(NULL); > > Ugh.. Yes. It is a copy-pasto from older versions. They may have the > same copy-pasto.. This thread has shifted to an entirely different discussion, presenting patches that touch code paths unrelated to what was first stated. Shouldn't you create a new thread with a proper $subject to attract a more correct audience? -- Michael
Attachment
At Tue, 15 Mar 2022 18:26:26 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Tue, Mar 15, 2022 at 05:23:40PM +0900, Kyotaro Horiguchi wrote: > > At Tue, 15 Mar 2022 12:19:47 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > >> On Fri, Mar 4, 2022 at 10:40 AM Kyotaro Horiguchi > >> <horikyota.ntt@gmail.com> wrote: > >> 0001 - I don't think you need to do this as UpdateControlFile > >> (update_controlfile) will anyway update it, no? > >> + /* Update control file using current time */ > >> + ControlFile->time = (pg_time_t) time(NULL); > > > > Ugh.. Yes. It is a copy-pasto from older versions. They may have the > > same copy-pasto.. > > This thread has shifted to an entirely different discussion, > presenting patches that touch code paths unrelated to what was first > stated. Shouldn't you create a new thread with a proper $subject to > attract a more correct audience? I felt the same since some messages ago. I thought Fujii-san thought that he wants to fix the CreateRestartPoint issue before the checkpoint log patch but he looks like busy these days. Since the CreateRestartPoint issue is orthogonal to the main patch, I'll separate that part into anther thread. This thread is discussing one other topic, wordings in user-facing texts. This is also orthogonal (3-dimentionally?) to the two topics. In short, I split out the two topics other than checkpoint log to other threads. Thanks for cueing me to do that! regareds. -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, this is a follow-up topic of [1] (add LSNs to checkpint logs). Many user-facing texts contains wording like "WAL location" or such like. The attached is WIP patches that fixes such wordings to "WAL LSN" or alikes. The attached patches is v13 but it is not changed at all from v12. My lastest comments on this version are as follows. https://www.postgresql.org/message-id/20220209.115204.1794224638476710282.horikyota.ntt@gmail.com > The old 0003 (attached 0004): > > > > +++ b/src/backend/access/rmgrdesc/xlogdesc.c > - appendStringInfo(buf, "redo %X/%X; " > + appendStringInfo(buf, "redo lsn %X/%X; " > > > > It is shown in the context of a checkpoint record, so I think it is > not needed or rather lengthning the dump line uselessly. > > > > +++ b/src/backend/access/transam/xlog.c > - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", > + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", > > > > +++ b/src/backend/replication/walsender.c > - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", > + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", > > > > "WAL" is upper-cased. So it seems rather strange that the "lsn" is > lower-cased. In the first place the message doesn't look like a > user-facing error message and I feel we don't need position or lsn > there.. > > > > +++ b/src/bin/pg_rewind/pg_rewind.c > - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", > + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", > > > > I feel that we don't need "WAL" there. > > > > +++ b/src/bin/pg_waldump/pg_waldump.c > - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); > + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); > > > > Mmm.. "WAL LSN RECPTR" looks strange to me. In the first place I > don't think "RECPTR" is a user-facing term. Doesn't something like the > follows work? > > > > + printf(_(" -e, --end=WAL-LSN stop reading at WAL-LSN\n")); > > > > In some changes in this patch shorten the main message text of > fprintf-ish functions. That makes the succeeding parameters can be > inlined. regards. [1] https://www.postgresql.org/message-id/20220316.091913.806120467943749797.horikyota.ntt%40gmail.com -- Kyotaro Horiguchi NTT Open Source Software Center From afed063f09b788bc07a0e2665499c3e3c3edc214 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v13 1/2] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 5e0655c2a9..4ee20309cd 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From 4cc657baf834bdab64b7aa4665962f7634ae73fb Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:49:16 +0900 Subject: [PATCH v13 2/2] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 4 ++-- src/backend/access/transam/xlogrecovery.c | 8 ++++---- src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 13 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ed16f279b1..5e7516f67b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1471,7 +1471,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -4916,7 +4916,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index f9f212680b..dc822da402 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -530,7 +530,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -611,7 +611,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", @@ -2405,7 +2405,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -2568,7 +2568,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 4b64c0d768..a0f1c44cc7 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -138,7 +138,7 @@ WAIT { return K_WAIT; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 2d0292a092..95a043525f 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -813,7 +813,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 2943d9ec1a..7951ae4312 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -488,7 +488,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -643,7 +643,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2363,7 +2363,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index e2ceafeb0f..20425699af 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..61e5c38629 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -344,7 +344,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index f128050b4e..5ed1942e24 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -764,7 +764,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -773,7 +773,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -882,7 +882,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -934,7 +934,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1025,7 +1025,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1067,7 +1067,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1079,7 +1079,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index d8e8715ed1..3e83f68daf 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6299,24 +6299,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6332,11 +6332,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11515,7 +11515,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0
Uh.. Sorry, I forgot to change the subject. Resent with the correct subject. ===== Hello, this is a follow-up topic of [1] (add LSNs to checkpint logs). Many user-facing texts contains wording like "WAL location" or such like. The attached is WIP patches that fixes such wordings to "WAL LSN" or alikes. The attached patches is v13 but it is not changed at all from v12. My lastest comments on this version are as follows. https://www.postgresql.org/message-id/20220209.115204.1794224638476710282.horikyota.ntt@gmail.com > The old 0003 (attached 0004): > > > > +++ b/src/backend/access/rmgrdesc/xlogdesc.c > - appendStringInfo(buf, "redo %X/%X; " > + appendStringInfo(buf, "redo lsn %X/%X; " > > > > It is shown in the context of a checkpoint record, so I think it is > not needed or rather lengthning the dump line uselessly. > > > > +++ b/src/backend/access/transam/xlog.c > - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", > + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", > > > > +++ b/src/backend/replication/walsender.c > - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", > + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", > > > > "WAL" is upper-cased. So it seems rather strange that the "lsn" is > lower-cased. In the first place the message doesn't look like a > user-facing error message and I feel we don't need position or lsn > there.. > > > > +++ b/src/bin/pg_rewind/pg_rewind.c > - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", > + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", > > > > I feel that we don't need "WAL" there. > > > > +++ b/src/bin/pg_waldump/pg_waldump.c > - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); > + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); > > > > Mmm.. "WAL LSN RECPTR" looks strange to me. In the first place I > don't think "RECPTR" is a user-facing term. Doesn't something like the > follows work? > > > > + printf(_(" -e, --end=WAL-LSN stop reading at WAL-LSN\n")); > > > > In some changes in this patch shorten the main message text of > fprintf-ish functions. That makes the succeeding parameters can be > inlined. regards. [1] https://www.postgresql.org/message-id/20220316.091913.806120467943749797.horikyota.ntt%40gmail.com -- Kyotaro Horiguchi NTT Open Source Software Center From afed063f09b788bc07a0e2665499c3e3c3edc214 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 1 Feb 2022 03:57:04 +0000 Subject: [PATCH v13 1/2] Change "location" to "lsn" in pg_controldata --- src/bin/pg_controldata/pg_controldata.c | 10 +++++----- src/test/recovery/t/016_min_consistency.pl | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f911f98d94..59f39267df 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -235,9 +235,9 @@ main(int argc, char *argv[]) dbState(ControlFile->state)); printf(_("pg_control last modified: %s\n"), pgctime_str); - printf(_("Latest checkpoint location: %X/%X\n"), + printf(_("Latest checkpoint LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPoint)); - printf(_("Latest checkpoint's REDO location: %X/%X\n"), + printf(_("Latest checkpoint's REDO LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)); printf(_("Latest checkpoint's REDO WAL file: %s\n"), xlogfilename); @@ -274,13 +274,13 @@ main(int argc, char *argv[]) ckpttime_str); printf(_("Fake LSN counter for unlogged rels: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->unloggedLSN)); - printf(_("Minimum recovery ending location: %X/%X\n"), + printf(_("Minimum recovery ending LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint)); printf(_("Min recovery ending loc's timeline: %u\n"), ControlFile->minRecoveryPointTLI); - printf(_("Backup start location: %X/%X\n"), + printf(_("Backup start LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupStartPoint)); - printf(_("Backup end location: %X/%X\n"), + printf(_("Backup end LSN: %X/%X\n"), LSN_FORMAT_ARGS(ControlFile->backupEndPoint)); printf(_("End-of-backup record required: %s\n"), ControlFile->backupEndRequired ? _("yes") : _("no")); diff --git a/src/test/recovery/t/016_min_consistency.pl b/src/test/recovery/t/016_min_consistency.pl index 5e0655c2a9..4ee20309cd 100644 --- a/src/test/recovery/t/016_min_consistency.pl +++ b/src/test/recovery/t/016_min_consistency.pl @@ -126,7 +126,7 @@ my @control_data = split("\n", $stdout); my $offline_recovery_lsn = undef; foreach (@control_data) { - if ($_ =~ /^Minimum recovery ending location:\s*(.*)$/mg) + if ($_ =~ /^Minimum recovery ending LSN:\s*(.*)$/mg) { $offline_recovery_lsn = $1; last; -- 2.27.0 From 4cc657baf834bdab64b7aa4665962f7634ae73fb Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:49:16 +0900 Subject: [PATCH v13 2/2] Change "location" to "lsn" in user facing text --- src/backend/access/rmgrdesc/xlogdesc.c | 2 +- src/backend/access/transam/xlog.c | 4 ++-- src/backend/access/transam/xlogrecovery.c | 8 ++++---- src/backend/replication/repl_scanner.l | 2 +- src/backend/replication/walsender.c | 2 +- src/bin/pg_basebackup/pg_basebackup.c | 6 +++--- src/bin/pg_basebackup/pg_receivewal.c | 2 +- src/bin/pg_basebackup/pg_recvlogical.c | 4 ++-- src/bin/pg_basebackup/streamutil.c | 2 +- src/bin/pg_rewind/pg_rewind.c | 2 +- src/bin/pg_rewind/timeline.c | 2 +- src/bin/pg_waldump/pg_waldump.c | 14 +++++++------- src/include/catalog/pg_proc.dat | 16 ++++++++-------- 13 files changed, 33 insertions(+), 33 deletions(-) diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c index e7452af679..a7ffbd7590 100644 --- a/src/backend/access/rmgrdesc/xlogdesc.c +++ b/src/backend/access/rmgrdesc/xlogdesc.c @@ -44,7 +44,7 @@ xlog_desc(StringInfo buf, XLogReaderState *record) { CheckPoint *checkpoint = (CheckPoint *) rec; - appendStringInfo(buf, "redo %X/%X; " + appendStringInfo(buf, "redo lsn %X/%X; " "tli %u; prev tli %u; fpw %s; xid %u:%u; oid %u; multi %u; offset %u; " "oldest xid %u in DB %u; oldest multi %u in DB %u; " "oldest/newest commit timestamp xid: %u/%u; " diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ed16f279b1..5e7516f67b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1471,7 +1471,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto) if (upto > reservedUpto) { ereport(LOG, - (errmsg("request to flush past end of generated WAL; request %X/%X, current position %X/%X", + (errmsg("request to flush past end of generated WAL; request lsn %X/%X, current lsn %X/%X", LSN_FORMAT_ARGS(upto), LSN_FORMAT_ARGS(reservedUpto)))); upto = reservedUpto; } @@ -4916,7 +4916,7 @@ StartupXLOG(void) */ if (!XRecOffIsValid(ControlFile->checkPoint)) ereport(FATAL, - (errmsg("control file contains invalid checkpoint location"))); + (errmsg("control file contains invalid checkpoint LSN"))); switch (ControlFile->state) { diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index f9f212680b..dc822da402 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -530,7 +530,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, recoveryTargetName))); else if (recoveryTarget == RECOVERY_TARGET_LSN) ereport(LOG, - (errmsg("starting point-in-time recovery to WAL location (LSN) \"%X/%X\"", + (errmsg("starting point-in-time recovery to WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryTargetLSN)))); else if (recoveryTarget == RECOVERY_TARGET_IMMEDIATE) ereport(LOG, @@ -611,7 +611,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, if (!ReadRecord(xlogreader, LOG, false, checkPoint.ThisTimeLineID)) ereport(FATAL, - (errmsg("could not find redo location referenced by checkpoint record"), + (errmsg("could not find redo LSN referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add requiredrecovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoringfrom a backup.", @@ -2405,7 +2405,7 @@ recoveryStopsBefore(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping before WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping before WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } @@ -2568,7 +2568,7 @@ recoveryStopsAfter(XLogReaderState *record) recoveryStopTime = 0; recoveryStopName[0] = '\0'; ereport(LOG, - (errmsg("recovery stopping after WAL location (LSN) \"%X/%X\"", + (errmsg("recovery stopping after WAL LSN \"%X/%X\"", LSN_FORMAT_ARGS(recoveryStopLSN)))); return true; } diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l index 4b64c0d768..a0f1c44cc7 100644 --- a/src/backend/replication/repl_scanner.l +++ b/src/backend/replication/repl_scanner.l @@ -138,7 +138,7 @@ WAIT { return K_WAIT; } uint32 hi, lo; if (sscanf(yytext, "%X/%X", &hi, &lo) != 2) - yyerror("invalid streaming start location"); + yyerror("invalid streaming start LSN"); yylval.recptr = ((uint64) hi) << 32 | lo; return RECPTR; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 2d0292a092..95a043525f 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -813,7 +813,7 @@ StartReplication(StartReplicationCmd *cmd) if (FlushPtr < cmd->startpoint) { ereport(ERROR, - (errmsg("requested starting point %X/%X is ahead of the WAL flush position of this server %X/%X", + (errmsg("requested starting point %X/%X is ahead of the WAL flush LSN of this server %X/%X", LSN_FORMAT_ARGS(cmd->startpoint), LSN_FORMAT_ARGS(FlushPtr)))); } diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 2943d9ec1a..7951ae4312 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -488,7 +488,7 @@ reached_end_position(XLogRecPtr segendpos, uint32 timeline, if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } @@ -643,7 +643,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) /* Convert the starting position */ if (sscanf(startpos, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", startpos); exit(1); } @@ -2363,7 +2363,7 @@ BaseBackup(void) */ if (sscanf(xlogend, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", xlogend); exit(1); } diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index e2ceafeb0f..20425699af 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -753,7 +753,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c index cc35d16f32..0bfb9edb94 100644 --- a/src/bin/pg_basebackup/pg_recvlogical.c +++ b/src/bin/pg_basebackup/pg_recvlogical.c @@ -781,7 +781,7 @@ main(int argc, char **argv) case 'I': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse start position \"%s\"", optarg); + pg_log_error("could not parse start LSN \"%s\"", optarg); exit(1); } startpos = ((uint64) hi) << 32 | lo; @@ -789,7 +789,7 @@ main(int argc, char **argv) case 'E': if (sscanf(optarg, "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse end position \"%s\"", optarg); + pg_log_error("could not parse end LSN \"%s\"", optarg); exit(1); } endpos = ((uint64) hi) << 32 | lo; diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 4a6afd1a06..41d0473138 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -447,7 +447,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, { if (sscanf(PQgetvalue(res, 0, 2), "%X/%X", &hi, &lo) != 2) { - pg_log_error("could not parse write-ahead log location \"%s\"", + pg_log_error("could not parse WAL LSN \"%s\"", PQgetvalue(res, 0, 2)); PQclear(res); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index b39b5c1aac..61e5c38629 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -344,7 +344,7 @@ main(int argc, char **argv) XLogRecPtr chkptendrec; findCommonAncestorTimeline(&divergerec, &lastcommontliIndex); - pg_log_info("servers diverged at WAL location %X/%X on timeline %u", + pg_log_info("servers diverged at WAL LSN %X/%X on timeline %u", LSN_FORMAT_ARGS(divergerec), targetHistory[lastcommontliIndex].tli); diff --git a/src/bin/pg_rewind/timeline.c b/src/bin/pg_rewind/timeline.c index df8f82a50c..0b3c524f16 100644 --- a/src/bin/pg_rewind/timeline.c +++ b/src/bin/pg_rewind/timeline.c @@ -79,7 +79,7 @@ rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries) if (nfields != 3) { pg_log_error("syntax error in history file: %s", fline); - pg_log_error("Expected a write-ahead log switchpoint location."); + pg_log_error("Expected a WAL switchpoint LSN."); exit(1); } if (entries && tli <= lasttli) diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index f128050b4e..5ed1942e24 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -764,7 +764,7 @@ usage(void) printf(_(" %s [OPTION]... [STARTSEG [ENDSEG]]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); - printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); + printf(_(" -e, --end=RECPTR stop reading at WAL LSN RECPTR\n")); printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); printf(_(" -n, --limit=N number of records to display\n")); printf(_(" -p, --path=PATH directory in which to find log segment files or a\n" @@ -773,7 +773,7 @@ usage(void) printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" " use --rmgr=list to list valid resource manager names\n")); - printf(_(" -s, --start=RECPTR start reading at WAL location RECPTR\n")); + printf(_(" -s, --start=RECPTR start reading at WAL LSN RECPTR\n")); printf(_(" -t, --timeline=TLI timeline from which to read log records\n" " (default: 1 or the value used in STARTSEG)\n")); printf(_(" -V, --version output version information, then exit\n")); @@ -882,7 +882,7 @@ main(int argc, char **argv) case 'e': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse end WAL location \"%s\"", + pg_log_error("could not parse end WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -934,7 +934,7 @@ main(int argc, char **argv) case 's': if (sscanf(optarg, "%X/%X", &xlogid, &xrecoff) != 2) { - pg_log_error("could not parse start WAL location \"%s\"", + pg_log_error("could not parse start WAL LSN \"%s\"", optarg); goto bad_argument; } @@ -1025,7 +1025,7 @@ main(int argc, char **argv) XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) { - pg_log_error("start WAL location %X/%X is not inside file \"%s\"", + pg_log_error("start WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.startptr), fname); goto bad_argument; @@ -1067,7 +1067,7 @@ main(int argc, char **argv) if (!XLByteInSeg(private.endptr, segno, WalSegSz) && private.endptr != (segno + 1) * WalSegSz) { - pg_log_error("end WAL location %X/%X is not inside file \"%s\"", + pg_log_error("end WAL LSN %X/%X is not inside file \"%s\"", LSN_FORMAT_ARGS(private.endptr), argv[argc - 1]); goto bad_argument; @@ -1079,7 +1079,7 @@ main(int argc, char **argv) /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) { - pg_log_error("no start WAL location given"); + pg_log_error("no start WAL LSN given"); goto bad_argument; } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index d8e8715ed1..3e83f68daf 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6299,24 +6299,24 @@ proname => 'pg_create_restore_point', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => 'text', prosrc => 'pg_create_restore_point' }, -{ oid => '2849', descr => 'current wal write location', +{ oid => '2849', descr => 'current wal write lsn', proname => 'pg_current_wal_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_lsn' }, -{ oid => '2852', descr => 'current wal insert location', +{ oid => '2852', descr => 'current wal insert lsn', proname => 'pg_current_wal_insert_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_insert_lsn' }, -{ oid => '3330', descr => 'current wal flush location', +{ oid => '3330', descr => 'current wal flush lsn', proname => 'pg_current_wal_flush_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_current_wal_flush_lsn' }, { oid => '2850', - descr => 'wal filename and byte offset, given a wal location', + descr => 'wal filename and byte offset, given a wal lsn', proname => 'pg_walfile_name_offset', prorettype => 'record', proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}', proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}', prosrc => 'pg_walfile_name_offset' }, -{ oid => '2851', descr => 'wal filename, given a wal location', +{ oid => '2851', descr => 'wal filename, given a wal lsn', proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn', prosrc => 'pg_walfile_name' }, @@ -6332,11 +6332,11 @@ proname => 'pg_is_in_recovery', provolatile => 'v', prorettype => 'bool', proargtypes => '', prosrc => 'pg_is_in_recovery' }, -{ oid => '3820', descr => 'current wal flush location', +{ oid => '3820', descr => 'current wal flush lsn', proname => 'pg_last_wal_receive_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_receive_lsn' }, -{ oid => '3821', descr => 'last wal replay location', +{ oid => '3821', descr => 'last wal replay lsn', proname => 'pg_last_wal_replay_lsn', provolatile => 'v', prorettype => 'pg_lsn', proargtypes => '', prosrc => 'pg_last_wal_replay_lsn' }, @@ -11515,7 +11515,7 @@ proparallel => 'r', prorettype => 'void', proargtypes => '', prosrc => 'pg_replication_origin_xact_reset' }, -{ oid => '6012', descr => 'advance replication origin to specific location', +{ oid => '6012', descr => 'advance replication origin to specific lsn', proname => 'pg_replication_origin_advance', provolatile => 'v', proparallel => 'u', prorettype => 'void', proargtypes => 'text pg_lsn', prosrc => 'pg_replication_origin_advance' }, -- 2.27.0
At Wed, 16 Mar 2022 09:19:13 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > In short, I split out the two topics other than checkpoint log to > other threads. So, this is about the main topic of this thread, adding LSNs to checkpint log. Other topics have moved to other treads [1], [2] , [3]. I think this is no longer controversial alone. So this patch is now really Read-for-Commiter and is waiting to be picked up. regards. [1] https://www.postgresql.org/message-id/20220316.102444.2193181487576617583.horikyota.ntt%40gmail.com [2] https://www.postgresql.org/message-id/20220316.102900.2003692961119672246.horikyota.ntt%40gmail.com [3] https://www.postgresql.org/message-id/20220316.102509.785466054344164656.horikyota.ntt%40gmail.com -- Kyotaro Horiguchi NTT Open Source Software Center From 4c3d40e7aaf29cb905f3561a291d35981d762456 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 4 Mar 2022 13:22:41 +0900 Subject: [PATCH v13] Add checkpoint and redo LSN to LogCheckpointEnd log message It is useful (for debugging purposes) if the checkpoint end message has the checkpoint LSN(end) and REDO LSN(start). It gives more context while analyzing checkpoint-related issues. The pg_controldata gives the last checkpoint LSN and REDO LSN, but having this info alongside the log message helps analyze issues that happened previously, connect the dots and identify the root cause. --- src/backend/access/transam/xlog.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ed16f279b1..b85c76d8f8 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6121,7 +6121,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6134,14 +6135,21 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6154,7 +6162,13 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0
This patch is currently showing up with a test failure in the CFBot however I do *not* believe this is a bug in the patch. I think it's a bug in that test which is being discussed elsewhere. It's also a very short and straightforward patch that a committer could probably make a decision about whether it's a good idea or not and then apply it quickly if so. Just to give people a leg up and an idea how short the patch is... Here's the entire patch: diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ed16f279b1..b85c76d8f8 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6121,7 +6121,8 @@ LogCheckpointEnd(bool restartpoint) "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6134,14 +6135,21 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); else ereport(LOG, (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); " "%d WAL file(s) added, %d removed, %d recycled; " "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; " "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; " - "distance=%d kB, estimate=%d kB", + "distance=%d kB, estimate=%d kB; " + "lsn=%X/%X, redo lsn=%X/%X", CheckpointStats.ckpt_bufs_written, (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers, CheckpointStats.ckpt_segs_added, @@ -6154,7 +6162,13 @@ LogCheckpointEnd(bool restartpoint) longest_msecs / 1000, (int) (longest_msecs % 1000), average_msecs / 1000, (int) (average_msecs % 1000), (int) (PrevCheckPointDistance / 1024.0), - (int) (CheckPointDistanceEstimate / 1024.0)))); + (int) (CheckPointDistanceEstimate / 1024.0), + /* + * ControlFileLock is not required as we are the only + * updator of these variables. + */ + LSN_FORMAT_ARGS(ControlFile->checkPoint), + LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)))); } /* -- 2.27.0
On 2022/03/16 10:29, Kyotaro Horiguchi wrote: > At Wed, 16 Mar 2022 09:19:13 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in >> In short, I split out the two topics other than checkpoint log to >> other threads. > > So, this is about the main topic of this thread, adding LSNs to > checkpint log. Other topics have moved to other treads [1], [2] , > [3]. > > I think this is no longer controversial alone. So this patch is now > really Read-for-Commiter and is waiting to be picked up. +1 + * ControlFileLock is not required as we are the only + * updator of these variables. Isn't it better to add "at this time" or something at the end of the comment because only we're not always updator of them?No? Barring any objection, I'm thinking to apply the above small change and commit the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
At Thu, 7 Jul 2022 01:11:33 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in > > > On 2022/03/16 10:29, Kyotaro Horiguchi wrote: > > At Wed, 16 Mar 2022 09:19:13 +0900 (JST), Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote in > >> In short, I split out the two topics other than checkpoint log to > >> other threads. > > So, this is about the main topic of this thread, adding LSNs to > > checkpint log. Other topics have moved to other treads [1], [2] , > > [3]. > > I think this is no longer controversial alone. So this patch is now > > really Read-for-Commiter and is waiting to be picked up. > > +1 > > + * ControlFileLock is not required as we are the only > + * updator of these variables. > > Isn't it better to add "at this time" or something at the end of the > comment because only we're not always updator of them? No? Excluding initialization, (I believe) checkpointer is really the only updator of the variables/members. But I'm fine with the addition. > Barring any objection, I'm thinking to apply the above small change > and commit the patch. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2022/07/07 16:26, Kyotaro Horiguchi wrote: >> + * ControlFileLock is not required as we are the only >> + * updator of these variables. >> >> Isn't it better to add "at this time" or something at the end of the >> comment because only we're not always updator of them? No? > > Excluding initialization, (I believe) checkpointer is really the only > updator of the variables/members. But I'm fine with the addition. Ok, so I modified the patch slightly and pushed it. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION