Thread: Requiring recovery.signal or standby.signal when recovering with a backup_label
Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
Hi all, This is a follow-up of the point I have made a few weeks ago on this thread of pgsql-bugs about $subject: https://www.postgresql.org/message-id/Y/Q/17rpYS7YGbIt@paquier.xyz https://www.postgresql.org/message-id/Y/v0c+3W89NBT/if@paquier.xyz Here is a short summary of what I think is incorrect, and what I'd like to do to improve things moving forward, this pointing to a simple solution.. While looking at the so-said thread, I have dug into the recovery code to see what looks like an incorrect assumption behind the two boolean flags named ArchiveRecoveryRequested and InArchiveRecovery that we have in xlogrecovery.c to control the behavior of archive recovery in the startup process. For information, as of HEAD, these two are described as follows: /* * When ArchiveRecoveryRequested is set, archive recovery was requested, * ie. signal files were present. When InArchiveRecovery is set, we are * currently recovering using offline XLOG archives. These variables are only * valid in the startup process. * * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're * currently performing crash recovery using only XLOG files in pg_wal, but * will switch to using offline XLOG archives as soon as we reach the end of * WAL in pg_wal. */ bool ArchiveRecoveryRequested = false; bool InArchiveRecovery = false; When you read this text alone, its assumptions are simple. When the startup process finds a recovery.signal or a standby.signal, we switch ArchiveRecoveryRequested to true. If there is a standby.signal, InArchiveRecovery would be immediately set to true before beginning the redo loop. If we begin redo with a recovery.signal, ArchiveRecoveryRequested = true and InArchiveRecovery = false, crash recovery happens first, consuming all the WAL in pg_wal/, then we'd move on with archive recovery. Now comes the problem of the other thread, which is what happens when you use a backup_label *without* a recovery.signal or a standby.signal. In this case, as currently coded, it is possible to enforce ArchiveRecoveryRequested = false and later InArchiveRecovery = true. Not setting ArchiveRecoveryRequested has a couple of undesired effect. First, this skips some initialization steps that may be needed at a later point in recovery. The thread quoted above has reported one aspect of that: we miss some hot-standby related intialization that can reflect if replaying enough WAL that a restart point could happen. Depending on the amount of data copied into pg_wal/ before starting a node with only a backup_label it may also be possible that a consistent point has been reached, where restart points would be legit. A second Kiss Cool effect (understands who can), is that we miss the initialization of the recoveryWakeupLatch. A third effect is that some code paths can use GUC values related to recovery without ArchiveRecoveryRequested being around, one example seems to be hot_standby whose default is true. It is worth noting the end of FinishWalRecovery(), that includes this part: if (ArchiveRecoveryRequested) { /* * We are no longer in archive recovery state. * * We are now done reading the old WAL. Turn off archive fetching if * it was active. */ Assert(InArchiveRecovery); InArchiveRecovery = false; I have been pondering for a few weeks now about what kind of definition would suit to a cluster having a backup_label file without a signal file added, which is documented as required by the docs in the HA section as well as pg_rewind. It is true that there could be a point to allow such a configuration so as a node recovers without a TLI jump, but I cannot find appealing this case, as well, as a node could just happily overwrite WAL segments in the archives on an existing timeline, potentially corruption other nodes writing on the same TLI. There are a few other recovery scenarios where one copies directly WAL segments into pg_wal/ that can lead to a lot of weird inconsistencies as well, one being the report of the thread of pgsql-hackers. At the end, I'd like to think that we should just require a recovery.signal or a standby.signal if we have a backup_label file, and even enforce this rule at the end of recovery for some sanity checks. I don't think that we can just enforce ArchiveRecoveryRequested in this path, either, as a backup_label would be renamed to .old once the control file knows up to which LSN it needs to replay to reach consistency and if an end-of-backup record is required. That's not something that can be reasonably backpatched, as it could disturb some recovery workflows, even if these are kind of in a dangerous spot, IMO, so I would like to propose that only on HEAD for 16~ because the recovery code has never really considered this combination of ArchiveRecoveryRequested and InArchiveRecovery. While digging into that, I have found one TAP test of pg_basebackup that was doing recovery with just a backup_label file, with a restore_command already set. A second case was in pg_rewind, were we have a node without standby.signal, still it uses a primary_conninfo. Attached is a patch on the lines of what I am thinking about. This reworks a bit some of the actions at the beginning of the startup process: - Report the set of LOGs showing the state of the node after reading the backup_label. - Enforce a rule in ShutdownWalRecovery() and document the restriction. - Add a check with the signal files after finding a backup_label file. - The validation checks on the recovery parameters are applied (aka restore_command required with recovery.signal, or a primary_conninfo required on standby for streaming, etc.). My apologies for the long message, but this deserves some attention, IMHO. So, any thoughts? -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Fri, Mar 10, 2023 at 03:59:04PM +0900, Michael Paquier wrote: > My apologies for the long message, but this deserves some attention, > IMHO. Note: A CF entry has been added as of [1], and I have added an item in the list of live issues on the open item page for 16. [1]: https://commitfest.postgresql.org/43/4244/ [2]: https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Live_issues -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
David Zhang
Date:
I believe before users can make a backup using pg_basebackup and then start the backup as an independent Primary server for whatever reasons. Now, if this is still allowed, then users need to be aware that the backup_label must be manually deleted, otherwise, the backup won't be able to start as a Primary.
The current message below doesn't provide such a hint.
+ if (!ArchiveRecoveryRequested) + ereport(FATAL, + (errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"), + errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.", + DataDir, DataDir)));
On 2023-03-12 6:06 p.m., Michael Paquier wrote:
On Fri, Mar 10, 2023 at 03:59:04PM +0900, Michael Paquier wrote:My apologies for the long message, but this deserves some attention, IMHO.Note: A CF entry has been added as of [1], and I have added an item in the list of live issues on the open item page for 16. [1]: https://commitfest.postgresql.org/43/4244/ [2]: https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Live_issues -- Michael
Best regards,
David
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Fri, Jul 14, 2023 at 01:32:49PM -0700, David Zhang wrote: > I believe before users can make a backup using pg_basebackup and then start > the backup as an independent Primary server for whatever reasons. Now, if > this is still allowed, then users need to be aware that the backup_label > must be manually deleted, otherwise, the backup won't be able to start as a > Primary. Delete a backup_label from a fresh base backup can easily lead to data corruption, as the startup process would pick up as LSN to start recovery from the control file rather than the backup_label file. This would happen if a checkpoint updates the redo LSN in the control file while a backup happens and the control file is copied after the checkpoint, for instance. If one wishes to deploy a new primary from a base backup, recovery.signal is the way to go, making sure that the new primary is bumped into a new timeline once recovery finishes, on top of making sure that the startup process starts recovery from a position where the cluster would be able to achieve a consistent state. > The current message below doesn't provide such a hint. > > + if (!ArchiveRecoveryRequested) > + ereport(FATAL, > + (errmsg("could not find > recovery.signal or standby.signal when recovering with > backup_label"), > + errhint("If you are restoring > from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" > and add required recovery options.", > + DataDir, > DataDir))); How would you rewrite that? I am not sure how many details we want to put here in terms of differences between recovery.signal and standby.signal, still we surely should mention these are the two possible choices. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
David Zhang
Date:
On 2023-07-16 6:27 p.m., Michael Paquier wrote: > > Delete a backup_label from a fresh base backup can easily lead to data > corruption, as the startup process would pick up as LSN to start > recovery from the control file rather than the backup_label file. > This would happen if a checkpoint updates the redo LSN in the control > file while a backup happens and the control file is copied after the > checkpoint, for instance. If one wishes to deploy a new primary from > a base backup, recovery.signal is the way to go, making sure that the > new primary is bumped into a new timeline once recovery finishes, on > top of making sure that the startup process starts recovery from a > position where the cluster would be able to achieve a consistent > state. Thanks a lot for sharing this information. > > How would you rewrite that? I am not sure how many details we want to > put here in terms of differences between recovery.signal and > standby.signal, still we surely should mention these are the two > possible choices. Honestly, I can't convince myself to mention the backup_label here too. But, I can share some information regarding my testing of the patch and the corresponding results. To assess the impact of the patch, I executed the following commands for before and after, pg_basebackup -h localhost -p 5432 -U david -D pg_backup1 pg_ctl -D pg_backup1 -l /tmp/logfile start Before the patch, there were no issues encountered when starting an independent Primary server. However, after applying the patch, I observed the following behavior when starting from the base backup: 1) simply start server from a base backup FATAL: could not find recovery.signal or standby.signal when recovering with backup_label HINT: If you are restoring from a backup, touch "/media/david/disk1/pg_backup1/recovery.signal" or "/media/david/disk1/pg_backup1/standby.signal" and add required recovery options. 2) touch a recovery.signal file and then try to start the server, the following error was encountered: FATAL: must specify restore_command when standby mode is not enabled 3) touch a standby.signal file, then the server successfully started, however, it operates in standby mode, whereas the intended behavior was for it to function as a primary server. Best regards, David
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote: > 1) simply start server from a base backup > > FATAL: could not find recovery.signal or standby.signal when recovering > with backup_label > > HINT: If you are restoring from a backup, touch > "/media/david/disk1/pg_backup1/recovery.signal" or > "/media/david/disk1/pg_backup1/standby.signal" and add required recovery > options. Note the difference when --write-recovery-conf is specified, where a standby.conf is created with a primary_conninfo in postgresql.auto.conf. So, yes, that's expected by default with the patch. > 2) touch a recovery.signal file and then try to start the server, the > following error was encountered: > > FATAL: must specify restore_command when standby mode is not enabled Yes, that's also something expected in the scope of the v1 posted. The idea behind this restriction is that specifying recovery.signal is equivalent to asking for archive recovery, but not specifying restore_command is equivalent to not provide any options to be able to recover. See validateRecoveryParameters() and note that this restriction exists since the beginning of times, introduced in commit 66ec2db. I tend to agree that there is something to be said about self-contained backups taken from pg_basebackup, though, as these would fail if no restore_command is specified, and this restriction is in place before Postgres has introduced replication and easier ways to have base backups. As a whole, I think that there is a good argument in favor of removing this restriction for the case where archive recovery is requested if users have all their WAL in pg_wal/ to be able to recover up to a consistent point, keeping these GUC restrictions if requesting a standby (not recovery.signal, only standby.signal). > 3) touch a standby.signal file, then the server successfully started, > however, it operates in standby mode, whereas the intended behavior was for > it to function as a primary server. standby.signal implies that the server will start in standby mode. If one wants to deploy a new primary, that would imply a timeline jump at the end of recovery, you would need to specify recovery.signal instead. We need more discussions and more opinions, but the discussion has stalled for a few months now. In case, I am adding Thomas Munro in CC who has mentioned to me at PGcon that he was interested in this patch (this thread's problem is not directly related to the fact that the checkpointer now runs in crash recovery, though). For now, I am attaching a rebased v2. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Bowen Shi
Date:
Thanks for the patch. I rerun the test in https://www.postgresql.org/message-id/flat/ZQtzcH2lvo8leXEr%40paquier.xyz#cc5ed83e0edc0b9a1c1305f08ff7a335 . We can discuss all the problems in this thread. First I encountered the problem " FATAL: could not find recovery.signal or standby.signal when recovering with backup_label ", then I deleted the backup_label file and started the instance successfully. > Delete a backup_label from a fresh base backup can easily lead to data > corruption, as the startup process would pick up as LSN to start > recovery from the control file rather than the backup_label file. > This would happen if a checkpoint updates the redo LSN in the control > file while a backup happens and the control file is copied after the > checkpoint, for instance. If one wishes to deploy a new primary from > a base backup, recovery.signal is the way to go, making sure that the > new primary is bumped into a new timeline once recovery finishes, on > top of making sure that the startup process starts recovery from a > position where the cluster would be able to achieve a consistent > state. ereport(FATAL, (errmsg("could not find redo location referenced by checkpoint record"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.", DataDir, DataDir, DataDir))); There are two similar error messages in xlogrecovery.c. Maybe we can modify the error messages to be similar. -- Bowen Shi On Thu, 21 Sept 2023 at 11:01, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote: > > 1) simply start server from a base backup > > > > FATAL: could not find recovery.signal or standby.signal when recovering > > with backup_label > > > > HINT: If you are restoring from a backup, touch > > "/media/david/disk1/pg_backup1/recovery.signal" or > > "/media/david/disk1/pg_backup1/standby.signal" and add required recovery > > options. > > Note the difference when --write-recovery-conf is specified, where a > standby.conf is created with a primary_conninfo in > postgresql.auto.conf. So, yes, that's expected by default with the > patch. > > > 2) touch a recovery.signal file and then try to start the server, the > > following error was encountered: > > > > FATAL: must specify restore_command when standby mode is not enabled > > Yes, that's also something expected in the scope of the v1 posted. > The idea behind this restriction is that specifying recovery.signal is > equivalent to asking for archive recovery, but not specifying > restore_command is equivalent to not provide any options to be able to > recover. See validateRecoveryParameters() and note that this > restriction exists since the beginning of times, introduced in commit > 66ec2db. I tend to agree that there is something to be said about > self-contained backups taken from pg_basebackup, though, as these > would fail if no restore_command is specified, and this restriction is > in place before Postgres has introduced replication and easier ways to > have base backups. As a whole, I think that there is a good argument > in favor of removing this restriction for the case where archive > recovery is requested if users have all their WAL in pg_wal/ to be > able to recover up to a consistent point, keeping these GUC > restrictions if requesting a standby (not recovery.signal, only > standby.signal). > > > 3) touch a standby.signal file, then the server successfully started, > > however, it operates in standby mode, whereas the intended behavior was for > > it to function as a primary server. > > standby.signal implies that the server will start in standby mode. If > one wants to deploy a new primary, that would imply a timeline jump at > the end of recovery, you would need to specify recovery.signal > instead. > > We need more discussions and more opinions, but the discussion has > stalled for a few months now. In case, I am adding Thomas Munro in CC > who has mentioned to me at PGcon that he was interested in this patch > (this thread's problem is not directly related to the fact that the > checkpointer now runs in crash recovery, though). > > For now, I am attaching a rebased v2. > -- > Michael
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Thu, Sep 21, 2023 at 11:45:06AM +0800, Bowen Shi wrote: > First I encountered the problem " FATAL: could not find > recovery.signal or standby.signal when recovering with backup_label ", > then I deleted the backup_label file and started the instance > successfully. Doing that is equal to corrupting your instance as recovery would begin from the latest redo LSN stored in the control file, but the physical relation files included in the backup may include blocks that require records that are needed before this redo LSN and the LSN stored in the backup_label. >> Delete a backup_label from a fresh base backup can easily lead to data >> corruption, as the startup process would pick up as LSN to start >> recovery from the control file rather than the backup_label file. >> This would happen if a checkpoint updates the redo LSN in the control >> file while a backup happens and the control file is copied after the >> checkpoint, for instance. If one wishes to deploy a new primary from >> a base backup, recovery.signal is the way to go, making sure that the >> new primary is bumped into a new timeline once recovery finishes, on >> top of making sure that the startup process starts recovery from a >> position where the cluster would be able to achieve a consistent >> state. And that's what I mean here. In more details. So, really, don't do that. > ereport(FATAL, > (errmsg("could not find redo location referenced by checkpoint record"), > errhint("If you are restoring from a backup, touch > \"%s/recovery.signal\" and add required recovery options.\n" > "If you are not restoring from a backup, try removing the file > \"%s/backup_label\".\n" > "Be careful: removing \"%s/backup_label\" will result in a corrupt > cluster if restoring from a backup.", > DataDir, DataDir, DataDir))); > > There are two similar error messages in xlogrecovery.c. Maybe we can > modify the error messages to be similar. The patch adds the following message, which is written this way to be consistent with the two others, already: + ereport(FATAL, + (errmsg("could not find recovery.signal or standby.signal when recovering with backup_label"), + errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and addrequired recovery options.", + DataDir, DataDir))); But you have an interesting point here, why isn't standby.signal also mentioned in the two existing comments? Depending on what's wanted by the user this can be equally useful to report back. Attached is a slightly updated patch, where I have also removed the check on ArchiveRecoveryRequested because the FATAL generated for !ArchiveRecoveryRequested makes sure that it is useless after reading the backup_label file. This patch has been around for a few months now. Do others have opinions about the direction taken here to make the presence of recovery.signal or standby.signal a hard requirement when a backup_label file is found (HEAD only)? -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Kyotaro Horiguchi
Date:
At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in > My apologies for the long message, but this deserves some attention, > IMHO. > > So, any thoughts? Sorry for being late. However, I agree with David's concern regarding the unnecessary inconvenience it introduces. I'd like to maintain the functionality. While I agree that InArchiveRecovery should be activated only if ArchiveReArchiveRecoveryRequested is true, I oppose to the notion that the mere presence of backup_label should be interpreted as a request for archive recovery (even if it is implied in a comment in InitWalRecovery()). Instead, I propose that we separate backup_label and archive recovery, in other words, we should not turn on InArchiveRecovery if !ArchiveRecoveryRequested, regardless of the presence of backup_label. We can know the minimum required recovery LSN by the XLOG_BACKUP_END record. The attached is a quick mock-up, but providing an approximation of my thoughts. (For example, end_of_backup_reached could potentially be extended to the ArchiveRecoveryRequested case and we could simplify the condition..) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Kyotaro Horiguchi
Date:
Sorry, it seems that I posted at the wrong position.. At Thu, 28 Sep 2023 12:58:51 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in > > My apologies for the long message, but this deserves some attention, > > IMHO. > > > > So, any thoughts? > > Sorry for being late. However, I agree with David's concern regarding > the unnecessary inconvenience it introduces. I'd like to maintain the > functionality. > > While I agree that InArchiveRecovery should be activated only if > ArchiveReArchiveRecoveryRequested is true, I oppose to the notion that > the mere presence of backup_label should be interpreted as a request > for archive recovery (even if it is implied in a comment in > InitWalRecovery()). Instead, I propose that we separate backup_label > and archive recovery, in other words, we should not turn on > InArchiveRecovery if !ArchiveRecoveryRequested, regardless of the > presence of backup_label. We can know the minimum required recovery > LSN by the XLOG_BACKUP_END record. > > The attached is a quick mock-up, but providing an approximation of my > thoughts. (For example, end_of_backup_reached could potentially be > extended to the ArchiveRecoveryRequested case and we could simplify > the condition..) regards -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Thu, Sep 28, 2023 at 12:58:51PM +0900, Kyotaro Horiguchi wrote: > The attached is a quick mock-up, but providing an approximation of my > thoughts. (For example, end_of_backup_reached could potentially be > extended to the ArchiveRecoveryRequested case and we could simplify > the condition..) I am not sure why this is related to this thread.. static XLogRecPtr backupStartPoint; static XLogRecPtr backupEndPoint; static bool backupEndRequired = false; +static bool backupEndReached = false; Anyway, sneaking at your suggestion, this is actually outlining the main issue I have with this code currently. We have so many static booleans to control one behavior over the other that we always try to make this code more complicated, while we should try to make it simpler instead. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
David Steele
Date:
On 9/27/23 23:58, Kyotaro Horiguchi wrote: > At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier <michael@paquier.xyz> wrote in >> My apologies for the long message, but this deserves some attention, >> IMHO. >> >> So, any thoughts? > > Sorry for being late. However, I agree with David's concern regarding > the unnecessary inconvenience it introduces. I'd like to maintain the > functionality. After some playing around, I find I agree with Michael on this, i.e. require at least standby.signal when a backup_label is present. According to my testing, you can preserve the "independent server" functionality by setting archive_command = /bin/false. In this case the timeline is not advanced and recovery proceeds from whatever is available in pg_wal. I think this type of recovery from a backup label without a timeline change should absolutely be the exception, not the default as it seems to be now. If the server is truly independent, then the timeline change is not important. If the server is not independent, then the timeline change is critical. So overall, +1 for Michael's patch, though I have only read through it and not tested it yet. One comment, though, if we are going to require recovery.signal when backup_label is present, should it just be implied? Why error and force the user to create it? Regards, -David
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote: > After some playing around, I find I agree with Michael on this, i.e. require > at least standby.signal when a backup_label is present. > > According to my testing, you can preserve the "independent server" > functionality by setting archive_command = /bin/false. In this case the > timeline is not advanced and recovery proceeds from whatever is available in > pg_wal. I've seen folks depend on such setups in the past, actually, letting a process outside Postgres just "push" WAL segments to pg_wal instead of Postgres pulling it with a restore_command or a primary_conninfo for a standby. > I think this type of recovery from a backup label without a timeline change > should absolutely be the exception, not the default as it seems to be now. This can mess up archives pretty easily, additionally, so it's not something to encourage.. > If the server is truly independent, then the timeline change is not > important. If the server is not independent, then the timeline change is > critical. > > So overall, +1 for Michael's patch, though I have only read through it and > not tested it yet. Reviews, thoughts and opinions are welcome. > One comment, though, if we are going to require recovery.signal when > backup_label is present, should it just be implied? Why error and force the > user to create it? That's one thing I was considering, but I also cannot convince myself that this is the best option because the presence of recovery.signal or standby.standby (if both, standby.signal takes priority) makes it clear what type of recovery is wanted at disk level. I'd be OK if folks think that this is a sensible consensus, as well, even if I don't really agree with it. Another idea I had was to force the creation of recovery.signal by pg_basebackup even if -R is not used. All the reports we've seen with people getting confused came from pg_basebackup that enforces no configuration. A last thing, that had better be covered in a separate thread and patch, is about validateRecoveryParameters(). These days, I'd like to think that it may be OK to lift at least the restriction on restore_command being required if we are doing recovery to ease the case of self-contained backups (aka the case where all the WAL needed to reach a consistent point is in pg_wal/ or its tarball) -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
David Steele
Date:
On 9/28/23 19:59, Michael Paquier wrote: > On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote: >> >> So overall, +1 for Michael's patch, though I have only read through it and >> not tested it yet. > > Reviews, thoughts and opinions are welcome. OK, I have now reviewed and tested the patch and it looks good to me. I stopped short of marking this RfC since there are other reviewers in the mix. I dislike that we need to repeat: OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch); But I see the logic behind why you did it and there's no better way to do it as far as I can see. >> One comment, though, if we are going to require recovery.signal when >> backup_label is present, should it just be implied? Why error and force the >> user to create it? > > That's one thing I was considering, but I also cannot convince myself > that this is the best option because the presence of recovery.signal > or standby.standby (if both, standby.signal takes priority) makes it > clear what type of recovery is wanted at disk level. I'd be OK if > folks think that this is a sensible consensus, as well, even if I > don't really agree with it. I'm OK with keeping it as required for now. > Another idea I had was to force the creation of recovery.signal by > pg_basebackup even if -R is not used. All the reports we've seen with > people getting confused came from pg_basebackup that enforces no > configuration. This change makes it more obvious if configuration is missing (since you'll get an error), however +1 for adding this to pg_basebackup. > A last thing, that had better be covered in a separate thread and > patch, is about validateRecoveryParameters(). These days, I'd like to > think that it may be OK to lift at least the restriction on > restore_command being required if we are doing recovery to ease the > case of self-contained backups (aka the case where all the WAL needed > to reach a consistent point is in pg_wal/ or its tarball) Hmmm, I'm not sure about this. I'd prefer users set restore_command=/bin/false explicitly to fetch WAL from pg_wal by default if that's what they really intend. Regards, -David
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote: > On 9/28/23 19:59, Michael Paquier wrote: > OK, I have now reviewed and tested the patch and it looks good to me. I > stopped short of marking this RfC since there are other reviewers in the > mix. Thanks for the review. Yes, I am wondering if other people would chime in here. It doesn't feel like this has gathered enough opinions. Now this thread has been around for many months, and we've done quite a few changes in the backup APIs in the last few years with few users complaining back about them.. > I dislike that we need to repeat: > > OwnLatch(&XLogRecoveryCtl->recoveryWakeupLatch); > > But I see the logic behind why you did it and there's no better way to do it > as far as I can see. The main point is that there is no meaning in setting the latch until the backup_label file is read because if ArchiveRecoveryRequested is *not* set the startup process would outright fail as of the lack of [recovery|standby].signal. >> Another idea I had was to force the creation of recovery.signal by >> pg_basebackup even if -R is not used. All the reports we've seen with >> people getting confused came from pg_basebackup that enforces no >> configuration. > > This change makes it more obvious if configuration is missing (since you'll > get an error), however +1 for adding this to pg_basebackup. Looking at the streaming APIs of pg_basebackup, it looks like this would be a matter of using bbstreamer_inject_file() to inject an empty file into the stream. Still something seems to be off once compression methods are involved.. Hmm. I am not sure. Well, this could always be done as a patch independant of this one, under a separate discussion. There are extra arguments about whether it would be a good idea to add a recovery.signal even when taking a backup from a standby, and do that only in 17~. >> A last thing, that had better be covered in a separate thread and >> patch, is about validateRecoveryParameters(). These days, I'd like to >> think that it may be OK to lift at least the restriction on >> restore_command being required if we are doing recovery to ease the >> case of self-contained backups (aka the case where all the WAL needed >> to reach a consistent point is in pg_wal/ or its tarball) > > Hmmm, I'm not sure about this. I'd prefer users set > restore_command=/bin/false explicitly to fetch WAL from pg_wal by default if > that's what they really intend. It wouldn't be the first time we break compatibility in this area, so perhaps you are right and keeping this requirement is fine, even if it requires one extra step when recovering a self-contained backup generated by pg_basebackup. At least this forces users to look at their setup and check if something is wrong. We'd likely finish with a few "bug" reports, as well :D -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Bowen Shi
Date:
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed It looks good to me. I have reviewed the code and tested the patch with basic check-world test an pgbench test (metioned in https://www.postgresql.org/message-id/flat/ZQtzcH2lvo8leXEr%40paquier.xyz#cc5ed83e0edc0b9a1c1305f08ff7a335). Another reviewer has also approved it, so I change the status to RFC. The new status of this patch is: Ready for Committer
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Laurenz Albe
Date:
On Mon, 2023-10-16 at 14:54 +0900, Michael Paquier wrote: > Thanks for the review. Yes, I am wondering if other people would > chime in here. It doesn't feel like this has gathered enough > opinions. I don't have strong feelings either way. If you have backup_label but no signal file, starting PostgreSQL may succeed (if the WAL with the checkpoint happens to be in pg_wal) or it may fail with an error message. There is no danger of causing damage unless you remove backup_label, right? I cannot think of a use case where you use such a configuration on purpose, and the current error message is more crypric than a plain "you must have a signal file to start from a backup", so perhaps your patch is a good idea. Yours, Laurenz Albe
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Oct 16, 2023 at 05:48:43PM +0200, Laurenz Albe wrote: > I don't have strong feelings either way. If you have backup_label > but no signal file, starting PostgreSQL may succeed (if the WAL > with the checkpoint happens to be in pg_wal) or it may fail with > an error message. There is no danger of causing damage unless you > remove backup_label, right? A bit more happens currently if you have a backup_label with no signal files, unfortunately, because this causes some startup states to not be initialized. See around here: https://www.postgresql.org/message-id/Y/Q/17rpYS7YGbIt@paquier.xyz https://www.postgresql.org/message-id/Y/v0c+3W89NBT/if@paquier.xyz > I cannot think of a use case where you use such a configuration on > purpose, and the current error message is more crypric than a plain > "you must have a signal file to start from a backup", so perhaps > your patch is a good idea. I hope so. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Oct 16, 2023 at 02:54:35PM +0900, Michael Paquier wrote: > On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote: >> On 9/28/23 19:59, Michael Paquier wrote: >>> Another idea I had was to force the creation of recovery.signal by >>> pg_basebackup even if -R is not used. All the reports we've seen with >>> people getting confused came from pg_basebackup that enforces no >>> configuration. >> >> This change makes it more obvious if configuration is missing (since you'll >> get an error), however +1 for adding this to pg_basebackup. > > Looking at the streaming APIs of pg_basebackup, it looks like this > would be a matter of using bbstreamer_inject_file() to inject an empty > file into the stream. Still something seems to be off once > compression methods are involved.. Hmm. I am not sure. Well, this > could always be done as a patch independant of this one, under a > separate discussion. There are extra arguments about whether it would > be a good idea to add a recovery.signal even when taking a backup from > a standby, and do that only in 17~. Hmm. On this specific point, it would actually be much simpler to force recovery.signal to be in the contents streamed to a BASE_BACKUP. This does not step on your proposal at [1], though, because you'd still require a .signal file for recovery as far as I understand :/ [1]: https://www.postgresql.org/message-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net Would folks be OK to move on with the patch of this thread at the end? I am attempting a last-call kind of thing. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
David Steele
Date:
On 10/27/23 03:22, Michael Paquier wrote: > On Mon, Oct 16, 2023 at 02:54:35PM +0900, Michael Paquier wrote: >> On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote: >>> On 9/28/23 19:59, Michael Paquier wrote: >>>> Another idea I had was to force the creation of recovery.signal by >>>> pg_basebackup even if -R is not used. All the reports we've seen with >>>> people getting confused came from pg_basebackup that enforces no >>>> configuration. >>> >>> This change makes it more obvious if configuration is missing (since you'll >>> get an error), however +1 for adding this to pg_basebackup. >> >> Looking at the streaming APIs of pg_basebackup, it looks like this >> would be a matter of using bbstreamer_inject_file() to inject an empty >> file into the stream. Still something seems to be off once >> compression methods are involved.. Hmm. I am not sure. Well, this >> could always be done as a patch independant of this one, under a >> separate discussion. There are extra arguments about whether it would >> be a good idea to add a recovery.signal even when taking a backup from >> a standby, and do that only in 17~. > > Hmm. On this specific point, it would actually be much simpler to > force recovery.signal to be in the contents streamed to a BASE_BACKUP. That sounds like the right plan to me. Nice and simple. > This does not step on your proposal at [1], though, because you'd > still require a .signal file for recovery as far as I understand :/ > > [1]: https://www.postgresql.org/message-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net Yes. > Would folks be OK to move on with the patch of this thread at the end? > I am attempting a last-call kind of thing. I'm still +1 for the patch as it stands. Regards, -David
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Fri, Oct 27, 2023 at 09:31:10AM -0400, David Steele wrote: > That sounds like the right plan to me. Nice and simple. I'll tackle that in a separate thread with a patch registered for the upcoming CF of November. > I'm still +1 for the patch as it stands. I have been reviewing the patch, and applied portions of it as of dc5bd388 and 1ffdc03c and they're quite independent pieces. After that, the remaining bits of the patch to change the behavior is now straight-forward. I have written a commit message for it, while on it, as per the attached. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Roberto Mello
Date:
On Mon, Oct 30, 2023 at 1:09 AM Michael Paquier <michael@paquier.xyz> wrote:
I have been reviewing the patch, and applied portions of it as of
dc5bd388 and 1ffdc03c and they're quite independent pieces. After
that, the remaining bits of the patch to change the behavior is now
straight-forward. I have written a commit message for it, while on
it, as per the attached.
A suggestion for the hint message in an effort to improve readability:
"If you are restoring from a backup, ensure \"%s/recovery.signal\" or \"%s/standby.signal\" is present and add required recovery options."
I realize the original use of "touch" is a valid shortcut for what I suggest above, however that will be less clear for the not-so-un*x-inclined users of Postgres, while for some it'll be downright confusing, IMHO. It also provides the advantage of being crystal clear on what needs to be done to fix the problem.
Roberto
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Robert Haas
Date:
On Mon, Oct 30, 2023 at 3:09 AM Michael Paquier <michael@paquier.xyz> wrote: > I have been reviewing the patch, and applied portions of it as of > dc5bd388 and 1ffdc03c and they're quite independent pieces. After > that, the remaining bits of the patch to change the behavior is now > straight-forward. I have written a commit message for it, while on > it, as per the attached. I would encourage some caution here. In a vacuum, I'm in favor of this, and for the same reasons as you, namely, that the huge pile of Booleans that we use to control recovery is confusing, and it's difficult to make sure that all the code paths are adequately tested, and I think some of the things that actually work here are not documented. But in practice, I think there is a possibility of something like this backfiring very hard. Notice that the first two people who commented on the thread saw the error and immediately removed backup_label even though that's 100% wrong. It shows how utterly willing users are to remove backup_label for any reason or no reason at all. If we convert cases where things would have worked into cases where people nuke backup_label and then it appears to work, we're going to be worse off in the long run, no matter how crazy the idea of removing backup_label may seem to us. Also, Andres just recently mentioned to me that he uses this procedure of starting a server with a backup_label but no recovery.signal or standby.signal file regularly, and thinks other people do too. I was surprised, since I've never done that, except maybe when I was a noob and didn't have a clue. But Andres is far from a noob. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Andres Freund
Date:
Hi, On 2023-10-30 16:08:50 +0900, Michael Paquier wrote: > From 26a8432fe3ab8426e7797d85d19b0fe69d3384c9 Mon Sep 17 00:00:00 2001 > From: Michael Paquier <michael@paquier.xyz> > Date: Mon, 30 Oct 2023 16:02:52 +0900 > Subject: [PATCH v4] Require recovery.signal or standby.signal when reading a > backup_file > > Historically, the startup process uses two static variables to control > if archive recovery should happen, when either recovery.signal or > standby.signal are defined in the data folder at the beginning of > recovery: I think the problem with these variables is that they're a really messy state machine - something this patch doesn't meaningfully improve IMO. > This configuration was possible when recovering from a base backup taken > by pg_basebackup without -R. Note that the documentation requires at > least to set recovery.signal to restore from a backup, but the startup > process was not making this policy explicit. Maybe I just didn't check the right place, but from I saw, this, at most, is implied, rather than explicitly stated. > In most cases, one would have been able to complete recovery, but that's a > matter of luck, really, as it depends on the workload of the origin server. With -X ... we have all the necessary WAL locally, how does the workload on the primary matter? If you pass --no-slot, pg_basebackup might fail to fetch the necessary wal, but then you'd also have gotten an error. I agree with Robert that this would be a good error check on a green field, but that I am less convinced it's going to help more than hurt now. Right now running pg_basebackup with -X stream, without --write-recovery-conf, gives you a copy of a cluster that will come up correctly as a distinct instance. With this change applied, you need to know that the way to avoid the existing FATAL about restore_command at startup (when recovery.signal exists but restore_command isn't set)) is to is to set "restore_command = false", something we don't explain anywhere afaict. We should lessen the need to ever use restore_command, not increase it. It also seems risky to have people get used to restore_command = false, because that effectively disables detection of other timelines etc. But, this method does force a new timeline - which will be the same on each clone of the database... I also just don't think that it's always desirable to create a new timeline. Greetings, Andres Freund
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Oct 30, 2023 at 01:55:13PM -0400, Robert Haas wrote: > I would encourage some caution here. Thanks for chiming here. > In a vacuum, I'm in favor of this, and for the same reasons as you, > namely, that the huge pile of Booleans that we use to control recovery > is confusing, and it's difficult to make sure that all the code paths > are adequately tested, and I think some of the things that actually > work here are not documented. Yep, same feeling here. > But in practice, I think there is a possibility of something like this > backfiring very hard. Notice that the first two people who commented > on the thread saw the error and immediately removed backup_label even > though that's 100% wrong. It shows how utterly willing users are to > remove backup_label for any reason or no reason at all. If we convert > cases where things would have worked into cases where people nuke > backup_label and then it appears to work, we're going to be worse off > in the long run, no matter how crazy the idea of removing backup_label > may seem to us. As far as I know, there's one paragraph in the docs that implies this mode without giving an actual hint that this may be OK or not, so shrug: https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS "As with base backups, the easiest way to produce a standalone hot backup is to use the pg_basebackup tool. If you include the -X parameter when calling it, all the write-ahead log required to use the backup will be included in the backup automatically, and no special action is required to restore the backup." And a few lines down we imply to use restore_command, something that we check is set only if recovery.signal is set. See additionally validateRecoveryParameters(), where the comments imply that InArchiveRecovery would be set only when there's a restore command. As you're telling me, and I've considered that as an option as well, perhaps we should just consider the presence of a backup_label file with no .signal files as a synonym of crash recovery? In the recovery path, currently the essence of the problem is when we do InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning that it should do archive recovery but we don't want it, and that does not really make sense. The rest of the code sort of implies that this is not a suported combination. So basically, my suggestion here, is to just replay WAL up to the end of what's in your local pg_wal/ and hope for the best, without TLI jumps, except that we'd do nothing. Doing a pg_basebackup -X stream followed by a restart would work fine with that, because all the WAL is here. A point of contention is if we'd better be stricter about satisfying backupEndPoint in such a case, but the redo code only wants to do something here when ArchiveRecoveryRequested is set (aka there's a .signal file set), and we would not want a TLI jump at the end of recovery, so I don't see an argument with caring about backupEndPoint in this case. At the end, I'm OK as long as ArchiveRecoveryRequested=false InArchiveRecovery=true does not exist anymore, because it's much easier to get what's going on with the redo path, IMHO. (I have a patch at hand to show the idea, will post it with a reply to Andres' message.) > Also, Andres just recently mentioned to me that he uses this procedure > of starting a server with a backup_label but no recovery.signal or > standby.signal file regularly, and thinks other people do too. I was > surprised, since I've never done that, except maybe when I was a noob > and didn't have a clue. But Andres is far from a noob. At this stage, that's basically at your own risk, as the code thinks it's OK to force what's basically archive-recovery-without-being-it. So it basically works, but it can also easily backfire, as well.. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Oct 30, 2023 at 10:32:28AM -0600, Roberto Mello wrote: > I realize the original use of "touch" is a valid shortcut for what I > suggest above, however that will be less clear for the not-so-un*x-inclined > users of Postgres, while for some it'll be downright confusing, IMHO. It > also provides the advantage of being crystal clear on what needs to be done > to fix the problem. Indeed, "touch" may be better in this path if we'd throw an ERROR to enforce a given policy, and that's more consistent with the rest of the area. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Oct 30, 2023 at 12:47:41PM -0700, Andres Freund wrote: > I think the problem with these variables is that they're a really messy state > machine - something this patch doesn't meaningfully improve IMO. Okay. Yes, this is my root issue as well. We're at the stage where we should reduce the possible set of combinations and assumptions we're inventing because people can do undocumented stuff, then perhaps refactor the code on top of that (say, if one combination with too booleans is not possible, switch to a three-state enum rather than 2 bools, etc). >> This configuration was possible when recovering from a base backup taken >> by pg_basebackup without -R. Note that the documentation requires at >> least to set recovery.signal to restore from a backup, but the startup >> process was not making this policy explicit. > > Maybe I just didn't check the right place, but from I saw, this, at most, is > implied, rather than explicitly stated. See the doc reference here: https://www.postgresql.org/message-id/ZUBM6BNQnEh7lzIM@paquier.xyz So it kind of implies it, still also mentions restore_command. It's like Schrödinger's cat, yes and no at the same time. > With -X ... we have all the necessary WAL locally, how does the workload on > the primary matter? If you pass --no-slot, pg_basebackup might fail to fetch > the necessary wal, but then you'd also have gotten an error. > > [...] > > Right now running pg_basebackup with -X stream, without --write-recovery-conf, > gives you a copy of a cluster that will come up correctly as a distinct > instance. > > [...] > > I also just don't think that it's always desirable to create a new timeline. Yeah. Another argument I was mentioning to Robert is that we may want to just treat the case where you have a backup_label without any signal files just the same as crash recovery, replaying all the local pg_wal/, and nothing else. For example, something like the attached should make sure that InArchiveRecovery=true should never be set if ArchiveRecoveryRequested is not set. The attached would still cause redo to complain on a "WAL ends before end of online backup" if not all the WAL is here (reason behind the tweak of 010_pg_basebackup.pl, but the previous tweak to pg_rewind's 008_min_recovery_point.pl is not required here). Attached is the idea I had in mind, in terms of code, FWIW. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Robert Haas
Date:
On Mon, Oct 30, 2023 at 8:40 PM Michael Paquier <michael@paquier.xyz> wrote: > As far as I know, there's one paragraph in the docs that implies this > mode without giving an actual hint that this may be OK or not, so > shrug: > https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS > "As with base backups, the easiest way to produce a standalone hot > backup is to use the pg_basebackup tool. If you include the -X > parameter when calling it, all the write-ahead log required to use the > backup will be included in the backup automatically, and no special > action is required to restore the backup." I see your point, but that's way too subtle. As far as I know, the only actually-documented procedure for restoring is this one: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY That procedure actually is badly in need of some updating, IMHO, because close to half of it is about moving your existing database cluster out of the way, which may or may not be needed in the case of any particular backup restore. Also, it unconditionally mentions creating recovery.signal, with no mention of standby.signal. And certainly not with neither. It also gives zero motivation for actually doing this and says nothing useful about backup_label. Both recovery.signal and standby.signal are documented in https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY but you'd have no real reason to look in a list of GUCs for information about a file on disk. recovery.signal but not standby.signal is mentioned in https://www.postgresql.org/docs/current/warm-standby.html but nowhere that I can find do we explicitly talk about running with at least one of them. > As you're telling me, and I've considered that as an option as well, > perhaps we should just consider the presence of a backup_label file > with no .signal files as a synonym of crash recovery? In the recovery > path, currently the essence of the problem is when we do > InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning > that it should do archive recovery but we don't want it, and that does > not really make sense. The rest of the code sort of implies that this > is not a suported combination. So basically, my suggestion here, is > to just replay WAL up to the end of what's in your local pg_wal/ and > hope for the best, without TLI jumps, except that we'd do nothing. This sentence seems to be incomplete. But I was not saying we should treat the case where we have a backup_label file like crash recovery. The real question here is why we don't treat it fully like archive recovery. I don't know off-hand what is different if I start the server with both backup_label and recovery.signal vs. if I start it with only backup_label, but I question whether there should be any difference at all. > A point of contention is if we'd better be stricter about satisfying > backupEndPoint in such a case, but the redo code only wants to do > something here when ArchiveRecoveryRequested is set (aka there's a > .signal file set), and we would not want a TLI jump at the end of > recovery, so I don't see an argument with caring about backupEndPoint > in this case. This is a bit hard for me to understand, but I disagree strongly with the idea that we should ever ignore a backup end point if we have one. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Tue, Oct 31, 2023 at 08:28:07AM -0400, Robert Haas wrote: > On Mon, Oct 30, 2023 at 8:40 PM Michael Paquier <michael@paquier.xyz> wrote: >> As far as I know, there's one paragraph in the docs that implies this >> mode without giving an actual hint that this may be OK or not, so >> shrug: >> https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS >> "As with base backups, the easiest way to produce a standalone hot >> backup is to use the pg_basebackup tool. If you include the -X >> parameter when calling it, all the write-ahead log required to use the >> backup will be included in the backup automatically, and no special >> action is required to restore the backup." > > I see your point, but that's way too subtle. As far as I know, the > only actually-documented procedure for restoring is this one: > https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY > > That procedure actually is badly in need of some updating, IMHO, > because close to half of it is about moving your existing database > cluster out of the way, which may or may not be needed in the case of > any particular backup restore. Also, it unconditionally mentions > creating recovery.signal, with no mention of standby.signal. And > certainly not with neither. It also gives zero motivation for actually > doing this and says nothing useful about backup_label. > > Both recovery.signal and standby.signal are documented in > https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY > but you'd have no real reason to look in a list of GUCs for > information about a file on disk. recovery.signal but not > standby.signal is mentioned in > https://www.postgresql.org/docs/current/warm-standby.html but nowhere > that I can find do we explicitly talk about running with at least one > of them. Point 7. of what you quote says to use one? True that this needs a refresh, and perhaps a bit fat warning about the fact that these are required if you want to fetch WAL from other sources than the local pg_wal/. Perhaps there may be a point of revisiting the default behavior of recovery_target_timeline in this case, I don't know. >> As you're telling me, and I've considered that as an option as well, >> perhaps we should just consider the presence of a backup_label file >> with no .signal files as a synonym of crash recovery? In the recovery >> path, currently the essence of the problem is when we do >> InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning >> that it should do archive recovery but we don't want it, and that does >> not really make sense. The rest of the code sort of implies that this >> is not a suported combination. So basically, my suggestion here, is >> to just replay WAL up to the end of what's in your local pg_wal/ and >> hope for the best, without TLI jumps, except that we'd do nothing. > > This sentence seems to be incomplete. I've re-read it, and it looks OK to me. What I mean with this paragraph are two things: - Remove InArchiveRecovery=true and ArchiveRecoveryRequested=false as a possible combination in the code. - Treat backup_label with no .signal file as the same as crash recovery, that: -- Does no TLI jump at the end of recovery. -- Expects all the WAL to be in pg_wal/. > But I was not saying we should treat the case where we have a > backup_label file like crash recovery. The real question here is why > we don't treat it fully like archive recovery. Timeline jump at the end of recovery? Archive recovery forces a TLI jump by default at the end of redo if there's a signal file, and some users may not want a TLI jump by default? > I don't know off-hand > what is different if I start the server with both backup_label and > recovery.signal vs. if I start it with only backup_label, but I > question whether there should be any difference at all. Perhaps we could do that, but note that backup_label is renamed to backup_label.old at the beginning of redo. The code has historically always enforced InArchiveRecovery=true when there's a backup label, and InArchiveRecovery=false where there is no backup label, so we don't get the same recovery behavior if a cluster is restarted while it was still performing recovery. I don't quite see how it is possible to make this code simpler without enforcing a policy to take care of this inconsistency. I've listed two of them on this thread: - Force the presence of a .recovery file when there is a backup_label, to force archive recovery. - Force crash recovery if there are no signal files but a backup_label, then a restart of a cluster that began a restore while it processed a backup would be confused: should it do crash recovery or archive recovery? My guess, based on what I read from the feedback of this thread, is that it could be more helpful to do the second thing, not the third one, because this is better with standalone backups: no TLI jumps and restore happens with all the local WAL in pg_wal/, without any GUCs to control how recovery should run. You are suggesting a third, hybrid, approach. Now note we have always checked for signal files before the backup_label. Recovery GUCs are checked only if there's one of the two signal files. It seems to me that what you are suggesting would make the code a bit harder to follow, actually, and more inconsistent with stable branches because we would need to check the control file contents *before* checking for the .signal files or backup_label to be able to see if archive recovery *should* happen, depending on if there's a backupEndPoint. >> A point of contention is if we'd better be stricter about satisfying >> backupEndPoint in such a case, but the redo code only wants to do >> something here when ArchiveRecoveryRequested is set (aka there's a >> .signal file set), and we would not want a TLI jump at the end of >> recovery, so I don't see an argument with caring about backupEndPoint >> in this case. > > This is a bit hard for me to understand, but I disagree strongly with > the idea that we should ever ignore a backup end point if we have one. Actually, while experimenting yesterday before sending my reply to you, I have noticed that redo cares about backupEndPoint even if you force crash recovery when there's only a backup_label file. There's a case in pg_basebackup that would fail, but that's accidental, AFAIK: https://www.postgresql.org/message-id/ZUBVKfL6FR6NOQyt%40paquier.xyz See in StartupXLOG(), around the comment "complain if we did not roll forward far enough to reach". This complains if archive recovery has been requested *or* if we retrieved a backup end LSN from the backup_label. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Kyotaro Horiguchi
Date:
At Wed, 1 Nov 2023 08:39:17 +0900, Michael Paquier <michael@paquier.xyz> wrote in > See in StartupXLOG(), around the comment "complain if we did not roll > forward far enough to reach". This complains if archive recovery has > been requested *or* if we retrieved a backup end LSN from the > backup_label. Please note that backupStartPoint is not reset even when reaching the backup end point during crash recovery. If backup_label enforces archive recovery, I think this point won't be an issue as you mentioned. For the record, my earlier proposal aimed to detect reaching the end point even during crash recovery. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Thu, Nov 02, 2023 at 11:03:35AM +0900, Kyotaro Horiguchi wrote: > At Wed, 1 Nov 2023 08:39:17 +0900, Michael Paquier <michael@paquier.xyz> wrote in >> See in StartupXLOG(), around the comment "complain if we did not roll >> forward far enough to reach". This complains if archive recovery has >> been requested *or* if we retrieved a backup end LSN from the >> backup_label. > > Please note that backupStartPoint is not reset even when reaching the > backup end point during crash recovery. If backup_label enforces > archive recovery, I think this point won't be an issue as you > mentioned. For the record, my earlier proposal aimed to detect > reaching the end point even during crash recovery. Good point. Not doing ReachedEndOfBackup() at the end of crash recovery feels inconsistent, especially since we care about some of these fields in this case. If a .signal file is required when we read a backup_label, yes that would not be a problem because we'd always link backupEndPoint's destiny with a requested archive recovery, but there seem to be little love for enforcing that based on the feedback of this thread, so.. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Robert Haas
Date:
On Tue, Oct 31, 2023 at 7:39 PM Michael Paquier <michael@paquier.xyz> wrote: > Point 7. of what you quote says to use one? True that this needs a > refresh, and perhaps a bit fat warning about the fact that these are > required if you want to fetch WAL from other sources than the local > pg_wal/. Perhaps there may be a point of revisiting the default > behavior of recovery_target_timeline in this case, I don't know. I don't really know what to say to this -- sure, point 7 of "Recovering Using a Continuous Archive Backup" says to use recovery.signal. But as I said in the preceding paragraph, it doesn't say either "use recovery.signal or standby.signal". Nor does it or anything else in the documentation explain under what circumstances you're allowed to have neither. So the whole thing is very unclear. > >> As you're telling me, and I've considered that as an option as well, > >> perhaps we should just consider the presence of a backup_label file > >> with no .signal files as a synonym of crash recovery? In the recovery > >> path, currently the essence of the problem is when we do > >> InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning > >> that it should do archive recovery but we don't want it, and that does > >> not really make sense. The rest of the code sort of implies that this > >> is not a suported combination. So basically, my suggestion here, is > >> to just replay WAL up to the end of what's in your local pg_wal/ and > >> hope for the best, without TLI jumps, except that we'd do nothing. > > > > This sentence seems to be incomplete. > > I've re-read it, and it looks OK to me. Well, the sentence ends with "except that we'd do nothing" and I don't know what that means. It would make sense to me if it said "except that we'd do nothing about <whatever>" or "except that we'd do nothing instead of <something>" but as you've written it basically seems to boil down to "my suggestion is to replay WAL except do nothing" which makes no sense. If you replay WAL, you're not doing nothing. > > But I was not saying we should treat the case where we have a > > backup_label file like crash recovery. The real question here is why > > we don't treat it fully like archive recovery. > > Timeline jump at the end of recovery? Archive recovery forces a TLI > jump by default at the end of redo if there's a signal file, and some > users may not want a TLI jump by default? Uggh. I don't know what to think about that. I bet some people do want that, but that makes it pretty easy to end up with multiple copies of the same cluster running on the same TLI, too, which is not a thing that you really want to have happen. At the end of the day, I'm coming around to the view that the biggest problem here is the documentation. Nobody can really know what's supposed to work right now because the documentation doesn't say which things you are and are not allowed to do and what results you should expect in each case. If it did, it would be easier to discuss possible behavior changes. Right now, it's hard to change any code at all, because there's no list of supported scenarios, so you can't tell whether a potential change affects a scenario that somebody thinks should work, or only cases that nobody can possibly care about. It's sort of possible to reason your way through that, to an extent, but it's pretty hard. The fact that I didn't know that starting from a backup with neither recovery.signal nor standby.signal was a thing that anybody did or cared about is good evidence of that. I'm coming to the understanding that we have four supported scenarios. One, no backup_label, no recovery.signal, and no standby.signal. Hence, replay WAL until the end, then start up. Two, backup_label exists but neither recovery.signal nor standby.signal does. As before, but if I understand correctly, now we can check that we reached the backup end location. Three, recovery.signal exists, with or without backup_label. Now we create a new TLI at the end of recovery, and also, now can fetch WAL that is not present in pg_wal using primary_conninfo or restore_command. In fact, I think we may prefer to do that over using WAL we have locally, but I'm not quite sure about that. Fourth, standby.signal exists, with or without backup_label. As the previous scenario, but now when we reach the end of WAL we wait for more to appear instead of ending recovery. I have a feeling this is not quite an exhaustive list of differences between the various modes, and I'm not even sure that it lists all of the things someone might try to do. Thoughts? I also feel like the terminology here sometimes obscures more than it illuminates. For instance, it seems like ArchiveRecoveryRequested really means "are any signal files present?" while InArchiveRecovery means "are we fetching WAL from outside pg_wal rather than using what's in pg_wal?". But these are not obvious from the names, and sometimes we have additional variables with overlapping meanings, like readSource, which indicates whether we're reading from pg_wal, the archive, or the walreceiver, and yet is probably not redundant with InArchiveRecovery. In any event, I think that we need to start with the question of what behavior(s) we want to expose to users, and then back into the question of what internal variables and states need to exist in order to support that behavior. We cannot start by deciding what variables we'd like to get rid of and then trying to justify the resulting behavior changes on the grounds that they simplify the code. Users aren't going to like that, hackers aren't going to like that, and the resulting behavior probably won't be anything great. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Wed, Nov 08, 2023 at 01:16:58PM -0500, Robert Haas wrote: > On Tue, Oct 31, 2023 at 7:39 PM Michael Paquier <michael@paquier.xyz> wrote: >>>> As you're telling me, and I've considered that as an option as well, >>>> perhaps we should just consider the presence of a backup_label file >>>> with no .signal files as a synonym of crash recovery? In the recovery >>>> path, currently the essence of the problem is when we do >>>> InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning >>>> that it should do archive recovery but we don't want it, and that does >>>> not really make sense. The rest of the code sort of implies that this >>>> is not a suported combination. So basically, my suggestion here, is >>>> to just replay WAL up to the end of what's in your local pg_wal/ and >>>> hope for the best, without TLI jumps, except that we'd do nothing. >>> >>> This sentence seems to be incomplete. >> >> I've re-read it, and it looks OK to me. > > Well, the sentence ends with "except that we'd do nothing" and I don't > know what that means. It would make sense to me if it said "except > that we'd do nothing about <whatever>" or "except that we'd do nothing > instead of <something>" but as you've written it basically seems to > boil down to "my suggestion is to replay WAL except do nothing" which > makes no sense. If you replay WAL, you're not doing nothing. Sure, sorry for the confusion. By "we'd do nothing", I mean precirely "to take no specific action related to archive recovery and recovery parameters at the end of recovery", meaning that a combination of backup_label with no signal file would be the same as crash recovery, replaying WAL up to the end of what can be found in pg_wal/, and only that. >>> But I was not saying we should treat the case where we have a >>> backup_label file like crash recovery. The real question here is why >>> we don't treat it fully like archive recovery. >> >> Timeline jump at the end of recovery? Archive recovery forces a TLI >> jump by default at the end of redo if there's a signal file, and some >> users may not want a TLI jump by default? > > Uggh. I don't know what to think about that. I bet some people do want > that, but that makes it pretty easy to end up with multiple copies of > the same cluster running on the same TLI, too, which is not a thing > that you really want to have happen. Andres has mentioned upthread that this is something he's been using to quickly be able to clone a cluster. I would not recommend doing that, personally, but if that's useful in some cases, well, why not. > At the end of the day, I'm coming around to the view that the biggest > problem here is the documentation. Nobody can really know what's > supposed to work right now because the documentation doesn't say which > things you are and are not allowed to do and what results you should > expect in each case. If it did, it would be easier to discuss possible > behavior changes. Right now, it's hard to change any code at all, > because there's no list of supported scenarios, so you can't tell > whether a potential change affects a scenario that somebody thinks > should work, or only cases that nobody can possibly care about. It's > sort of possible to reason your way through that, to an extent, but > it's pretty hard. The fact that I didn't know that starting from a > backup with neither recovery.signal nor standby.signal was a thing > that anybody did or cared about is good evidence of that. That's one problem, not all of it, because the code takes extra assumptions around that. > I also feel like the terminology here sometimes obscures more than it > illuminates. For instance, it seems like ArchiveRecoveryRequested > really means "are any signal files present?" while InArchiveRecovery > means "are we fetching WAL from outside pg_wal rather than using > what's in pg_wal?". But these are not obvious from the names, and > sometimes we have additional variables with overlapping meanings, like > readSource, which indicates whether we're reading from pg_wal, the > archive, or the walreceiver, and yet is probably not redundant with > InArchiveRecovery. In any event, I think that we need to start with > the question of what behavior(s) we want to expose to users, and then > back into the question of what internal variables and states need to > exist in order to support that behavior. We cannot start by deciding > what variables we'd like to get rid of and then trying to justify the > resulting behavior changes on the grounds that they simplify the code. > Users aren't going to like that, hackers aren't going to like that, > and the resulting behavior probably won't be anything great. Note as well that InArchiveRecovery is set when there's a backup_label, but that the code would check for the existence of a restore_command only if a signal file exists. That's strange, but if people have been relying on this behavior, so be it. At this stage, it looks pretty clear to me that there's no consensus on what to do, and nobody's happy with the proposal of this thread, so I am going to mark it as rejected. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote: > Sure, sorry for the confusion. By "we'd do nothing", I mean precirely > "to take no specific action related to archive recovery and recovery > parameters at the end of recovery", meaning that a combination of > backup_label with no signal file would be the same as crash recovery, > replaying WAL up to the end of what can be found in pg_wal/, and only > that. By being slightly more precise. I also mean to fail recovery if it is not possible to replay up to the end-of-backup LSN marked in the label file because we are missing some stuff in pg_wal/, which is something that the code is currently able to handle. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Andres Freund
Date:
Hi, On 2023-11-09 12:16:52 +0900, Michael Paquier wrote: > On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote: > > Sure, sorry for the confusion. By "we'd do nothing", I mean precirely > > "to take no specific action related to archive recovery and recovery > > parameters at the end of recovery", meaning that a combination of > > backup_label with no signal file would be the same as crash recovery, > > replaying WAL up to the end of what can be found in pg_wal/, and only > > that. I don't think those are equivalent - in the "backup_label with no signal file" case we start recovery at a different location than the "crash recovery" case does. > By being slightly more precise. I also mean to fail recovery if it is > not possible to replay up to the end-of-backup LSN marked in the label > file because we are missing some stuff in pg_wal/, which is something > that the code is currently able to handle. "able to handle" as in detect and error out? Because that's the only possible sane thing to do, correct? Greetings, Andres Freund
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Michael Paquier
Date:
On Mon, Nov 13, 2023 at 03:41:44PM -0800, Andres Freund wrote: > On 2023-11-09 12:16:52 +0900, Michael Paquier wrote: >> On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote: >> > Sure, sorry for the confusion. By "we'd do nothing", I mean precirely >> > "to take no specific action related to archive recovery and recovery >> > parameters at the end of recovery", meaning that a combination of >> > backup_label with no signal file would be the same as crash recovery, >> > replaying WAL up to the end of what can be found in pg_wal/, and only >> > that. > > I don't think those are equivalent - in the "backup_label with no signal file" > case we start recovery at a different location than the "crash recovery" case > does. It depends on how you see things, and based on my read of the thread or the code we've never really put a clear definition what a "backup_label with no signal file" should do. The definition I was suggesting is to make it work the same way as crash recovery internally: - use the start LSN from the backup_label. - replay up to the end of local WAL. - don't rely on any recovery GUCs. - if at the end of recovery replay has not reached the end-of-backup record, then fail. >> By being slightly more precise. I also mean to fail recovery if it is >> not possible to replay up to the end-of-backup LSN marked in the label >> file because we are missing some stuff in pg_wal/, which is something >> that the code is currently able to handle. > > "able to handle" as in detect and error out? Because that's the only possible > sane thing to do, correct? By "able to handle", I mean to detect that the expected LSN has not been reached and FATAL, or fail recovery. So yes. -- Michael
Attachment
Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
From
Andres Freund
Date:
Hi, On 2023-11-14 09:13:44 +0900, Michael Paquier wrote: > On Mon, Nov 13, 2023 at 03:41:44PM -0800, Andres Freund wrote: > > On 2023-11-09 12:16:52 +0900, Michael Paquier wrote: > >> On Thu, Nov 09, 2023 at 12:04:19PM +0900, Michael Paquier wrote: > >> > Sure, sorry for the confusion. By "we'd do nothing", I mean precirely > >> > "to take no specific action related to archive recovery and recovery > >> > parameters at the end of recovery", meaning that a combination of > >> > backup_label with no signal file would be the same as crash recovery, > >> > replaying WAL up to the end of what can be found in pg_wal/, and only > >> > that. > > > > I don't think those are equivalent - in the "backup_label with no signal file" > > case we start recovery at a different location than the "crash recovery" case > > does. > > It depends on how you see things, and based on my read of the thread > or the code we've never really put a clear definition what a > "backup_label with no signal file" should do. The definition I was > suggesting is to make it work the same way as crash recovery > internally: > - use the start LSN from the backup_label. That's fundamentally different from crash recovery! > - replay up to the end of local WAL. > - don't rely on any recovery GUCs. > - if at the end of recovery replay has not reached the end-of-backup > record, then fail. Also different from crash recovery. It doesn't make sense to me to say "work the same way" when there are such fundamental differences. Greetings, Andres Freund