Thread: Re: pgsql: Integrate recovery.conf into postgresql.conf
On Sun, Nov 25, 2018 at 03:49:23PM +0000, Peter Eisentraut wrote: > Integrate recovery.conf into postgresql.conf > > recovery.conf settings are now set in postgresql.conf (or other GUC > sources). Currently, all the affected settings are PGC_POSTMASTER; > this could be refined in the future case by case. The buildfarm is unhappy after this one: https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=culicidae&br=HEAD If I use -DEXEC_BACKEND when compiling the tests complain about a timeout in 003_recovery_targets. Here is what the buildfarm reports, I can see the failure by myself as well: # Postmaster PID for node "standby_6" is 26668 # poll_query_until timed out executing this query: # SELECT '0/3022690'::pg_lsn <= pg_last_wal_replay_lsn() # expecting this output: # t # last actual query output: # f # with stderr: Timed out while waiting for standby to catch up at t/003_recovery_targets.pl line 34. Peter, could you look at that? -- Michael
Attachment
Hi > The buildfarm is unhappy after this one: > https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=culicidae&br=HEAD > > If I use -DEXEC_BACKEND when compiling the tests complain about a > timeout in 003_recovery_targets. Here is what the buildfarm reports, I > can see the failure by myself as well: > # Postmaster PID for node "standby_6" is 26668 > # poll_query_until timed out executing this query: > # SELECT '0/3022690'::pg_lsn <= pg_last_wal_replay_lsn() > # expecting this output: > # t > # last actual query output: > # f > # with stderr: > Timed out while waiting for standby to catch up at > t/003_recovery_targets.pl line 34. I can reproduce this and notice wrong assign settings order. For example standby_6 has > recovery_target_name = '$recovery_name' > recovery_target_xid = '$recovery_txid' > recovery_target_time = '$recovery_time' But recoveryTarget was set to RECOVERY_TARGET_XID Without -DEXEC_BACKEND all fine. As far as I understand the guc.c code with EXEC_BACKEND all processes uses different config processing logic. We serializeall GUC and restore at process start. And we sort GUC by name in build_guc_variables - so we restore settings inwrong order. I was afraid that the GUC system does not guarantee the order of settings... What is preferable solution? Seems we can not simple change this logic. I think i can reorganize all new recovery_target_* GUC into single one with format, for example, recovery_target = "xid:607"(was mentioned in patch discussion). regards, Sergei
On 26/11/2018 13:32, Sergei Kornilov wrote: >> Timed out while waiting for standby to catch up at >> t/003_recovery_targets.pl line 34. > > I can reproduce this and notice wrong assign settings order. For example standby_6 has >> recovery_target_name = '$recovery_name' >> recovery_target_xid = '$recovery_txid' >> recovery_target_time = '$recovery_time' > But recoveryTarget was set to RECOVERY_TARGET_XID > Without -DEXEC_BACKEND all fine. > > As far as I understand the guc.c code with EXEC_BACKEND all processes uses different config processing logic. We serializeall GUC and restore at process start. And we sort GUC by name in build_guc_variables - so we restore settings inwrong order. I was afraid that the GUC system does not guarantee the order of settings... > > What is preferable solution? Seems we can not simple change this logic. What is the reason for allowing multiple recovery_target_* settings and taking the last one? Could we change this aspect to make this behave better? > I think i can reorganize all new recovery_target_* GUC into single one with format, for example, recovery_target = "xid:607"(was mentioned in patch discussion). That would be another option. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Nov 26, 2018 at 02:06:36PM +0100, Peter Eisentraut wrote: > What is the reason for allowing multiple recovery_target_* settings and > taking the last one? This came originally to check that recovery.conf handles correctly its recovery targets when multiple ones are defined with the last one winning, as users may want to append additional targets in an existing recovery.conf. I have not checked the code in details, but I think that we should be careful to keep that property, even with EXEC_BACKEND. -- Michael
Attachment
Hi > What is the reason for allowing multiple recovery_target_* settings and > taking the last one? Could we change this aspect to make this behave > better? Correct response to user configuration, similar to old recovery.conf logic or other GUC - we allow redefine with rule "latestwin". I think we can disallow using multiple recovery_target_* settings. But currently i have no good idea how this check can bedone in check_* callbacks. regards, Sergei
Greetings, * Michael Paquier (michael@paquier.xyz) wrote: > On Mon, Nov 26, 2018 at 02:06:36PM +0100, Peter Eisentraut wrote: > > What is the reason for allowing multiple recovery_target_* settings and > > taking the last one? > > This came originally to check that recovery.conf handles correctly its > recovery targets when multiple ones are defined with the last one > winning, as users may want to append additional targets in an existing > recovery.conf. I have not checked the code in details, but I think that > we should be careful to keep that property, even with EXEC_BACKEND. Ugh, no, I don't agree that this property makes sense to keep and since we're making these changes, I'd argue that it's exactly the right time to do away with that. I would think we'd have the different GUCs and then the check functions would only validate that they're valid inputs and then when we get to the point where we're starting to do recovery we check and make sure we've been given a sane overall configuration- which means that only *one* is set, and it matches the recovery target requested. As with backup, recovery should be extremely clear and straight-forward, with alarms going off if there's something unclear or inconsistent. One of the arguments around using postgresql.auto.conf (or any other managed config file, as discussed elsewhere) is that it can be machine-processed and therefore tooling around it can make sure that what goes into that file is correct and sane and doesn't need to rely on being able to just append things to the file and hope that it works. Thanks! Stephen
Attachment
At 2018-11-26 10:18:52 -0500, sfrost@snowman.net wrote: > > > This came originally to check that recovery.conf handles correctly > > its recovery targets when multiple ones are defined with the last > > one winning […] > > Ugh, no, I don't agree that this property makes sense to keep and > since we're making these changes, I'd argue that it's exactly the > right time to do away with that. I strongly agree with everything Stephen said here. -- Abhijit
On 11/26/18 10:35 AM, Abhijit Menon-Sen wrote: > At 2018-11-26 10:18:52 -0500, sfrost@snowman.net wrote: >> >>> This came originally to check that recovery.conf handles correctly >>> its recovery targets when multiple ones are defined with the last >>> one winning […] >> >> Ugh, no, I don't agree that this property makes sense to keep and >> since we're making these changes, I'd argue that it's exactly the >> right time to do away with that. > > I strongly agree with everything Stephen said here. +1. -- -David david@pgmasters.net
Hello > I would think we'd have the different GUCs and then the check functions > would only validate that they're valid inputs and then when we get to > the point where we're starting to do recovery we check and make sure > we've been given a sane overall configuration- which means that only > *one* is set, and it matches the recovery target requested. How about something like attached patch? regards, Sergei
Attachment
Greetings, * Sergei Kornilov (sk@zsrv.org) wrote: > > I would think we'd have the different GUCs and then the check functions > > would only validate that they're valid inputs and then when we get to > > the point where we're starting to do recovery we check and make sure > > we've been given a sane overall configuration- which means that only > > *one* is set, and it matches the recovery target requested. > How about something like attached patch? I've not been following this very closely, but seems like recovery_target_string is a bad idea.. Why not just make that 'recovery_target_immediate' and make it a boolean? Having a GUC that's only got one valid value seems really odd. > + if (targetSettingsCount > 1) > + ereport(FATAL, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("can not specify multiple recovery targets"))); I think I would have done 'if (targetSettingsCount != 1)' here. > -# Multiple targets > -# Last entry has priority (note that an array respects the order of items > -# not hashes). You could (and imv should) include a test that throws an expected error if multiple targets are specified. Thanks! Stephen
Attachment
On 2018-Nov-26, Stephen Frost wrote: > I would think we'd have the different GUCs and then the check functions > would only validate that they're valid inputs and then when we get to > the point where we're starting to do recovery we check and make sure > we've been given a sane overall configuration- which means that only > *one* is set, and it matches the recovery target requested. I don't quite understand why it isn't sensible to specify more than one and just stop recovery (or whatever) when at least one of them becomes true. Maybe I want to terminate just before commit of transaction 12345, but no later than 2018-11-11 12:47 in any case. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-11-26 15:09:43 -0300, Alvaro Herrera wrote: > On 2018-Nov-26, Stephen Frost wrote: > > > I would think we'd have the different GUCs and then the check functions > > would only validate that they're valid inputs and then when we get to > > the point where we're starting to do recovery we check and make sure > > we've been given a sane overall configuration- which means that only > > *one* is set, and it matches the recovery target requested. > > I don't quite understand why it isn't sensible to specify more than one > and just stop recovery (or whatever) when at least one of them becomes > true. Maybe I want to terminate just before commit of transaction > 12345, but no later than 2018-11-11 12:47 in any case. +1
Greetings,
On Mon, Nov 26, 2018 at 13:15 Andres Freund <andres@anarazel.de> wrote:
On 2018-11-26 15:09:43 -0300, Alvaro Herrera wrote:
> On 2018-Nov-26, Stephen Frost wrote:
>
> > I would think we'd have the different GUCs and then the check functions
> > would only validate that they're valid inputs and then when we get to
> > the point where we're starting to do recovery we check and make sure
> > we've been given a sane overall configuration- which means that only
> > *one* is set, and it matches the recovery target requested.
>
> I don't quite understand why it isn't sensible to specify more than one
> and just stop recovery (or whatever) when at least one of them becomes
> true. Maybe I want to terminate just before commit of transaction
> 12345, but no later than 2018-11-11 12:47 in any case.
I really have doubts about that being a serious use-case. If you know the xid then you almost certainly want everything before that xid, and you use the time stamp when you don’t know the xid (which is pretty much always..).
+1
Well, we could start with: that isn’t how things work today, nor how it used to work before this patch, and we’ve not had anyone asking for it except for people on this thread making things up.
Let’s at least fix the currently committed patch before adding new features or changing how things in recovery work.
Thanks!
Stephen
On 2018-11-26 13:30:18 -0500, Stephen Frost wrote: > Well, we could start with: that isn’t how things work today, nor how it > used to work before this patch, and we’ve not had anyone asking for it > except for people on this thread making things up. Thanks for assuming good faith.
Hello > I think I would have done 'if (targetSettingsCount != 1)' here. Streaming replication does not have any recovery_target. Well, i can check StandbyModeRequested too and allow 0 recovery_target only for standby mode. >> -# Multiple targets >> -# Last entry has priority (note that an array respects the order of items >> -# not hashes). > > You could (and imv should) include a test that throws an expected error > if multiple targets are specified. We have some example for checking unable-to-start DB? Did not find yet > Let’s at least fix the currently committed patch before adding new features or changing how things in recovery work. +1 regards, Sergei
Greetings, * Sergei Kornilov (sk@zsrv.org) wrote: > > I think I would have done 'if (targetSettingsCount != 1)' here. > Streaming replication does not have any recovery_target. > Well, i can check StandbyModeRequested too and allow 0 recovery_target only for standby mode. Ah. Agreed. > >> -# Multiple targets > >> -# Last entry has priority (note that an array respects the order of items > >> -# not hashes). > > > > You could (and imv should) include a test that throws an expected error > > if multiple targets are specified. > We have some example for checking unable-to-start DB? Did not find yet The TAP system in general supports expected failures, but that might only be with the simple binaries, not sure. I know src/bin/pg_dump has those though. > > Let’s at least fix the currently committed patch before adding new features or changing how things in recovery work. > +1 Thanks! Stephen
Attachment
On 11/26/18 1:15 PM, Andres Freund wrote: > On 2018-11-26 15:09:43 -0300, Alvaro Herrera wrote: >> On 2018-Nov-26, Stephen Frost wrote: >> >>> I would think we'd have the different GUCs and then the check functions >>> would only validate that they're valid inputs and then when we get to >>> the point where we're starting to do recovery we check and make sure >>> we've been given a sane overall configuration- which means that only >>> *one* is set, and it matches the recovery target requested. >> >> I don't quite understand why it isn't sensible to specify more than one >> and just stop recovery (or whatever) when at least one of them becomes >> true. Maybe I want to terminate just before commit of transaction >> 12345, but no later than 2018-11-11 12:47 in any case. > > +1 -1. At least for now. Prior to this patch the last target specified in recovery.conf was the one used, not a combination of them. The committed patch did not propose to change that behavior as far as I can see. Since there is no way to determine the order of GUCs like there was for options in recovery.conf, I think it makes sense to restrict it to a single target type for now. This is not exactly the behavior we had before but I think it comes the closest. Allowing multiple targets could be considered as a separate patch if anyone is interested. Regards, -- -David david@pgmasters.net
Hello Updated patch attached: - added testcase to verify database does not start with multiple recovery targets - recovery_target = immediate was replaced with recovery_target_immediate bool GUC thank you! regards, Sergei
Attachment
Greetings, * Sergei Kornilov (sk@zsrv.org) wrote: > Updated patch attached: > - added testcase to verify database does not start with multiple recovery targets > - recovery_target = immediate was replaced with recovery_target_immediate bool GUC I'd encourage you to look through the diff after you're finished hacking before sending it to the list, in case things get left in that should be removed, as below... > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c > index c80b14e..cd29606 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -261,13 +261,21 @@ static bool restoredFromArchive = false; > static char *replay_image_masked = NULL; > static char *master_image_masked = NULL; > > +/* recovery_target* original GUC values */ > +char *recovery_target_string; This shouldn't be here now, should it? > +char *recovery_target_xid_string; > +char *recovery_target_time_string; > +char *recovery_target_name_string; > +char *recovery_target_lsn_string; > /* options formerly taken from recovery.conf for archive recovery */ Shouldn't all of the above be listed under this comment..? > char *recoveryRestoreCommand = NULL; > char *recoveryEndCommand = NULL; > char *archiveCleanupCommand = NULL; > -RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET; > +static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET; > bool recoveryTargetInclusive = true; > int recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE; > +bool recoveryTargetImmediate; Seems like this should, at least, be close to the char*'s that are defining the other ways to specify a recovery targer. > TransactionId recoveryTargetXid; > TimestampTz recoveryTargetTime; > char *recoveryTargetName; > @@ -5381,9 +5389,42 @@ readRecoverySignalFile(void) > static void > validateRecoveryParameters(void) > { > + uint8 targetSettingsCount = 0; > + > if (!ArchiveRecoveryRequested) > return; > > + /* Check for mutually exclusive target actions */ > + if (recoveryTargetImmediate) > + { > + ++targetSettingsCount; > + recoveryTarget = RECOVERY_TARGET_IMMEDIATE; > + } > + if (strcmp(recovery_target_name_string, "") != 0) > + { > + ++targetSettingsCount; > + recoveryTarget = RECOVERY_TARGET_NAME; > + } > + if (strcmp(recovery_target_lsn_string, "") != 0) > + { > + ++targetSettingsCount; > + recoveryTarget = RECOVERY_TARGET_LSN; > + } > + if (strcmp(recovery_target_xid_string, "") != 0) > + { > + ++targetSettingsCount; > + recoveryTarget = RECOVERY_TARGET_XID; > + } > + if (strcmp(recovery_target_time_string, "") != 0) > + { > + ++targetSettingsCount; > + recoveryTarget = RECOVERY_TARGET_TIME; > + } > + if (targetSettingsCount > 1) > + ereport(FATAL, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("can not specify multiple recovery targets"))); Doesn't look like you changed this based on my prior comment..? Thanks! Stephen
Attachment
Hi > I'd encourage you to look through the diff after you're finished hacking > before sending it to the list, in case things get left in that should be > removed, as below... I am so sorry. I have a look before sending, but... It's night in my timezone. I will fix tomorrow. >> + if (targetSettingsCount > 1) >> + ereport(FATAL, >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), >> + errmsg("can not specify multiple recovery targets"))); > > Doesn't look like you changed this based on my prior comment..? Do you mean this one? > I think I would have done 'if (targetSettingsCount != 1)' here. To be sure, we need check we have one recovery target without standby mode and none or one in standby mode? Or some anothercheck? regards, Sergei
Greetings, * Sergei Kornilov (sk@zsrv.org) wrote: > > I'd encourage you to look through the diff after you're finished hacking > > before sending it to the list, in case things get left in that should be > > removed, as below... > I am so sorry. I have a look before sending, but... > It's night in my timezone. I will fix tomorrow. Sure, I don't think there's any need to rush any of this. > >> + if (targetSettingsCount > 1) > >> + ereport(FATAL, > >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > >> + errmsg("can not specify multiple recovery targets"))); > > > > Doesn't look like you changed this based on my prior comment..? > Do you mean this one? > > I think I would have done 'if (targetSettingsCount != 1)' here. > To be sure, we need check we have one recovery target without standby mode and none or one in standby mode? Or some anothercheck? Right, I think that's what the idea was, although that might require something for the explicit 'recover to the end case', now that I think about it, so perhaps the >1 isn't so bad. Still seems a bit odd to me though and I do wonder if there might be a better approach. Thanks! Stephen
Attachment
On 26/11/2018 21:30, Sergei Kornilov wrote: > - recovery_target = immediate was replaced with recovery_target_immediate bool GUC Why? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello >> - recovery_target = immediate was replaced with recovery_target_immediate bool GUC > > Why? Due this comment: https://www.postgresql.org/message-id/20181126172118.GY3415%40tamriel.snowman.net > I've not been following this very closely, but seems like > recovery_target_string is a bad idea.. Why not just make that > 'recovery_target_immediate' and make it a boolean? Having a GUC that's > only got one valid value seems really odd. regards, Sergei
On 27/11/2018 10:10, Sergei Kornilov wrote: > Hello > >>> - recovery_target = immediate was replaced with recovery_target_immediate bool GUC >> >> Why? > Due this comment: https://www.postgresql.org/message-id/20181126172118.GY3415%40tamriel.snowman.net >> I've not been following this very closely, but seems like >> recovery_target_string is a bad idea.. Why not just make that >> 'recovery_target_immediate' and make it a boolean? Having a GUC that's >> only got one valid value seems really odd. It is a bit odd, but that's the way it's been, and I don't see a reason to change it as part of this fix. We are attempting to fix the way the GUC parameters are parsed, not change the name and meaning of the parameters themselves. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 27/11/2018 13:29, Peter Eisentraut wrote: > On 27/11/2018 10:10, Sergei Kornilov wrote: >> Hello >> >>>> - recovery_target = immediate was replaced with recovery_target_immediate bool GUC >>> >>> Why? >> Due this comment: https://www.postgresql.org/message-id/20181126172118.GY3415%40tamriel.snowman.net >>> I've not been following this very closely, but seems like >>> recovery_target_string is a bad idea.. Why not just make that >>> 'recovery_target_immediate' and make it a boolean? Having a GUC that's >>> only got one valid value seems really odd. > > It is a bit odd, but that's the way it's been, and I don't see a reason > to change it as part of this fix. We are attempting to fix the way the > GUC parameters are parsed, not change the name and meaning of the > parameters themselves. The attached seems to be the simplest way to fix this. (Needs documentation updates, test updates, error message refinement.) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi > The attached seems to be the simplest way to fix this. (Needs > documentation updates, test updates, error message refinement.) Thank you! I add documentation change, tests and rephrased error message. regards, Sergei
Attachment
Greetings, * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > On 27/11/2018 13:29, Peter Eisentraut wrote: > > On 27/11/2018 10:10, Sergei Kornilov wrote: > >>>> - recovery_target = immediate was replaced with recovery_target_immediate bool GUC > >>> > >>> Why? > >> Due this comment: https://www.postgresql.org/message-id/20181126172118.GY3415%40tamriel.snowman.net > >>> I've not been following this very closely, but seems like > >>> recovery_target_string is a bad idea.. Why not just make that > >>> 'recovery_target_immediate' and make it a boolean? Having a GUC that's > >>> only got one valid value seems really odd. > > > > It is a bit odd, but that's the way it's been, and I don't see a reason > > to change it as part of this fix. We are attempting to fix the way the > > GUC parameters are parsed, not change the name and meaning of the > > parameters themselves. I disagree quite a bit- we're whacking around how recovery works and this has been a wart since it went in because the contemplated later changes to have more than one value be accepted there never materialized, so we might as well change it while we're changing everything else and clean it up. If we really want to change it later we can do that, but we've evidently decided to go the opposite direction and have multiple GUCs instead, so let's be consistent. > The attached seems to be the simplest way to fix this. (Needs > documentation updates, test updates, error message refinement.) Sure, this approach seems fine to me and is perhaps a bit cleaner. Thanks! Stephen
Attachment
On 27/11/2018 15:16, Sergei Kornilov wrote: > Hi > >> The attached seems to be the simplest way to fix this. (Needs >> documentation updates, test updates, error message refinement.) > > Thank you! > I add documentation change, tests and rephrased error message. I have committed it along these lines. It ended up being a bit more involved, in order to support setting the same parameter multiple times or unsetting one parameter and setting another. But it should pass now on EXEC_BACKEND. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services