Thread: Continue work on changes to recovery.conf API
Hello all I would like to continue work on new recovery api proposed in thread [1]. We have some form of consensus but thread has beeninactive for a long time and i hope i can help. I start from last published patch [2] and make some changes: - updated to current HEAD - made the patch pass make check-world run - removed recovery.auto.conf lookup and changed pg_basebackup to append recovery parameters to postgresql.auto.conf in bothplain and tar formats - revert back trigger_file logic, but rename to promote_signal_file. I think here is no reason to make GUC only for directorypath with hardcoded file name as was discussed in original thread. Any feedback is strongly appreciated. Thank you A brief summary of proposed changes: - if recovery.conf present during start we report FATAL error about old style config - recovery mode start if exists file "recovery.signal" in datadir, all old recovery_target_* options are replaced by recovery_target_typeand recovery_target_value GUC - start standby mode - by file "standby.signal" - if present both - standby wins - pg_basebackup -R will append primary_conninfo and primary_slot_name to postgresql.auto.conf config - all new GUC are still PGC_POSTMASTER PS: i will add an entry to 2018-09 CommitFest when it is become available. regards, Sergei [1] https://www.postgresql.org/message-id/flat/CANP8%2BjLO5fmfudbB1b1iw3pTdOK1HBM%3DxMTaRfOa5zpDVcqzew%40mail.gmail.com [2] https://www.postgresql.org/message-id/CANP8+jJo-LO4xtS7G=iN7PG5o60WeWdKEAn+X+Gnf+RNay5jGQ@mail.gmail.com
I completely forgot attach patch, sorry. Attached now
Attachment
On 22 June 2018 at 02:48, Sergei Kornilov <sk@zsrv.org> wrote:
I completely forgot attach patch, sorry. Attached now
Sergei,
It's great to see you picking this up. I'll try to find time to review this for the next CF cycle. Please poke me if you don't hear from me, I do get distracted. It's long past time to get these changes in for good.
Hello Commitfest 2018-09 is now open and, as planned, i create one entry: https://commitfest.postgresql.org/19/1711/ Also i attach new version due merge conflict with HEAD. regards, Sergei
Attachment
On 01/07/2018 13:45, Sergei Kornilov wrote: > Commitfest 2018-09 is now open and, as planned, i create one entry: https://commitfest.postgresql.org/19/1711/ > Also i attach new version due merge conflict with HEAD. Could you please describe in detail what this current patch does? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Current patch moves recovery.conf settings into GUC system: - if startup process found recovery.conf - it throw error - recovery mode is turned on if new special file recovery.signal found - standby_mode setting was removed. Standby mode can be enabled if startup found new special file standby.signal - if present both standby.signal and recovery.signal - we use standby mode (this design from previous thread) - recovery parameters recovery_target_inclusive, recovery_target_timeline, recovery_target_action are new GUC without logicchanges - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn are replacedto recovery_target_type and recovery_target_value (was discussed and changed in previous thread) - restore_command, archive_cleanup_command, recovery_end_command, primary_conninfo, primary_slot_name and recovery_min_apply_delayare just new GUC - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed) - all new GUC are PGC_POSTMASTER (discussed in prev thread) - pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo and primary_slot_name(if was used) to postgresql.auto.conf instead of writing new recovery.conf - appropriate changes in tests and docs regards, Sergei
Hello Attached small update due the merge conflict with HEAD docs. regards, Sergei
Attachment
On 29/08/2018 17:43, Sergei Kornilov wrote: > Current patch moves recovery.conf settings into GUC system: > - if startup process found recovery.conf - it throw error OK > - recovery mode is turned on if new special file recovery.signal found OK > - standby_mode setting was removed. Standby mode can be enabled if startup found new special file standby.signal OK > - if present both standby.signal and recovery.signal - we use standby mode > (this design from previous thread) Didn't find the discussion on that and don't understand the reasons, but seems OK and easy to change if we don't like it. > - recovery parameters recovery_target_inclusive, recovery_target_timeline, recovery_target_action are new GUC without logicchanges OK > - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread) I think this was the major point of contention. I reread the old thread, and it's still not clear why we need to change this. _type and _value look like an EAV system to me. GUC variables should be verifiable independent of another variable. The other idea of using only one variable seems better, but using two variables seems like a poor compromise between one and five. I propose to move this patch forward, keep the settings as they are. It can be another patch to rename or reshuffle them. > - restore_command, archive_cleanup_command, recovery_end_command, primary_conninfo, primary_slot_name and recovery_min_apply_delayare just new GUC OK > - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed) OK to add the "promote" prefix, but I'd prefer to keep the "trigger" name. There is a lot of discussion and knowledge around that. Seems unnecessary to change. > - all new GUC are PGC_POSTMASTER (discussed in prev thread) OK > - pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo andprimary_slot_name (if was used) to postgresql.auto.conf instead of writing new recovery.conf I wonder if that would cause any problems. The settings in postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you couldn't use ALTER SYSTEM to put them there. Maybe it's OK. > - appropriate changes in tests and docs looks good -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > I think this was the major point of contention. I reread the old > thread, and it's still not clear why we need to change this. _type and > _value look like an EAV system to me. GUC variables should be > verifiable independent of another variable. No, they MUST be independently verifiable. The interactions between the check_xxx functions in this patch are utterly unsafe. We've learned that lesson before. > I propose to move this patch forward, keep the settings as they are. It > can be another patch to rename or reshuffle them. Please do not commit this in this state. > I wonder if that would cause any problems. The settings in > postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you > couldn't use ALTER SYSTEM to put them there. Maybe it's OK. Actually, that works fine. You do have to restart the postmaster to make them take effect, but it's the same as if you edited the main config file by hand. regards, tom lane
Hi, On 2018-09-28 16:36:35 -0400, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > > I think this was the major point of contention. I reread the old > > thread, and it's still not clear why we need to change this. _type and > > _value look like an EAV system to me. GUC variables should be > > verifiable independent of another variable. > > No, they MUST be independently verifiable. The interactions between > the check_xxx functions in this patch are utterly unsafe. We've > learned that lesson before. I'm not sure those concerns apply quite the same way here - we can move the interdependent verification to the the point where they're used first rather than relying on guc.c infrastructure. We already have plenty of checks interdependent that way, without it causing many problems. UI wise that's not too bad, if they're things that cannot be changed arbitrarily at runtime. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-09-28 16:36:35 -0400, Tom Lane wrote: >> No, they MUST be independently verifiable. The interactions between >> the check_xxx functions in this patch are utterly unsafe. We've >> learned that lesson before. > I'm not sure those concerns apply quite the same way here - we can move > the interdependent verification to the the point where they're used > first rather than relying on guc.c infrastructure. And, if they're bad, what happens? Recovery fails? I don't think it's a great idea to lose out on whatever error checking the existing GUC infrastructure can provide, just so as to use a GUC design that's not very nice in the first place. regards, tom lane
Hello thank you for review! >> - if present both standby.signal and recovery.signal - we use standby mode >> (this design from previous thread) > > Didn't find the discussion on that and don't understand the reasons, but > seems OK and easy to change if we don't like it. I meant this was implemented in previous proposed patch and i do not changed this. I didn't find the discussion on that too... >> - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread) > > I think this was the major point of contention. I reread the old > thread, and it's still not clear why we need to change this. _type and > _value look like an EAV system to me. GUC variables should be > verifiable independent of another variable. The other idea of using > only one variable seems better, but using two variables seems like a > poor compromise between one and five. > No, they MUST be independently verifiable. The interactions between the check_xxx functions in this patch are utterly unsafe. Understood, thank you. I will implement set of current five recovery_target* settings and would like to leave reorganization for futher pathes. Not sure about one thing. All recovery_target settings change one internal recoveryTarget variable. GUC infrastructure willguarantee behavior if user erroneously set multiple different recovery_target*? I mean check_* and assign_* will be calledin order and so latest config line wins? >> - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed) > > OK to add the "promote" prefix, but I'd prefer to keep the "trigger" > name. There is a lot of discussion and knowledge around that. Seems > unnecessary to change. Ok, will change to promote_trigger_file >> - pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo andprimary_slot_name (if was used) to postgresql.auto.conf instead of writing new recovery.conf > > I wonder if that would cause any problems. The settings in > postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you > couldn't use ALTER SYSTEM to put them there. Maybe it's OK. Actually we can use ALTER SYSTEM to put PGC_POSTMASTER settings. I tried now "alter system set max_connections to 300;",postgresql.auto.conf was changed and affect max_connections during database start. Of course, we can not reload PGC_POSTMASTER settings, but during start we call regular ParseConfigFile and can override PGC_POSTMASTER. regards, Sergei
Hello I complete new version of this patch. I've attached it. In this version i remove proposed recovery_target_type/recovery_target_value and implement set of current settings: recovery_target(immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn >>> - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed) >> >> OK to add the "promote" prefix, but I'd prefer to keep the "trigger" >> name. There is a lot of discussion and knowledge around that. Seems >> unnecessary to change. > > Ok, will change to promote_trigger_file Renamed to promote_trigger_file Possible we need something like separate xlog_guc.c and header for related functions and definition? regards, Sergei
Attachment
Hi I attached new version of this patch due merge conflict with pg_promote function. regards, Sergei
Attachment
On Fri, Sep 28, 2018 at 4:20 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread) > > I think this was the major point of contention. I reread the old > thread, and it's still not clear why we need to change this. _type and > _value look like an EAV system to me. GUC variables should be > verifiable independent of another variable. The other idea of using > only one variable seems better, but using two variables seems like a > poor compromise between one and five. +1. I like one best, I can live with five, and I think two stinks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 30/10/2018 14:30, Sergei Kornilov wrote: > I attached new version of this patch due merge conflict with pg_promote function. This patch looks pretty good to me functionality-wise. There are some code details to work through, listed below. In this review, I'm skipping over your documentation changes. It seems you have found all the places that mention recovery.conf and updated them adequately. But I think in some cases we will need to take a few steps back and reword a paragraph or section altogether. For example, there will no longer be a reason for recovery-config.sgml to be a separate chapter if it's all part of the main GUC system. We can work through that later. Code discussion: - useless whitespace change in xlog.c - I think we can drop logRecoveryParameters(). Users can now inspect those parameters using the normal GUC mechanisms. (They were all DEBUG2 anyway, so it's not like many users will be missing this.) Then you can also drop RecoveryTargetActionText(). - Why introduce MAXRESTOREPOINTNAMELEN? If you think this is useful, then we could do it as a separate patch. - Make the help text change in pg_archivecleanup.c similar to pg_standby.c. - In pg_basebackup.c, duplicate defines PG_AUTOCONF_FILENAME and STANDBY_SIGNAL_FILE. See that you can put those into a header file somewhere. - This stuff breaks translation strings: printf(_(" -R, --write-recovery-conf\n" - " write recovery.conf for replication\n")); + " append replication config to " PG_AUTOCONF_FILENAME "\n" + " and place " STANDBY_SIGNAL_FILE " file\n")); Use %s placeholders, or better yet replace it in a more compact way. - The test function $node_standby->request_standby_mode() could use a better name. How about set_ instead of request_ ? - New string GUCs should have "" instead of NULL as their default in guc.c. (Not sure whether that is strictly necessary, but it seems good to be consistent.) - Help strings in guc.c should end with a period. - Renaming the promote and fallback_promote files seems unnecessary for this patch, and I would take it out. Otherwise, the change in pg_ctl.c is out of date with the comment above it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Thank you! Here is patch update addressing your comments. > - useless whitespace change in xlog.c Oops, did not notice. Fixed. > - I think we can drop logRecoveryParameters(). Users can now inspect > those parameters using the normal GUC mechanisms. (They were all DEBUG2 > anyway, so it's not like many users will be missing this.) Then you can > also drop RecoveryTargetActionText(). Agreed, done. > - Why introduce MAXRESTOREPOINTNAMELEN? If you think this is useful, > then we could do it as a separate patch. Reverted back. This was changed in prev proposal. > - Make the help text change in pg_archivecleanup.c similar to pg_standby.c. Changed. > - In pg_basebackup.c, duplicate defines PG_AUTOCONF_FILENAME and > STANDBY_SIGNAL_FILE. See that you can put those into a header file > somewhere. I move constants from xlog.h to xlog_internal.h. Also i revert back RECOVERY_COMMAND_FILE and RECOVERY_COMMAND_DONE intoxlog.c (was moved to xlog.h few weeks ago) But i have no good idea for PG_AUTOCONF_FILENAME. Seems most src/bin/ application uses hardcoded file path. How about miscadmin.h? > - This stuff breaks translation strings: > > printf(_(" -R, --write-recovery-conf\n" > - " write recovery.conf for > replication\n")); > + " append replication config to " > PG_AUTOCONF_FILENAME "\n" > + " and place " STANDBY_SIGNAL_FILE " > file\n")); > > Use %s placeholders, or better yet replace it in a more compact way. Maybe leave just "write configuration for replication" with explanation in docs, as was before? > - The test function $node_standby->request_standby_mode() could use a > better name. How about set_ instead of request_ ? Sounds good, changed. > - New string GUCs should have "" instead of NULL as their default in > guc.c. (Not sure whether that is strictly necessary, but it seems good > to be consistent.) > - Help strings in guc.c should end with a period. Fixed > - Renaming the promote and fallback_promote files seems unnecessary for > this patch, and I would take it out. Otherwise, the change in pg_ctl.c > is out of date with the comment above it. Agreed, revert back. regards, Sergei
Attachment
On 13/11/2018 15:57, Sergei Kornilov wrote: > Thank you! Here is patch update addressing your comments. I went over the patch and did a bunch of small refinements. I have also updated the documentation more extensively. The attached patch is committable to me. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, Nov 20, 2018 at 12:07:12PM +0100, Peter Eisentraut wrote: > I went over the patch and did a bunch of small refinements. I have also > updated the documentation more extensively. The attached patch is > committable to me. Thanks for putting energy into that. > Recovery is now initiated by a file recovery.signal. Standby mode is > initiated by a file standby.signal. The standby_mode setting is > gone. If a recovery.conf file is found, an error is issued. I am having second thoughts about this part of the patch actually. What's bad in keeping standby_mode and just rely on recovery.signal to enforce recovery to happen? When the startup process starts all the parameters should be loaded. That would also need less work from users to switch to the new APIs. I think that there could be a point to *merge* hot_standby and standby_mode actually into an enum, so keeping standby_mode would help with that (not this patch work of course). The barrier between recovery.trigger standby.trigger is also rather thin. > The trigger_file setting has been renamed to promote_trigger_file as > part of the move. This rename looks fine. > pg_basebackup -R now appends settings to postgresql.auto.conf and > creates a standby.signal file. > > Author: Simon Riggs <simon@2ndquadrant.com> > Author: Abhijit Menon-Sen <ams@2ndquadrant.com> > Author: Sergei Kornilov <sk@zsrv.org> I think that Fujii Masao should also be listed as an author, or at least get a mention. He was one of the first to work on this stuff. Using separate GUCs for each target is fine by me. I would have thought that 003_recovery_targets.pl needed some tweaks, so I am happy to see that it is not the case. So overall this stuff looks fine per its shape, just the part for standby.signal may not justify breaking compatibility more than necessary... The first mention of this part comes from https://postgr.es/m/CANP8+jLoYSDjB5ip7wynVcckoE4OynHabUnB8pQMgBJgFKQpiw@mail.gmail.com as far as I know. -- Michael
Attachment
On 21/11/2018 07:00, Michael Paquier wrote: > What's bad in keeping standby_mode and just rely on recovery.signal to > enforce recovery to happen? When the startup process starts all the > parameters should be loaded. That would also need less work from users > to switch to the new APIs. I think that there could be a point to > *merge* hot_standby and standby_mode actually into an enum, so keeping > standby_mode would help with that (not this patch work of course). The > barrier between recovery.trigger standby.trigger is also rather thin. This wasn't my idea, so this is just my interpretation. The scenario I'm wondering about is: You have a standby. So (under your system) you set standby_mode=on and create recovery.trigger. Then you promote that standby, so recovery.trigger is removed, but standby_mode=on stays. Much time passes. At some point you want to do a PITR on that instance. So you create a recovery.trigger, set some recovery parameters. But you didn't notice that standby_mode=on is still set from way back when -- and you create a mess. One way to think about it is: Being a standby is a state, not a configuration setting. Btw., I'm not in love with the *.signal naming. I originally argued against naming them *.trigger, but I don't like the alternative either. But that's easy to change if someone has a better idea. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Nov 21, 2018 at 12:58:19PM +0100, Peter Eisentraut wrote: > This wasn't my idea, so this is just my interpretation. The scenario > I'm wondering about is: You have a standby. So (under your system) you > set standby_mode=on and create recovery.trigger. Then you promote that > standby, so recovery.trigger is removed, but standby_mode=on stays. > Much time passes. At some point you want to do a PITR on that instance. > So you create a recovery.trigger, set some recovery parameters. But > you didn't notice that standby_mode=on is still set from way back when > -- and you create a mess. > > One way to think about it is: Being a standby is a state, not a > configuration setting. Very good point, I have not thought of this problem from this perspective. Indeed that can be confusing. Now things get reset once recovery.conf is renamed. > Btw., I'm not in love with the *.signal naming. I originally argued > against naming them *.trigger, but I don't like the alternative either. > But that's easy to change if someone has a better idea. Using "recovery" or "signal" would be more consistent with the existing "promote" but that does not feel good either. "trigger" does not sound that bad... One extra thing I was wondering is if if would make sense to rename the signal (or trigger files) to .done once recovery is over. That's useful for debugging. That could always be refined later.. -- Michael
Attachment
On 21/11/2018 07:00, Michael Paquier wrote: >> Author: Simon Riggs <simon@2ndquadrant.com> >> Author: Abhijit Menon-Sen <ams@2ndquadrant.com> >> Author: Sergei Kornilov <sk@zsrv.org> > I think that Fujii Masao should also be listed as an author, or at least > get a mention. He was one of the first to work on this stuff. Committed with that addition. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings, * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > Committed with that addition. I'm concerned that this hasn't been entirely thought through regarding pretty common use-cases, consider a pretty typical pg_basebackup -R usage case: - User performs a backup with pg_basebackup -R - Replica is then promoted to be a primary - User performs a backup with pg_basebackup -R on the new primary - Duplicate entries end up in postgresql.auto.conf. Ew. Now thinking about it from the perspective of... more complete backup solutions, which give the user the option to fill out more of what used to go into recovery.conf, it seems like there's going to have to be a lot more hacking around with postgresql.auto.conf, such as adding in recovery target time and other parameters, which I'm not terribly thrilled with. For example, when a backup is done, typically postgresql.auto.conf comes along with it and gets restored with it, but if that's done now without mucking with the contents, we'll run into the same issue that pg_basebackup has as outlined above- recovery target time and other parameters being included in the postgresql.auto.conf, possibly multiple times if care isn't taken to avoid that. While it used to be that you could end up with recovery.done files ending up in backups, they were just a nuisance and didn't impact anything. One possible approach for dealing with at least some of these concerns would be to remove the recovery settings from postgresql.auto.conf once recovery is done, which is similar to how we moved recovery.conf to recovery.done, at least. We could even create a recovery.done file if we wanted to. Unfortunately, this also means that users are going to have their own issues when just using pg_basebackup and trying to perform PITR since they'll be modifying postgresql.conf and will have to remember to remove those settings since we aren't going to modify that. Maybe we should have a warning or notice or something saying "hey, now that this has been promoted, maybe you should check your config file and clear out the recovery settings?". Thinking about how this would work with something like pgbackrest dropping settings into postgresql.auto.conf, users would need to be educated to use ALTER SYSTEM if they want to move the recovery target time to be later while doing PITR (to avoid having them hacking around in postgresql.auto.conf), which is perhaps not terrible. What would make that experience nicer would be a way to restart PG remotely after making that change (well, or just making it so that users could tell PG to continue to play forward, but as I understand it this patch doesn't give us that). These are my thoughts on this, at least at the moment. I've also asked David Steele to comment, of course, since this will impact pgbackrest. In the end, I'm not entirely convinced that eliminating recovery.conf is actually an improvement; I think I would have rather seen the original approach of having an 'auto' recovery.conf, but perhaps we can improve on this since others seemed anxious to get rid of recovery.conf (though I'm not sure why- seems like we'll likely end up with more code to handle things cleanly with a merged recovery.auto/postgresql.auto than if we had kept them independent). Thanks! Stephen
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > Committed with that addition. Some of the buildfarm doesn't like this. I realized that the common denominator was EXEC_BACKEND: if you build with -DEXEC_BACKEND the failure is completely reproducible. Presumably something is thinking that it will inherit some variable from the postmaster via fork. I don't know if that's a new bug or was just exposed by a new test. regards, tom lane
Hi, On 2018-11-25 13:24:15 -0500, Stephen Frost wrote: > - User performs a backup with pg_basebackup -R > - Replica is then promoted to be a primary > - User performs a backup with pg_basebackup -R on the new primary > - Duplicate entries end up in postgresql.auto.conf. Ew. Why don't we put recovery entries into postgresql.recovery.conf or such? And overwrite rather than append? > In the end, I'm not entirely convinced that eliminating recovery.conf is > actually an improvement; I think I would have rather seen the original > approach of having an 'auto' recovery.conf, but perhaps we can improve > on this since others seemed anxious to get rid of recovery.conf (though > I'm not sure why- seems like we'll likely end up with more code to > handle things cleanly with a merged recovery.auto/postgresql.auto than > if we had kept them independent). If we ever want to have more dynamic reconfiguration around recovery (say changing the primary and other settings at runtime, and a lot of more advanced features), we're going to need infrastructure to deal with that. Without the merge we'd have to duplicate the guc logic. Greetings, Andres Freund
On 2018-11-25 16:56:13 +0100, Peter Eisentraut wrote: > On 21/11/2018 07:00, Michael Paquier wrote: > >> Author: Simon Riggs <simon@2ndquadrant.com> > >> Author: Abhijit Menon-Sen <ams@2ndquadrant.com> > >> Author: Sergei Kornilov <sk@zsrv.org> > > I think that Fujii Masao should also be listed as an author, or at least > > get a mention. He was one of the first to work on this stuff. > > Committed with that addition. Wee! Greetings, Andres Freund
Greetings,
On Sun, Nov 25, 2018 at 15:39 Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> - User performs a backup with pg_basebackup -R
> - Replica is then promoted to be a primary
> - User performs a backup with pg_basebackup -R on the new primary
> - Duplicate entries end up in postgresql.auto.conf. Ew.
Why don't we put recovery entries into postgresql.recovery.conf or such?
And overwrite rather than append?
Sure, I think that’s more or less the same thing..? Another included file, but one specifically for recovery bits.
> In the end, I'm not entirely convinced that eliminating recovery.conf is
> actually an improvement; I think I would have rather seen the original
> approach of having an 'auto' recovery.conf, but perhaps we can improve
> on this since others seemed anxious to get rid of recovery.conf (though
> I'm not sure why- seems like we'll likely end up with more code to
> handle things cleanly with a merged recovery.auto/postgresql.auto than
> if we had kept them independent).
If we ever want to have more dynamic reconfiguration around recovery
(say changing the primary and other settings at runtime, and a lot of
more advanced features), we're going to need infrastructure to deal with
that. Without the merge we'd have to duplicate the guc logic.
The recovery auto conf before was also just included into the config, avoiding the need to have specialized handling. I do see the advantage of having it utilize the regular GUC system, but I don’t like losing the separation we had which allowed us to just move the whole recovery.conf to recovery.done when we finish recovery. Seems like we could have both though if we have the recovery options in a separately included file instead of in PostgreSQL.auto.conf.
Thanks!
Stephen
Hi > Why don't we put recovery entries into postgresql.recovery.conf or such? > And overwrite rather than append? Appending to postgressql.auto.conf instead of writing new hardcoded file was proposed here: https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com regards, Sergei
Hi, On 25/11/2018 21:48, Stephen Frost wrote: > Greetings, > > On Sun, Nov 25, 2018 at 15:39 Andres Freund <andres@anarazel.de > <mailto:andres@anarazel.de>> wrote: > > Hi, > > On 2018-11-25 13:24:15 -0500, Stephen Frost wrote: > > - User performs a backup with pg_basebackup -R > > - Replica is then promoted to be a primary > > - User performs a backup with pg_basebackup -R on the new primary > > - Duplicate entries end up in postgresql.auto.conf. Ew. > > Why don't we put recovery entries into postgresql.recovery.conf or such? > And overwrite rather than append? > > > Sure, I think that’s more or less the same thing..? Another included > file, but one specifically for recovery bits. I think the important part there is overwrite rather than append. Given that postgresql.auto.conf is used by ALTER SYSTEM, doing overwrites there is not as straightforward proposition (behaviorally) as doing it in another included file. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Greetings, * Sergei Kornilov (sk@zsrv.org) wrote: > > Why don't we put recovery entries into postgresql.recovery.conf or such? > > And overwrite rather than append? > Appending to postgressql.auto.conf instead of writing new hardcoded file was proposed here: https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com Thanks, yes, I saw that but I don't think it really contemplated the issues that arise from that approach, as I outlined. Thanks again! Stephen
Attachment
On Sun, Nov 25, 2018 at 04:56:13PM +0100, Peter Eisentraut wrote: > Committed with that addition. Thanks for adding that! -- Michael
Attachment
On 25/11/2018 21:39, Andres Freund wrote: > On 2018-11-25 13:24:15 -0500, Stephen Frost wrote: >> - User performs a backup with pg_basebackup -R >> - Replica is then promoted to be a primary >> - User performs a backup with pg_basebackup -R on the new primary >> - Duplicate entries end up in postgresql.auto.conf. Ew. > Why don't we put recovery entries into postgresql.recovery.conf or such? > And overwrite rather than append? Adding more such sub-configuration files would be easy to do and has some merit, but the devil is in the details. In what order would those files be read? Who is supposed to write to it, is it reserved for pg_basebackup use only? If you choose to use this particular name, is it not used when not in recovery? Does this file belong in the data directory or the configuration directory? Which choice will offend packagers the least? Etc. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/27/18 3:59 AM, Peter Eisentraut wrote: > On 25/11/2018 21:39, Andres Freund wrote: >> On 2018-11-25 13:24:15 -0500, Stephen Frost wrote: >>> - User performs a backup with pg_basebackup -R >>> - Replica is then promoted to be a primary >>> - User performs a backup with pg_basebackup -R on the new primary >>> - Duplicate entries end up in postgresql.auto.conf. Ew. >> Why don't we put recovery entries into postgresql.recovery.conf or such? >> And overwrite rather than append? > > Adding more such sub-configuration files would be easy to do and has > some merit, but the devil is in the details. In what order would those > files be read? Who is supposed to write to it, is it reserved for > pg_basebackup use only? If you choose to use this particular name, is > it not used when not in recovery? Does this file belong in the data > directory or the configuration directory? Which choice will offend > packagers the least? Etc. I would prefer a specific file that will be auto-included into postgresql.conf when present but be ignored when not present. Some settings are generally ephemeral (recovery_target_time) and it would be nice for them to go away. When recovery is complete the file would be renamed to .done just as recovery.conf is now. I had been thinking about keeping the recovery.conf file name but on reflection is seems best to make a clean break. I like Andres' suggestion of postgresql.recovery.conf. I would not be in favor of this file being for pg_basebackup only. If it has documented and consistent behavior then any restore solution should be able to use it. To be clear, the aim is to give tools a place to write recovery GUCs that are considered temporary and/or replaceable. Users are still free to write permanent recovery GUCs into whatever file they would like, e.g. restore_command. Regards, -- -David david@pgmasters.net
On 27/11/2018 13:21, David Steele wrote: > I would prefer a specific file that will be auto-included into > postgresql.conf when present but be ignored when not present. Some > settings are generally ephemeral (recovery_target_time) and it would be > nice for them to go away. When recovery is complete the file would be > renamed to .done just as recovery.conf is now. That might be a useful facility, but it wouldn't really address the pg_basebackup -R issue, because that creates settings that you don't want going away in this manner. You'd then need two separate such files, one for targeted recovery that goes away when recovery ends, and one that is automatically included that pg_basebackup can overwrite at will. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings, * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > On 27/11/2018 13:21, David Steele wrote: > > I would prefer a specific file that will be auto-included into > > postgresql.conf when present but be ignored when not present. Some > > settings are generally ephemeral (recovery_target_time) and it would be > > nice for them to go away. When recovery is complete the file would be > > renamed to .done just as recovery.conf is now. > > That might be a useful facility, but it wouldn't really address the > pg_basebackup -R issue, because that creates settings that you don't > want going away in this manner. You'd then need two separate such > files, one for targeted recovery that goes away when recovery ends, and > one that is automatically included that pg_basebackup can overwrite at will. I've been thinking about this also and I agree that there's some challenges when it comes to having another file- what happens if someone does an ALTER SYSTEM on primary_conninfo while it's in the 'recovery.auto.conf' (or whatever) file? Does that go into postgresql.auto.conf, or somewhere else? What if they do a RESET? Then there's the other side of things- do we really want things like recovery_target_time being kept around in postgresql.auto.conf after a promotoion? Do we want to keep appending primary_conninfo into postgresql.auto.conf? I haven't looked but I'm also concerned that something like ALTER SYSTEM RESET isn't really prepared to find duplicates in postgresql.auto.conf... Maybe thinking through what we want to have happen in each scenario would be good. If you perform a pg_basebackup -R and there's already something set in postgresql.auto.conf for primary conninfo- what should happen? If we reach the end of recovery and promote while postgresql.auto.conf has recovery_target_time set, what should happen? If third-party tools want to do what pg_basebackup -R does and modify things in postgresql.auto.conf, how should they do that? Here's my thoughts on answers to the above: - pg_basebackup -R should INSERT .. ON CONFLICT UPDATE the settings that it wants to set in postgresql.auto.conf - When we reach the end of recovery and promote, we should DELETE from the postgresql.auto.conf the recovery target settings. - External tools should either be told that they can modify postgresql.auto.conf and given guideance on how to do so, or we should provide a tool which allows them to do so (or maybe both). As we already have a section for recovery target settings that clearly has them as independent, hopefully this will make sense to users. Open to other ideas too, of course, but I don't think we can continue to just append things to the end of postgresql.auto.conf when pg_basebackup is run with -R. Thanks! Stephen
Attachment
Hi, On 2018-11-27 15:36:59 +0100, Peter Eisentraut wrote: > That might be a useful facility, but it wouldn't really address the > pg_basebackup -R issue, because that creates settings that you don't > want going away in this manner. Why is that / which are those? It's not like it worked like that < 2dedf4d9. Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-11-27 15:36:59 +0100, Peter Eisentraut wrote: > > That might be a useful facility, but it wouldn't really address the > > pg_basebackup -R issue, because that creates settings that you don't > > want going away in this manner. > > Why is that / which are those? It's not like it worked like that < > 2dedf4d9. primary_conninfo and restore_command are the main ones, and the idea is that you'd like those to be specified even when you aren't in recovery, so that you can have a symmetric config for when a given system does become a replica. Thanks! Stephen
Attachment
On 11/27/18 4:36 PM, Peter Eisentraut wrote: > On 27/11/2018 13:21, David Steele wrote: >> I would prefer a specific file that will be auto-included into >> postgresql.conf when present but be ignored when not present. Some >> settings are generally ephemeral (recovery_target_time) and it would be >> nice for them to go away. When recovery is complete the file would be >> renamed to .done just as recovery.conf is now. > > That might be a useful facility, but it wouldn't really address the > pg_basebackup -R issue, because that creates settings that you don't > want going away in this manner. You'd then need two separate such > files, one for targeted recovery that goes away when recovery ends, and > one that is automatically included that pg_basebackup can overwrite at will. I'm not sure why that's true. Some settings you might prefer to be persistent and others not. If pg_basebackup -R is treating them the same then that seems like a limitation. Endlessly appending settings into postgresql.auto.conf seems confusing even if it makes sense to postgres (i.e. last setting wins). Regards, -- -David david@pgmasters.net
Hi Peter, On 11/27/18 5:34 PM, Stephen Frost wrote: > Greetings, > > * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: >> On 27/11/2018 13:21, David Steele wrote: >>> I would prefer a specific file that will be auto-included into >>> postgresql.conf when present but be ignored when not present. Some >>> settings are generally ephemeral (recovery_target_time) and it would be >>> nice for them to go away. When recovery is complete the file would be >>> renamed to .done just as recovery.conf is now. >> >> That might be a useful facility, but it wouldn't really address the >> pg_basebackup -R issue, because that creates settings that you don't >> want going away in this manner. You'd then need two separate such >> files, one for targeted recovery that goes away when recovery ends, and >> one that is automatically included that pg_basebackup can overwrite at will. > > I've been thinking about this also and I agree that there's some > challenges when it comes to having another file- what happens if someone > does an ALTER SYSTEM on primary_conninfo while it's in the > 'recovery.auto.conf' (or whatever) file? Does that go into > postgresql.auto.conf, or somewhere else? What if they do a RESET? > > Then there's the other side of things- do we really want things like > recovery_target_time being kept around in postgresql.auto.conf after a > promotoion? Do we want to keep appending primary_conninfo into > postgresql.auto.conf? I haven't looked but I'm also concerned that > something like ALTER SYSTEM RESET isn't really prepared to find > duplicates in postgresql.auto.conf... > > Maybe thinking through what we want to have happen in each scenario > would be good. If you perform a pg_basebackup -R and there's already > something set in postgresql.auto.conf for primary conninfo- what should > happen? If we reach the end of recovery and promote while > postgresql.auto.conf has recovery_target_time set, what should happen? > If third-party tools want to do what pg_basebackup -R does and modify > things in postgresql.auto.conf, how should they do that? > > Here's my thoughts on answers to the above: > > - pg_basebackup -R should INSERT .. ON CONFLICT UPDATE the settings that > it wants to set in postgresql.auto.conf > > - When we reach the end of recovery and promote, we should DELETE from > the postgresql.auto.conf the recovery target settings. > > - External tools should either be told that they can modify > postgresql.auto.conf and given guideance on how to do so, or we should > provide a tool which allows them to do so (or maybe both). > > As we already have a section for recovery target settings that clearly > has them as independent, hopefully this will make sense to users. Open > to other ideas too, of course, but I don't think we can continue to just > append things to the end of postgresql.auto.conf when pg_basebackup is > run with -R. I'd be interested to get your take to these questions. Just about every third-party backup and HA tool out there writes recovery.conf files and automates restores. This is a huge change. I personally would prefer to have something like postgresql.recovery.conf file that is automatically included if it is present. This simplifies the issue of how to maintain recovery settings in postgresql.auto.conf. The file could be renamed to postgresql.recovery.conf.done similar to how recovery.conf is now. Of course, some settings like primary_conninfo could just stay in postgresql.conf or postgresql.auto.conf since they are far less subject to change. Or they could be in postgresql.recovery.conf for HA environments. I don't see why this won't work with pg_basebackup -R -- it just may be the case that some settings get overridden. In HA scenarios it's hard to see how pg_basebackup would have the correct info for something like primary_conninfo anyway. I like the flexibility of recovery options being available as GUCs but I'm not sure the ramifications of these changes have been completely thought through. Regards, -- -David david@pgmasters.net