Thread: PATCH: add "--config-file=" option to pg_rewind
Hi! During a Patroni PR discussion (https://github.com/zalando/patroni/pull/2225), we realised that if one wants to use the "-c" option on a typical Debian/Ubuntu installation (where the config resides below /etc/postgresql/), pg_rewind needs a way to be told where the postgresql.conf actually is. The attached patch implements this as an option. I'm extremely unhappy that I called it "--config-file" here, while "postgres" itself wants "--config_file". But as the other dashed options of pg_rewind are, well, dashed and not underscored, it seemed to be better that the other way... As the "-c" feature appeared in 12 and existing Debian installations are unable to use it right now, I suggest to even backport the patch. Oh, and I'm still not subscribed to -hackers, so the usual "please CC me" applies! Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
On Wed, 2022-02-23 at 16:28 +0100, Gunnar "Nick" Bluth wrote: > I'm extremely unhappy that I called it "--config-file" here, while > "postgres" itself wants "--config_file". But as the other dashed options > of pg_rewind are, well, dashed and not underscored, it seemed to be > better that the other way... The "--"-style options to postgres take either underscores or dashes in the names (just double-checked with PG14), so this shouldn't be a problem. --Jacob
Hi Gunnar, > During a Patroni PR discussion > (https://github.com/zalando/patroni/pull/2225), we realised that if one > wants to use the "-c" option on a typical Debian/Ubuntu installation > (where the config resides below /etc/postgresql/), pg_rewind needs a way > to be told where the postgresql.conf actually is. > > The attached patch implements this as an option. > > [...] Good catch! Could you also implement a TAP test for the new code? -- Best regards, Aleksander Alekseev
Hello Gunnar,
On Thu, 24 Feb 2022 at 10:03, Aleksander Alekseev <aleksander@timescale.com> wrote:
> wants to use the "-c" option on a typical Debian/Ubuntu installation
> (where the config resides below /etc/postgresql/), pg_rewind needs a way
> to be told where the postgresql.conf actually is.
>
> The attached patch implements this as an option.
>
> [...]
Good catch!
Yeah, it is a known problem, and there was already another patch trying to address it [1].
Honestly, I like your approach much better, because, the previous make was assuming that the data_directory GUC is specified in the postgresql.conf, which is not very flexible.
Could you also implement a TAP test for the new code?
+1
+ <para>
+ In case the <filename>postgresql.conf</filename> of your target cluster is not in the
+ target pgdata and you want to use the <option>-c</option> option,
+ provide a (relative or absolute) path to the <filename>postgresql.conf</filename> using this option.
+ </para>
+ In case the <filename>postgresql.conf</filename> of your target cluster is not in the
+ target pgdata and you want to use the <option>-c</option> option,
+ provide a (relative or absolute) path to the <filename>postgresql.conf</filename> using this option.
+ </para>
It took me a while to understand the meaning of <option>-c</option>. Maybe changing it to <option>--restore-target-wal</option> will make it easier to understand.
Regards,
--
Alexander Kukushkin
Hi hackers, > It took me a while to understand the meaning of <option>-c</option>. Maybe changing it to <option>--restore-target-wal</option>will make it easier to understand. +1, I "stumbled" while reading this too at first. -- Best regards, Aleksander Alekseev
> On 24 Feb 2022, at 10:27, Alexander Kukushkin <cyberdemn@gmail.com> wrote: > > Hello Gunnar, > > On Thu, 24 Feb 2022 at 10:03, Aleksander Alekseev <aleksander@timescale.com> wrote: > > > wants to use the "-c" option on a typical Debian/Ubuntu installation > > (where the config resides below /etc/postgresql/), pg_rewind needs a way > > to be told where the postgresql.conf actually is. > > > > The attached patch implements this as an option. > > > > [...] > > Good catch! > > Yeah, it is a known problem, and there was already another patch trying to address it [1]. There is yet another related patch which provides a parameter to pass in restore_command in cases where postgresq.conf isn't reachable: https://commitfest.postgresql.org/37/3213/ -- Daniel Gustafsson https://vmware.com/
Am 24.02.2022 um 14:08 schrieb Daniel Gustafsson: >> On 24 Feb 2022, at 10:27, Alexander Kukushkin <cyberdemn@gmail.com> wrote: >> >> Hello Gunnar, >> >> On Thu, 24 Feb 2022 at 10:03, Aleksander Alekseev <aleksander@timescale.com> wrote: >> >>> wants to use the "-c" option on a typical Debian/Ubuntu installation >>> (where the config resides below /etc/postgresql/), pg_rewind needs a way >>> to be told where the postgresql.conf actually is. >>> >>> The attached patch implements this as an option. >>> >>> [...] >> >> Good catch! >> >> Yeah, it is a known problem, and there was already another patch trying to address it [1]. > > There is yet another related patch which provides a parameter to pass in > restore_command in cases where postgresq.conf isn't reachable: > > https://commitfest.postgresql.org/37/3213/ That looks just as good to me, and it already has tests, so I'll happily step down! (Note to myself: check the CF first next time ;-) -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
> On 24 Feb 2022, at 14:43, Gunnar Nick Bluth <gunnar.bluth@pro-open.de> wrote: > That looks just as good to me, and it already has tests, so I'll happily step down! Actually, I think this looks like a saner approach. Putting a config setting in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe for them diverging. -- Daniel Gustafsson https://vmware.com/
Am 24.02.22 um 14:46 schrieb Daniel Gustafsson: >> On 24 Feb 2022, at 14:43, Gunnar Nick Bluth <gunnar.bluth@pro-open.de> wrote: > >> That looks just as good to me, and it already has tests, so I'll happily step down! > > Actually, I think this looks like a saner approach. Putting a config setting > in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe > for them diverging. Well, well, reverted https://commitfest.postgresql.org/37/3562/ back to "needs review" and moved it to "Replication & Recovery". Patch v2 includes: * doc improvements as suggested * tests shamelessly copied & adapted from https://commitfest.postgresql.org/37/3213/ Since the need is pretty obvious (I *think* the Debian-ish userbase is quite significant alone), I'd love to see one of the approaches making it into 15! Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
On Fri, Feb 25, 2022 at 10:35:49AM +0100, Gunnar "Nick" Bluth wrote: > Am 24.02.22 um 14:46 schrieb Daniel Gustafsson: >> Actually, I think this looks like a saner approach. Putting a config setting >> in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe >> for them diverging. FWIW, I have a bad feeling about passing down directly a command through an option itself part of a command, so what's discussed on this thread is refreshing. + } else { + snprintf(postgres_cmd, sizeof(postgres_cmd), + "\"%s\" -D \"%s\" --config_file=\"%s\" -C restore_command", + postgres_exec_path, datadir_target, config_file); + } Shouldn't this one use appendShellString() on config_file? -- Michael
Attachment
Am 26.02.22 um 06:51 schrieb Michael Paquier: > On Fri, Feb 25, 2022 at 10:35:49AM +0100, Gunnar "Nick" Bluth wrote: >> Am 24.02.22 um 14:46 schrieb Daniel Gustafsson: >>> Actually, I think this looks like a saner approach. Putting a config setting >>> in two place (postgresql.conf and on the commandline for pg_rewind) is a recipe >>> for them diverging. > > FWIW, I have a bad feeling about passing down directly a command > through an option itself part of a command, so what's discussed on > this thread is refreshing. Ta. > > + } else { > + snprintf(postgres_cmd, sizeof(postgres_cmd), > + "\"%s\" -D \"%s\" --config_file=\"%s\" -C restore_command", > + postgres_exec_path, datadir_target, config_file); > + } > > Shouldn't this one use appendShellString() on config_file? It probably should, yes. I don't fancy this repetitive code myself. But then, pg_rewind as a whole could use an overhaul. I don't see any use of PQExpBuffer in it, but a lot of char* ... I myself am not a C hacker (as you can probably tell already) and don't really feel fit (enough) to rewrite pg_rewind.c to use fe_utils/string_utils.h :/ I might give it a try anyhow, but that may be a bit... heavy,,, just to add one option? While we're at it (I'm really good at opening cans of worms): 09:47 $ grep -r -l --include \*.c fe_utils/string_utils.h src/bin/ | wc -l 28 09:48 $ grep -r -L --include \*.c fe_utils/string_utils.h src/bin/ | wc -l 66 GSOC project? ;-) Best, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
On Sat, Feb 26, 2022 at 09:55:20AM +0100, Gunnar "Nick" Bluth wrote: > Am 26.02.22 um 06:51 schrieb Michael Paquier: >> Shouldn't this one use appendShellString() on config_file? > > It probably should, yes. I don't fancy this repetitive code myself. > But then, pg_rewind as a whole could use an overhaul. I don't see any use of > PQExpBuffer in it, but a lot of char* ... Having a lot of char* does not necessarily mean that all of them have to be changed to accomodate with PQExpBuffer. My guess that this is a case-by-case, where we should apply that in places where it makes the code cleaner to follow. In the case of your patch though, the two areas changed would make the logic correct, instead, because we have to apply correct quoting rules to any commands executed. > GSOC project? ;-) It does not seem so, though there are surely more areas that could gain in clarity. -- Michael
Attachment
Am 27.02.22 um 13:06 schrieb Michael Paquier: > On Sat, Feb 26, 2022 at 09:55:20AM +0100, Gunnar "Nick" Bluth wrote: >> Am 26.02.22 um 06:51 schrieb Michael Paquier: >>> Shouldn't this one use appendShellString() on config_file? >> >> It probably should, yes. I don't fancy this repetitive code myself. >> But then, pg_rewind as a whole could use an overhaul. I don't see any use of >> PQExpBuffer in it, but a lot of char* ... > > Having a lot of char* does not necessarily mean that all of them have > to be changed to accomodate with PQExpBuffer. My guess that this is a > case-by-case, where we should apply that in places where it makes the > code cleaner to follow. In the case of your patch though, the two > areas changed would make the logic correct, instead, because we have > to apply correct quoting rules to any commands executed. Alas! v3 attached. >> GSOC project? ;-) > > It does not seem so, though there are surely more areas that could > gain in clarity. That's universally true ;-) Best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
On Sun, Feb 27, 2022 at 02:30:33PM +0100, Gunnar "Nick" Bluth wrote: > That's universally true ;-) -# Internal routine to enable archive recovery command on a standby node +# Internal routine to enable archive recovery command on a standby node. +# Returns generated restore_command. sub enable_restoring { my ($self, $root_node, $standby) = @_; @@ -1103,7 +1104,7 @@ restore_command = '$copy_command' { $self->set_recovery_mode(); } - return; + return $copy_command; I don't like this change. This makes an existing routine do some extra work for something it is not designed for. You could get this information with a postgres -C command, for example. Now, you cannot use postgres -C because of.. Reasons (1a9d802). But you could use a pg_ctl command for the same purpose. I guess that we'd better refactor this into a new routine of Cluster.pm where we execute a pg_ctl command and return both stdout and stderr back to the caller? The TAP part could be refactored on its own. + In case the <filename>postgresql.conf</filename> of your target cluster is not in the + target pgdata and you want to use the <option>-c / --restore-target-wal</option> option, + provide a (relative or absolute) path to the <filename>postgresql.conf</filename> using this option. This could do a better job to explain in which cases this option can be used for. You could say something like that: "This option specifies a path to a configuration file to be used when starting the target cluster. This makes easier the use of -c/--restore-target-wal by setting restore_command in the extra configuration file specified via this option." Hmm. What about the target cluster started in --single mode as of ensureCleanShutdown()? Should we try to apply this option also in this case? -- Michael
Attachment
Am 28.02.22 um 12:56 schrieb Michael Paquier: > On Sun, Feb 27, 2022 at 02:30:33PM +0100, Gunnar "Nick" Bluth wrote: >> That's universally true ;-) > > -# Internal routine to enable archive recovery command on a standby node > +# Internal routine to enable archive recovery command on a standby node. > +# Returns generated restore_command. > sub enable_restoring > { > my ($self, $root_node, $standby) = @_; > @@ -1103,7 +1104,7 @@ restore_command = '$copy_command' > { > $self->set_recovery_mode(); > } > - return; > + return $copy_command; > > I don't like this change. This makes an existing routine do some > extra work for something it is not designed for. You could get this > information with a postgres -C command, for example. Now, you cannot > use postgres -C because of.. Reasons (1a9d802). But you could use a > pg_ctl command for the same purpose. I guess that we'd better > refactor this into a new routine of Cluster.pm where we execute a > pg_ctl command and return both stdout and stderr back to the caller? Turns out this change isn't needed anyway (it was copied from the other patch). Afterall, the patch is about extracting the restore_command from the config, so... > The TAP part could be refactored on its own. Agreed. I've struggled quite a bit even understanding what's happening in there ;-) > + In case the <filename>postgresql.conf</filename> of your > target cluster is not in the > + target pgdata and you want to use the <option>-c / > --restore-target-wal</option> option, > + provide a (relative or absolute) path to the > <filename>postgresql.conf</filename> using this option. > > This could do a better job to explain in which cases this option can > be used for. You could say something like that: > "This option specifies a path to a configuration file to be used when > starting the target cluster. This makes easier the use of > -c/--restore-target-wal by setting restore_command in the extra > configuration file specified via this option." I reviewed the wording and think it is pretty universal now. > Hmm. What about the target cluster started in --single mode as of > ensureCleanShutdown()? Should we try to apply this option also in > this case? I'd say so; as far as I can tell, "postgres" would deny starting a (Debian/non-standard) cluster the way the code is right now. Which led me to realize we have /* find postgres executable */ rc = find_other_exec(argv0, "postgres", PG_BACKEND_VERSIONSTR, postgres_exec_path); in getRestoreCommand _and_ /* locate postgres binary */ if ((ret = find_other_exec(argv0, "postgres", PG_BACKEND_VERSIONSTR, exec_path)) < 0) in ensureCleanShutdown. Both with the same error messages, AFAICS. D'oh... can of worms. So, how should we call the global "find the ***** 'postgres' executable and boil out if that fails" function? char postgres_exec_path[MAXPGPATH] = findPostgresExec(); ? Anyway, v4 attached so we can settle on the tests and wording... -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
On Mon, Feb 28, 2022 at 08:48:23PM +0100, Gunnar "Nick" Bluth wrote: > So, how should we call the global "find the ***** 'postgres' executable and > boil out if that fails" function? > char postgres_exec_path[MAXPGPATH] = findPostgresExec(); > ? That would mean only a couple of lines gained, and the readability gained is minimal, so I'd be fine to keep the code as-is. I am not sure about the full patch set yet, but the refactoring of the commands to use PQExpBuffer is good by itself, so I have extracted this part of the patch and applied that for now. -- Michael
Attachment
Am 01.03.22 um 05:18 schrieb Michael Paquier: > On Mon, Feb 28, 2022 at 08:48:23PM +0100, Gunnar "Nick" Bluth wrote: >> So, how should we call the global "find the ***** 'postgres' executable and >> boil out if that fails" function? >> char postgres_exec_path[MAXPGPATH] = findPostgresExec(); >> ? > > That would mean only a couple of lines gained, and the readability > gained is minimal, so I'd be fine to keep the code as-is. I am not > sure about the full patch set yet, but the refactoring of the commands > to use PQExpBuffer is good by itself, so I have extracted this part of > the patch and applied that for now. Which made the remaining patch v5 (which also appends the --config-file for ensureCleanShutdown) much cleaner :) Cheers, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
Heya, some minor comments, I didn't look at the added test and I did not test the patch at all, but (as part of the Debian/Ubuntu packaging team) I think this patch is really important: On Tue, Mar 01, 2022 at 12:35:27PM +0100, Gunnar "Nick" Bluth wrote: > diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml > index 33e6bb64ad..d0278e9b46 100644 > --- a/doc/src/sgml/ref/pg_rewind.sgml > +++ b/doc/src/sgml/ref/pg_rewind.sgml > @@ -241,6 +241,18 @@ PostgreSQL documentation > </listitem> > </varlistentry> > > + <varlistentry> > + <term><option>--config-file=<replaceable class="parameter">filepath</replaceable></option></term> > + <listitem> > + <para> > + When using the <option>-c / --restore-target-wal</option> option, the <varname>restore_command</varname> > + is extracted from the configuration of the target cluster. If the <filename>postgresql.conf</filename> > + of that cluster is not in the target pgdata directory (or if you want to use an alternative config), I think we usually just say "data directory"; pgdata was previously only used as part of the option name, not like this in the text. > diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c > index b39b5c1aac..294353a9d5 100644 > --- a/src/bin/pg_rewind/pg_rewind.c > +++ b/src/bin/pg_rewind/pg_rewind.c > @@ -61,6 +61,7 @@ char *datadir_target = NULL; > char *datadir_source = NULL; > char *connstr_source = NULL; > char *restore_command = NULL; > +char *config_file = NULL; > > static bool debug = false; > bool showprogress = false; > @@ -87,6 +88,7 @@ usage(const char *progname) > printf(_("Options:\n")); > printf(_(" -c, --restore-target-wal use restore_command in target configuration to\n" > " retrieve WAL files from archives\n")); > + printf(_(" --config-file=FILE path to postgresql.conf if outside target-pgdata (for -c)\n")); > printf(_(" -D, --target-pgdata=DIRECTORY existing data directory to modify\n")); > printf(_(" --source-pgdata=DIRECTORY source data directory to synchronize with\n")); > printf(_(" --source-server=CONNSTR source server to synchronize with\n")); target-pgdata is the name of the option, but maybe it is useful to spell out "target data directory" here, even if that means we spill over to two lines (which we might have to do anyway, that new line is pretty long). > @@ -121,6 +123,7 @@ main(int argc, char **argv) > {"no-sync", no_argument, NULL, 'N'}, > {"progress", no_argument, NULL, 'P'}, > {"debug", no_argument, NULL, 3}, > + {"config-file", required_argument, NULL, 5}, Not sure what our policy is, but the way the numbered options are 1, 2, 4, 3, 5 after this is weird; this was already off before this patch though. > {NULL, 0, NULL, 0} > }; > int option_index; > @@ -205,6 +208,10 @@ main(int argc, char **argv) > case 4: > no_ensure_shutdown = true; > break; > + > + case 5: > + config_file = pg_strdup(optarg); > + break; > } > } > > @@ -1060,7 +1067,11 @@ getRestoreCommand(const char *argv0) > /* add -D switch, with properly quoted data directory */ > appendPQExpBufferStr(postgres_cmd, " -D "); > appendShellString(postgres_cmd, datadir_target); > - > + if (config_file != NULL) > + { > + appendPQExpBufferStr(postgres_cmd, " --config_file="); > + appendShellString(postgres_cmd, config_file); > + } > /* add -C switch, for restore_command */ You removed an empty line here. Maybe rather prepend this with a comment on what's going on, and have en empty line before and afterwards. > appendPQExpBufferStr(postgres_cmd, " -C restore_command"); > > @@ -1138,6 +1149,11 @@ ensureCleanShutdown(const char *argv0) > /* add set of options with properly quoted data directory */ > appendPQExpBufferStr(postgres_cmd, " --single -F -D "); > appendShellString(postgres_cmd, datadir_target); > + if (config_file != NULL) > + { > + appendPQExpBufferStr(postgres_cmd, " --config_file="); > + appendShellString(postgres_cmd, config_file); > + } > > /* finish with the database name, and a properly quoted redirection */ Kinde same here. Cheers, Michael -- Michael Banck Teamleiter PostgreSQL-Team Projektleiter Tel.: +49 2166 9901-171 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Am 10.03.22 um 14:43 schrieb Michael Banck: > Heya, > > some minor comments, I didn't look at the added test and I did not test > the patch at all, but (as part of the Debian/Ubuntu packaging team) I > think this patch is really important: Much appreciated! [...] > I think we usually just say "data directory"; pgdata was previously only > used as part of the option name, not like this in the text. Fixed. [...] > target-pgdata is the name of the option, but maybe it is useful to spell > out "target data directory" here, even if that means we spill over to > two lines (which we might have to do anyway, that new line is pretty > long). Fixed. > >> @@ -121,6 +123,7 @@ main(int argc, char **argv) >> {"no-sync", no_argument, NULL, 'N'}, >> {"progress", no_argument, NULL, 'P'}, >> {"debug", no_argument, NULL, 3}, >> + {"config-file", required_argument, NULL, 5}, > > Not sure what our policy is, but the way the numbered options are 1, 2, > 4, 3, 5 after this is weird; this was already off before this patch > though. That one has been triggering my inner Monk too ;-) Fixed! [...] > You removed an empty line here. Maybe rather prepend this with a comment > on what's going on, and have en empty line before and afterwards. > Kinde same here. It does smell a bit like "stating the obvious", but alas! Both fixed. Cheers, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
Attachment
Hi, Am Donnerstag, dem 10.03.2022 um 15:28 +0100 schrieb Gunnar "Nick" Bluth: > Am 10.03.22 um 14:43 schrieb Michael Banck: > > some minor comments, I didn't look at the added test and I did not > > test the patch at all, but (as part of the Debian/Ubuntu packaging > > team) I think this patch is really important: > > Much appreciated! > > [...] > > Fixed. > > [...] Thanks for the updated patch. The patch applies, make is ok, make check is ok, pg_rewind TAP tests are ok. I did some tests now with Patroni 2.1.3 and the attached patch applied. The following test was made: 1. Deploy 3-node (pg1, pg2, pg3) patroni cluster on Debian unstable running postgresql-15 (approx. master) 2. Switch on archive_mode, and set archive_command and restore_command 3. Switchover so that pg1 is the primary (if not already) 4. Kill pg1 hard 5. Wait till a new leader has taken over and the (now 2-node) cluster is healthy again 6. Restart pg1 without this patch: |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,084 INFO: running pg_rewind from pg3 |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,102 INFO: running pg_rewind from dbname=postgres user=postgres_rewindhost=10.0.3.227 port=5432 target_session_attrs=read-write |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,135 INFO: pg_rewind exit code=1 |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,136 INFO: stdout= |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,136 INFO: stderr=postgres: could not access the server configurationfile "/var/lib/postgresql/15/main/postgresql.conf": No such file or directory |Apr 03 19:09:01 pg1 patroni@15-main[99]: no data was returned by command "/usr/lib/postgresql/15/bin/postgres -D /var/lib/postgresql/15/main-C restore_command" |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 ERROR: Failed to rewind from healty master: pg3 |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 WARNING: remove_data_directory_on_diverged_timelines isset. removing... |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 INFO: Removing data directory: /var/lib/postgresql/15/main |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,245 INFO: Lock owner: pg3; I am pg1 |Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,248 INFO: trying to bootstrap from leader 'pg3' So pg_rewind fails and Patroni does a re-bootstrap. with this patch: |Apr 03 19:12:35 pg1 patroni@15-main[99]: 2022-04-03 19:12:35,576 INFO: running pg_rewind from pg2 |Apr 03 19:12:35 pg1 patroni@15-main[99]: 2022-04-03 19:12:35,590 INFO: running pg_rewind from dbname=postgres user=postgres_rewindhost=10.0.3.145 port=5432 target_session_attrs=read-write |Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,147 INFO: pg_rewind exit code=0 |Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,148 INFO: stdout= |Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,148 INFO: stderr=pg_rewind: servers diverged at WAL location0/1D000180 on timeline 38 |Apr 03 19:12:37 pg1 patroni@15-main[99]: pg_rewind: rewinding from last common checkpoint at 0/1D000108 on timeline 38 |Apr 03 19:12:37 pg1 patroni@15-main[99]: pg_rewind: Done! |Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,151 WARNING: Postgresql is not running. |Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,151 INFO: Lock owner: pg2; I am pg1 Here, pg_rewind is called and works fine. So I think this works as intended, and I'm marking it Ready for Committer. Michael -- Michael Banck Teamleiter PostgreSQL-Team Projektleiter Tel.: +49 2166 9901-171 E-Mail: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment
Just reposting the previous patches to straighten out the cfbot.
Attachment
On Sun, Apr 03, 2022 at 09:31:32PM +0200, Michael Banck wrote: > The patch applies, make is ok, make check is ok, pg_rewind TAP tests > are ok. + appendPQExpBufferStr(postgres_cmd, " --config_file="); + appendShellString(postgres_cmd, config_file); Nit. This is a valid option for the postgres command, but I'd rather use a -c switch instead, mostly as a matter of self-documentation. In the tests, the only difference between the modes "archive_cli" and "archive" is the extra option given to the pg_rewind command, and that's a simple redirect to what pg_rewind would choose by default anyway. A more elegant solution would be to have an extra option in RewindTest::run_pg_rewind(), where any caller of the routine can feed extra options to the pg_rewind command. Now, in this case, we are not going to lose any coverage if the existing "archive" mode is changed so as we move away postgresql.conf from the target data folder and just use --config-file by default, no? I would make the choice of simplicity, by giving up on "archive_cli" and using primary-postgresql.conf.tmp as value for --config-file. There could be an argument for using --config-file for all the modes, as well. -- Michael
Attachment
On Tue, Apr 05, 2022 at 11:10:30AM +0900, Michael Paquier wrote: > In the tests, the only difference between the modes "archive_cli" and > "archive" is the extra option given to the pg_rewind command, and > that's a simple redirect to what pg_rewind would choose by default > anyway. A more elegant solution would be to have an extra option in > RewindTest::run_pg_rewind(), where any caller of the routine can feed > extra options to the pg_rewind command. Now, in this case, we are > not going to lose any coverage if the existing "archive" mode is > changed so as we move away postgresql.conf from the target data folder > and just use --config-file by default, no? I would make the choice of > simplicity, by giving up on "archive_cli" and using > primary-postgresql.conf.tmp as value for --config-file. There could > be an argument for using --config-file for all the modes, as well. The clock is ticking, so I have looked at this patch by myself. I have finished by tweaking the docs, the code and the tests as of the attached. One thing that I found annoying is the lack of description about the fact that this option affects pg_rewind when it internally starts the target cluster, as of when we retrieve restore_command and when we enforce crash recovery to have a target cluster with a clean shutdown state. The tests are heavily simplified, having no impact on the run-time while improving the coverage for the code paths changed here. -- Michael
Attachment
Am 06.04.22 um 14:04 schrieb Michael Paquier: > On Tue, Apr 05, 2022 at 11:10:30AM +0900, Michael Paquier wrote: >> In the tests, the only difference between the modes "archive_cli" and >> "archive" is the extra option given to the pg_rewind command, and >> that's a simple redirect to what pg_rewind would choose by default >> anyway. A more elegant solution would be to have an extra option in >> RewindTest::run_pg_rewind(), where any caller of the routine can feed >> extra options to the pg_rewind command. Now, in this case, we are >> not going to lose any coverage if the existing "archive" mode is >> changed so as we move away postgresql.conf from the target data folder >> and just use --config-file by default, no? I would make the choice of >> simplicity, by giving up on "archive_cli" and using >> primary-postgresql.conf.tmp as value for --config-file. There could >> be an argument for using --config-file for all the modes, as well. > > The clock is ticking, so I have looked at this patch by myself. I Ta! Sorry, been ridiculously busy these days, and I do confess that I've been struggling a bit with those tests before ;-) > have finished by tweaking the docs, the code and the tests as of the > attached. One thing that I found annoying is the lack of description > about the fact that this option affects pg_rewind when it internally > starts the target cluster, as of when we retrieve restore_command and > when we enforce crash recovery to have a target cluster with a clean > shutdown state. The tests are heavily simplified, having no impact on > the run-time while improving the coverage for the code paths changed > here. > -- > Michael LGTM! Thx again! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato
On Wed, Apr 06, 2022 at 02:32:44PM +0200, Gunnar "Nick" Bluth wrote: > Ta! Sorry, been ridiculously busy these days, and I do confess that I've > been struggling a bit with those tests before ;-) No problem. Double-checked this morning and applied, then. -- Michael
Attachment
Am 07.04.22 um 01:54 schrieb Michael Paquier: > On Wed, Apr 06, 2022 at 02:32:44PM +0200, Gunnar "Nick" Bluth wrote: >> Ta! Sorry, been ridiculously busy these days, and I do confess that I've >> been struggling a bit with those tests before ;-) > > No problem. Double-checked this morning and applied, then. > -- > Michael Wonderful! So, @cyberdemn, who would have thought we get this fixed so quickly? Time to update the Patroni PR once more! ;-) Thanks & best regards, -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de __________________________________________________________________________ "Ceterum censeo SystemD esse delendam" - Cato