RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id TYAPR01MB58661B5F7DF15C2B5D0A2E16F5C3A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Dear Bharath,

Again, thank you for reviewing! PSA a new version.

> 
> Here are some more comments/thoughts on the v44 patch:
> 
> 1.
> +# pg_upgrade will fail because the slot still has unconsumed WAL records
> +command_fails(
> +    [
> 
> Add a test case to hit fprintf(script, "The slot \"%s\" is invalid\n",
> file as well?

Added. The test was not added because 002_pg_upgrade.pl did not do similar checks,
but it is worth verifying. One difficulty was that output directory had millisecond
timestamp, so the absolute path could not be predicted. So File::Find::find was
used to detect the file.

> 2.
> +    'run of pg_upgrade where the new cluster has insufficient
> max_replication_slots');
> +ok( -d $new_publisher->data_dir . "/pg_upgrade_output.d",
> +    "pg_upgrade_output.d/ not removed after pg_upgrade failure");
> 
> +    'run of pg_upgrade where the new cluster has the wrong wal_level');
> +ok( -d $new_publisher->data_dir . "/pg_upgrade_output.d",
> +    "pg_upgrade_output.d/ not removed after pg_upgrade failure");
> 
> +    'run of pg_upgrade of old cluster with idle replication slots');
> +ok( -d $new_publisher->data_dir . "/pg_upgrade_output.d",
> +    "pg_upgrade_output.d/ not removed after pg_upgrade failure");
> 
> How do these tests recognize the failures are the intended ones? I
> mean, for instance when pg_upgrade fails for unused replication
> slots/unconsumed WAL records, then just looking at the presence of
> pg_upgrade_output.d might not be sufficient, no? Using
> command_fails_like instead of command_fails and looking at the
> contents of invalid_logical_relication_slots.txt might help make these
> tests more focused.

Yeah, currently the output was not checked. I checked and found that pg_upgrade
would output all messages (including error message) to stdout, so
command_fails_like() could not be used. Therefore, command_checks_all() was used
instead.

> 3.
> +        pg_log(PG_REPORT, "fatal");
> +        pg_fatal("Your installation contains invalid logical
> replication slots.\n"
> +                 "These slots can't be copied, so this cluster cannot
> be upgraded.\n"
> +                 "Consider removing such slots or consuming the
> pending WAL if any,\n"
> +                 "and then restart the upgrade.\n"
> +                 "A list of invalid logical replication slots is in
> the file:\n"
> +                 "    %s", output_path);
> 
> It's not just the invalid logical replication slots, but also the
> slots with unconsumed WALs which aren't invalid and can be upgraded if
> ensured the WAL is consumed. So, a better wording would be:
>         pg_fatal("Your installation contains logical replication slots
> that cannot be upgraded.\n"
>                  "List of all such logical replication slots is in the file:\n"
>                  "These slots can't be copied, so this cluster cannot
> be upgraded.\n"
>                  "Consider removing invalid slots and/or consuming the
> pending WAL if any,\n"
>                  "and then restart the upgrade.\n"
>                  "    %s", output_path);

Fixed.

Also, I ran pgperltidy. Some formattings were changed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

pgsql-hackers by date:

Previous
From: Yugo NAGATA
Date:
Subject: Re: Doc: vcregress .bat commands list lacks "taptest"
Next
From: NINGWEI CHEN
Date:
Subject: Re: Remove MSVC scripts from the tree