RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers
From | Hayato Kuroda (Fujitsu) |
---|---|
Subject | RE: [PoC] pg_upgrade: allow to upgrade publisher node |
Date | |
Msg-id | TYAPR01MB5866DD3348B5224E0A1BFC3EF51CA@TYAPR01MB5866.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: [PoC] pg_upgrade: allow to upgrade publisher node (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: [PoC] pg_upgrade: allow to upgrade publisher node
Re: [PoC] pg_upgrade: allow to upgrade publisher node Re: [PoC] pg_upgrade: allow to upgrade publisher node |
List | pgsql-hackers |
Dear Peter, Thanks for giving comments! PSA the new version. > ====== 1. GENERAL Please try to run a spell/grammar check on all the text like commit message and docs changes before posting (e.g. cut/pastethe rendered text into some tool like MSWord or Grammarly or ChatGPT or whatever tool you like and cross-check).There are lots of small typos etc but one up-front check could avoid long cycles of reviewing/reporting/fixing/re-posting/confirming... > I checked all of sentences for Grammarly. Sorry for poor English. > ====== Commit message 2. Note that slot restoration must be done after the final pg_resetwal command during the upgrade because pg_resetwal will remove WALs that are required by the slots. Due to ths restriction, the timing of restoring replication slots is different from other objects. ~ /ths/this/ > Fixed. > doc/src/sgml/ref/pgupgrade.sgml 3. + <para> + Before you start upgrading the publisher cluster, ensure that the + subscription is temporarily disabled, by executing + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... DISABLE</command></link>. + After the upgrade is complete, then re-enable the subscription. Note that + if the new cluser uses different port number from old one, + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... CONNECTION</command></link> + command must be also executed on subscriber. + </para> 3a. BEFORE After the upgrade is complete, then re-enable the subscription. SUGGESTION Re-enable the subscription after the upgrade. > Fixed. > 3b. /cluser/cluster/ ~ 3c. Note that + if the new cluser uses different port number from old one, + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... CONNECTION</command></link> + command must be also executed on subscriber. SUGGESTION Note that if the new cluster uses a different port number ALTER SUBSCRIPTION ... CONNECTION command must be also executedon the subscriber. > The part was removed. > 4. + <listitem> + <para> + <structfield>confirmed_flush_lsn</structfield> (see <xref linkend="view-pg-replication-slots"/>) + of all slots on old cluster must be same as latest checkpoint location. + </para> + </listitem> 4a. /on old cluster/on the old cluster/ > Fixed. > 4b. /as latest/as the latest/ > Fixed. > 5. + <listitem> + <para> + The output plugins referenced by the slots on the old cluster must be + installed on the new PostgreSQL executable directory. + </para> + </listitem> /installed on/installed in/ ?? > "installed in" is better, fixed. > 6. + <listitem> + <para> + The new cluster must have + <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link> + configured to value larger than the existing slots on the old cluster. + </para> + </listitem> BEFORE ...to value larger than the existing slots on the old cluster. SUGGESTION ...to a value greater than or equal to the number of slots present on the old cluster. > Fixed. > src/bin/pg_upgrade/check.c 7. GENERAL - check_for_logical_replication_slots AFAICT this function is called *only* for the new_cluster, yet there is no Assert and no checking inside this function toensure that is the case or not. It seems strange that the *cluster is passed as an argument but then the whole functionbody and messages assume it can only be a new cluster anyway. IMO it would be better to rename this function to something like check_new_cluster_logical_replication_slots() and DO NOTpass any parameter but just use the global new_cluster within the function body. > Hmm, I followed other functions, e.g., check_for_composite_data_type_usage() is called only for old one but it has an argument *cluster. What is the difference between them? Moreover, how about check_for_lost_slots() and check_for_confirmed_flush_lsn()? Fixed for the moment. > 8. check_for_logical_replication_slots + /* logical replication slots can be migrated since PG17. */ + if (GET_MAJOR_VERSION(cluster->major_version) < 1700) + return; Start comment with uppercase for consistency. > The part was removed. > 9. check_for_logical_replication_slots + res = executeQueryOrDie(conn, "SELECT slot_name " + "FROM pg_catalog.pg_replication_slots " + "WHERE slot_type = 'logical' AND " + "temporary IS FALSE;"); + + if (PQntuples(res)) + pg_fatal("New cluster must not have logical replication slot, but found \"%s\"", + PQgetvalue(res, 0, 0)); /replication slot/replication slots/ > Fixed. > 10. check_for_logical_replication_slots + /* + * Do additional checks when the logical replication slots have on the old + * cluster. + */ + if (nslots) SUGGESTION Do additional checks when there are logical replication slots on the old cluster. > Per suggestion from Amit, the part was removed. > 11. + if (nslots > max_replication_slots) + pg_fatal("max_replication_slots must be greater than or equal to existing logical " + "replication slots on old cluster."); 11a. SUGGESTION max_replication_slots (%d) must be greater than or equal to the number of logical replication slots (%d) on the old cluster. 11b. I think it would be helpful for the current values to be displayed in the fatal message so the user will know more aboutwhat value to set. Notice that my above suggestion has some substitution markers. > Changed. > src/bin/pg_upgrade/info.c 12. +static void +print_slot_infos(LogicalSlotInfoArr *slot_arr) +{ + int slotnum; + + for (slotnum = 0; slotnum < slot_arr->nslots; slotnum++) + { + LogicalSlotInfo *slot_info = &slot_arr->slots[slotnum]; + pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\", two_phase: %d", + slot_info->slotname, + slot_info->plugin, + slot_info->two_phase); + } +} Better to have a blank line after the 'slot_info' declaration. > Added. > .../pg_upgrade/t/http://003_logical_replication_slots.pl 13. +# ------------------------------ +# TEST: Confirm pg_upgrade fails when new cluster wal_level is not 'logical' + +# Create a slot on old cluster +$old_publisher->start; +$old_publisher->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('test_slot1', 'test_decoding', false, true);" +); +$old_publisher->stop; 13a. It would be nicer if all the test parts have identical formats. So here it should also say # Preparations for the subsequent test: # 1. Create a slot on the old cluster > I did not use because there was only one step, but followed the style. > 13b. Notice the colon (:) at the end of that comment "Preparations for the subsequent test:". All the other preparation commentsin this file should also have a colon. > Added. > 14. +# Cause a failure at the start of pg_upgrade because wal_level is replica SUGGESTION # pg_upgrade will fail because the new cluster wal_level is 'replica' > Fixed. > 15. +# 1. Create an unnecessary slot on the old cluster (but it is not unnecessary -- it is necessary for this test!) SUGGESTION +# 1. Create a second slot on the old cluster > Fixed. > 16. +# Cause a failure at the start of pg_upgrade because the new cluster has +# insufficient max_replication_slots SUGGESTION # pg_upgrade will fail because the new cluster has insufficient max_replication_slots > Fixed. > 17. +# Preparations for the subsequent test. +# 1. Remove an unnecessary slot SUGGESTION +# 1. Remove the slot 'test_slot2', leaving only 1 slot remaining on the old cluster, so the new cluster config max_replication_slots=1will now be enough. > Fixed. > 18. +$new_publisher->start; +my $result = $new_publisher->safe_psql('postgres', + "SELECT slot_name, two_phase FROM pg_replication_slots"); +is($result, qq(test_slot1|t), 'check the slot exists on new cluster'); +$new_publisher->stop; + +done_testing(); Maybe should be some added comments like: # Check that the slot 'test_slot1' has migrated to the new cluster. > Added. Best Regards, Hayato Kuroda FUJITSU LIMITED
Attachment
pgsql-hackers by date: