RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id TYAPR01MB5866A537AC9AD46E49345A78F5769@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: [PoC] pg_upgrade: allow to upgrade publisher node
RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Dear Peter,

Thank you for reviewing! PSA new version.

> 
> General.
> 
> 1. pg_dump option is documented to the user.
> 
> I'm not sure about exposing the new pg_dump
> --logical-replication-slots-only option to the user.
> 
> I thought this pg_dump option was intended only to be called
> *internally* by the pg_upgrade.
> But, this patch is also documenting the new option for the user (in
> case they want to call it independently?)
> 
> Maybe exposing it  is OK, but if you do that then I thought perhaps
> there should also be some additional pg_dump tests just for this
> option (i.e. tested independently of the pg_upgrade)

Right, I have written the document for the moment, but it should not
If it is not exposed. Removed from the doc.

> Commit message
> 
> 2.
> For pg_upgrade, when '--include-logical-replication-slots' is
> specified, it executes
> pg_dump with the new "--logical-replication-slots-only" option and
> restores from the
> dump. Apart from restoring schema, pg_resetwal must not be called
> after restoring
> replication slots. This is because the command discards WAL files and
> starts from a
> new segment, even if they are required by replication slots. This
> leads to an ERROR:
> "requested WAL segment XXX has already been removed". To avoid this,
> replication slots
> are restored at a different time than other objects, after running pg_resetwal.
> 
> ~~
> 
> The "Apart from" sentence maybe could do with some rewording. I
> noticed there is a code comment (below fragment) that says the same as
> this, but more clearly. Maybe it is better to use that code-comment
> wording in the comment message.
> 
> + * XXX We cannot dump replication slots at the same time as the schema
> + * dump because we need to separate the timing of restoring
> + * replication slots and other objects. Replication slots, in
> + * particular, should not be restored before executing the pg_resetwal
> + * command because it will remove WALs that are required by the slots.

Changed.

> src/bin/pg_dump/pg_dump.c
> 
> 3. main
> 
> + if (dopt.logical_slots_only && !dopt.binary_upgrade)
> + pg_fatal("options --logical-replication-slots-only requires option
> --binary-upgrade");
> +
> + if (dopt.logical_slots_only && dopt.dataOnly)
> + pg_fatal("options --logical-replication-slots-only and
> -a/--data-only cannot be used together");
> + if (dopt.logical_slots_only && dopt.schemaOnly)
> + pg_fatal("options --logical-replication-slots-only and
> -s/--schema-only cannot be used together");
> +
> 
> Consider if it might be simpler to combine together all those
> dopt.logical_slots_only checks.
> 
> SUGGESTION
> 
> if (dopt.logical_slots_only)
> {
>     if (!dopt.binary_upgrade)
>         pg_fatal("options --logical-replication-slots-only requires
> option --binary-upgrade");
> 
>     if (dopt.dataOnly)
>         pg_fatal("options --logical-replication-slots-only and
> -a/--data-only cannot be used together");
>     if (dopt.schemaOnly)
>         pg_fatal("options --logical-replication-slots-only and
> -s/--schema-only cannot be used together");
> }

Right, fixed.

> 4. getLogicalReplicationSlots
> 
> + /* Check whether we should dump or not */
> + if (fout->remoteVersion < 160000 || !dopt->logical_slots_only)
> + return;
> 
> I'm not sure if this check is necessary. Given the way this function
> is called, is it possible for this check to fail? Maybe that quick
> exit would be better code as an Assert?

I think the version check must be needed because it is not done yet.
(Actually I'm not sure the restriction is needed, but now I will keep)
About dopt->logical_slots_only, I agreed to remove that. 

> 5. dumpLogicalReplicationSlot
> 
> +dumpLogicalReplicationSlot(Archive *fout,
> +    const LogicalReplicationSlotInfo *slotinfo)
> +{
> + DumpOptions *dopt = fout->dopt;
> +
> + if (!dopt->logical_slots_only)
> + return;
> 
> (Similar to the previous comment). Is it even possible to arrive here
> when dopt->logical_slots_only is false. Maybe that quick exit would be
> better coded as an Assert?

I think it is not possible, so changed to Assert().

> 6.
> + PQExpBuffer query = createPQExpBuffer();
> + char    *slotname = pg_strdup(slotinfo->dobj.name);
> 
> I wondered if it was really necessary to strdup/free this slotname.
> e.g. And if it is, then why don't you do this for the slotinfo->plugin
> field?

This was a debris for my testing. Removed.

> src/bin/pg_upgrade/check.c
> 
> 7. check_and_dump_old_cluster
> 
>   /* Extract a list of databases and tables from the old cluster */
>   get_db_and_rel_infos(&old_cluster);
> + get_logical_slot_infos(&old_cluster);
> 
> Is it correct to associate this new call with that existing comment
> about "databases and tables"?

Added a comment.

> 8. check_new_cluster
> 
> @@ -188,6 +190,7 @@ void
>  check_new_cluster(void)
>  {
>   get_db_and_rel_infos(&new_cluster);
> + get_logical_slot_infos(&new_cluster);
> 
>   check_new_cluster_is_empty();
> 
> @@ -210,6 +213,9 @@ check_new_cluster(void)
>   check_for_prepared_transactions(&new_cluster);
> 
>   check_for_new_tablespace_dir(&new_cluster);
> +
> + if (user_opts.include_logical_slots)
> + check_for_parameter_settings(&new_cluster);
> 
> Can the get_logical_slot_infos() be done later, guarded by that the
> same condition if (user_opts.include_logical_slots)?

Added.

> 9. check_new_cluster_is_empty
> 
> + * If --include-logical-replication-slots is required, check the
> + * existing of slots
> + */
> 
> Did you mean to say "check the existence of slots"?

Yes, it is my typo. Fixed.

> 10. check_for_parameter_settings
> 
> + if (strcmp(wal_level, "logical") != 0)
> + pg_fatal("wal_level must be \"logical\", but set to \"%s\"",
> + wal_level);
> 
> /but set to/but is set to/

Fixed.

> src/bin/pg_upgrade/info.c
> 
> 11. get_db_and_rel_infos
> 
> + {
>   get_rel_infos(cluster, &cluster->dbarr.dbs[dbnum]);
> 
> + /*
> + * Additionally, slot_arr must be initialized because they will be
> + * checked later.
> + */
> + cluster->dbarr.dbs[dbnum].slot_arr.nslots = 0;
> + cluster->dbarr.dbs[dbnum].slot_arr.slots = NULL;
> + }
> 
> 11a.
> I think probably it would have been easier to just use 'pg_malloc0'
> instead of 'pg_malloc' in the get_db_infos, then this code would not
> be necessary.

I was not sure whether it is OK to change like that because of the
performance efficiency. But OK, fixed.

> 11b.
> BTW, shouldn't this function also be calling free_logical_slot_infos()
> too? That will also have the same effect (initializing the slot_arr)
> but without having to change anything else.
> 
> ~~~
> 
> 12. get_logical_slot_infos
> +/*
> + * Higher level routine to generate LogicalSlotInfoArr for all databases.
> + */
> +void
> +get_logical_slot_infos(ClusterInfo *cluster)
> 
> To be consistent with the other nearby function headers it should have
> another line saying just get_logical_slot_infos().

Added.

> 13. get_logical_slot_infos
> 
> +void
> +get_logical_slot_infos(ClusterInfo *cluster)
> +{
> + int dbnum;
> +
> + for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> + {
> + if (cluster->dbarr.dbs[dbnum].slot_arr.slots)
> + free_logical_slot_infos(&cluster->dbarr.dbs[dbnum].slot_arr);
> +
> + get_logical_slot_infos_per_db(cluster, &cluster->dbarr.dbs[dbnum]);
> + }
> +
> + if (cluster == &old_cluster)
> + pg_log(PG_VERBOSE, "\nsource databases:");
> + else
> + pg_log(PG_VERBOSE, "\ntarget databases:");
> +
> + if (log_opts.verbose)
> + {
> + for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> + {
> + pg_log(PG_VERBOSE, "Database: %s", cluster->dbarr.dbs[dbnum].db_name);
> + print_slot_infos(&cluster->dbarr.dbs[dbnum].slot_arr);
> + }
> + }
> +}
> 
> I didn't see why there are 2 loops exactly the same. I think with some
> minor refactoring these can both be done in the same loop can't they?

The style follows get_db_and_rel_infos(), but... 

> SUGGESTION 1:
> 
> if (cluster == &old_cluster)
>     pg_log(PG_VERBOSE, "\nsource databases:");
> else
>     pg_log(PG_VERBOSE, "\ntarget databases:");
> 
> for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> {
>     if (cluster->dbarr.dbs[dbnum].slot_arr.slots)
>         free_logical_slot_infos(&cluster->dbarr.dbs[dbnum].slot_arr);
> 
>     get_logical_slot_infos_per_db(cluster, &cluster->dbarr.dbs[dbnum]);
> 
>     if (log_opts.verbose)
>     {
>         pg_log(PG_VERBOSE, "Database: %s",
> cluster->dbarr.dbs[dbnum].db_name);
>         print_slot_infos(&cluster->dbarr.dbs[dbnum].slot_arr);
>     }
> }
> 
> ~
> 
> I expected it could be simplified further still by using some variables
> 
> SUGGESTION 2:
> 
> if (cluster == &old_cluster)
>     pg_log(PG_VERBOSE, "\nsource databases:");
> else
>     pg_log(PG_VERBOSE, "\ntarget databases:");
> 
> for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> {
> DbInfo *pDbInfo = &cluster->dbarr.dbs[dbnum];
>     if (pDbInfo->slot_arr.slots)
>         free_logical_slot_infos(&pDbInfo->slot_arr);
> 
>     get_logical_slot_infos_per_db(cluster, pDbInfo);
> 
>     if (log_opts.verbose)
>     {
>         pg_log(PG_VERBOSE, "Database: %s", pDbInfo->db_name);
>         print_slot_infos(&pDbInfo->slot_arr);
>     }
> }

I chose SUGGESTION 2.

> 14. get_logical_slot_infos_per_db
> 
> + char query[QUERY_ALLOC];
> +
> + query[0] = '\0'; /* initialize query string to empty */
> +
> + snprintf(query + strlen(query), sizeof(query) - strlen(query),
> + "SELECT slot_name, plugin, two_phase "
> + "FROM pg_catalog.pg_replication_slots "
> + "WHERE database = current_database() AND temporary = false "
> + "AND wal_status IN ('reserved', 'extended');");
> 
> I didn't understand the purpose of those calls to 'strlen(query)'
> since the string was initialised to empty-string immediately above.

Removed.

> 15.
> +static void
> +print_slot_infos(LogicalSlotInfoArr *slot_arr)
> +{
> + int slotnum;
> +
> + for (slotnum = 0; slotnum < slot_arr->nslots; slotnum++)
> + pg_log(PG_VERBOSE, "slotname: %s: plugin: %s: two_phase %d",
> +    slot_arr->slots[slotnum].slotname,
> +    slot_arr->slots[slotnum].plugin,
> +    slot_arr->slots[slotnum].two_phase);
> +}
> 
> IMO those colons don't make sense.
> 
> BEFORE
> "slotname: %s: plugin: %s: two_phase %d"
> 
> SUGGESTION
> "slotname: %s, plugin: %s, two_phase: %d"

Fixed. I followed print_rel_infos() style, but I prefer yours.

> src/bin/pg_upgrade/pg_upgrade.h
> 
> 16. LogicalSlotInfo
> 
> +typedef struct
> +{
> + char    *slotname; /* slot name */
> + char    *plugin; /* plugin */
> + bool two_phase; /* Can the slot decode 2PC? */
> +} LogicalSlotInfo;
> 
> The RelInfo had a comment for the typedef struct, so I think the
> LogicalSlotInfo struct also should have a comment.

Added.

> 17. DbInfo
> 
>   RelInfoArr rel_arr; /* array of all user relinfos */
> + LogicalSlotInfoArr slot_arr; /* array of all logicalslotinfos */
>  } DbInfo;
> 
> Should the comment say "LogicalSlotInfo" instead of "logicalslotinfos"?

Right, fixed.

> .../t/003_logical_replication_slots.pl
> 
> 18. RESULTS
> 
> I run this by 'make check' in the src/bin/pg_upgrade folder.
> 
> For some reason, the test does not work for me. The results I get are:
> 
> # +++ tap check in src/bin/pg_upgrade +++
> t/001_basic.pl ...................... ok
> t/002_pg_upgrade.pl ................. ok
> t/003_logical_replication_slots.pl .. 3/? # Tests were run but no plan
> was declared and done_testing() was not seen.
> t/003_logical_replication_slots.pl .. Dubious, test returned 29 (wstat
> 7424, 0x1d00)
> All 4 subtests passed
> 
> Test Summary Report
> -------------------
> t/003_logical_replication_slots.pl (Wstat: 7424 Tests: 4 Failed: 0)
>   Non-zero exit status: 29
>   Parse errors: No plan found in TAP output
> Files=3, Tests=27, 128 wallclock secs ( 0.04 usr  0.01 sys + 18.02
> cusr  6.06 csys = 24.13 CPU)
> Result: FAIL
> make: *** [check] Error 1
> 
> ~
> 
> And the log file
> (tmp_check/log/003_logical_replication_slots_old_node.log) shows the
> following ERROR:
> 
> 2023-05-09 12:19:25.330 AEST [32572] 003_logical_replication_slots.pl
> LOG:  statement: SELECT
> pg_create_logical_replication_slot('test_slot', 'test_decoding',
> false, true);
> 2023-05-09 12:19:25.331 AEST [32572] 003_logical_replication_slots.pl
> ERROR:  could not access file "test_decoding": No such file or
> directory
> 2023-05-09 12:19:25.331 AEST [32572] 003_logical_replication_slots.pl
> STATEMENT:  SELECT pg_create_logical_replication_slot('test_slot',
> 'test_decoding', false, true);
> 2023-05-09 12:19:25.335 AEST [32564] LOG:  received immediate shutdown
> request
> 2023-05-09 12:19:25.337 AEST [32564] LOG:  database system is shut down
> 
> ~
> 
> Is it a bug? Or, if I am doing something wrong please let me know how
> to run the test.

Good point. I could not find the problem because I used meson build system.
When I used the traditional make, the ERROR could be reproduced. 
IIUC the problem was occurred the dependency between pg_upgrade and test_decoding
was not set in the Makefile. Hence, I added a variable EXTRA_INSTALL to Makefile in
order to clarify the dependency. This followed other directories like pg_basebackup.

> 19.
> +# Clean up
> +rmtree($new_node->data_dir . "/pg_upgrade_output.d");
> +$new_node->append_conf('postgresql.conf', "wal_level = 'logical'");
> +$new_node->append_conf('postgresql.conf', "max_replication_slots = 0");
> 
> I think the last 2 lines are not "clean up". They are preparations for
> the subsequent test, so maybe they should be commented as such.

Right, it is a preparation for the next. Added a comment.

> 20.
> +# Clean up
> +rmtree($new_node->data_dir . "/pg_upgrade_output.d");
> +$new_node->append_conf('postgresql.conf', "max_replication_slots = 10");
> 
> I think the last line is not "clean up". It is preparation for the
> subsequent test, so maybe it should be commented as such.

Added a comment.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: base backup vs. concurrent truncation
Next
From: Amit Kapila
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply