Re: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Peter Smith
Subject Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id CAHut+PuS8M4z+adLhk=w0PEvNSmZ3AbYR+_S01Arc5ZA2amvuA@mail.gmail.com
Whole thread Raw
In response to RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Here are some review comments for v52-0001

======
src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl

1.
+ # 2. max_replication_slots is set to smaller than the number of slots (2)
+ # present on the old cluster

SUGGESTION
2. Set 'max_replication_slots' to be less than the number of slots (2)
present on the old cluster.

~~~

2.
+ # Set max_replication_slots to the same value as the number of slots. Both
+ # of slots will be used for subsequent tests.

SUGGESTION
Set 'max_replication_slots' to match the number of slots (2) present
on the old cluster.
Both slots will be used for subsequent tests.

~~~

3.
+ # 3. Emit a non-transactional message. test_slot2 detects the message so
+ # that this slot will be also reported by upcoming pg_upgrade.
+ $old_publisher->safe_psql('postgres',
+ "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
'This is a non-transactional message');"
+ );

SUGGESTION
3. Emit a non-transactional message. This will cause test_slot2 to
detect the unconsumed WAL record.

~~~

4.
+ # Preparations for the subsequent test:
+ # 1. Generate extra WAL records. At this point neither test_slot1 nor
+ # test_slot2 has consumed them.
+ $old_publisher->start;
+ $old_publisher->safe_psql('postgres',
+ "CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;");
+
+ # 2. Advance the slot test_slot2 up to the current WAL location, but
+ # test_slot1 still has unconsumed WAL records.
+ $old_publisher->safe_psql('postgres',
+ "SELECT pg_replication_slot_advance('test_slot2', NULL);");
+
+ # 3. Emit a non-transactional message. test_slot2 detects the message so
+ # that this slot will be also reported by upcoming pg_upgrade.
+ $old_publisher->safe_psql('postgres',
+ "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
'This is a non-transactional message');"
+ );
+
+ $old_publisher->stop;

All of the above are sequentially executed on the
old_publisher->safe_psql, so consider if it is worth combining them
all in a single call (keeping the comments 1,2,3 separate still)

For example,

$old_publisher->start;
$old_publisher->safe_psql('postgres', qq[
  CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
  SELECT pg_replication_slot_advance('test_slot2', NULL);
  SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
'This is a non-transactional message');
]);
$old_publisher->stop;

~~~

5.
+ # Clean up
+ $subscriber->stop();
+ $new_publisher->stop();

Should this also drop the 'test_slot1' and 'test_slot2'?

~~~

6.
+# Verify that logical replication slots cannot be migrated.  This function
+# will be executed when the old cluster is PG16 and prior.
+sub test_upgrade_from_pre_PG17
+{
+ my ($old_publisher, $new_publisher, $mode) = @_;
+
+ my $oldbindir = $old_publisher->config_data('--bindir');
+ my $newbindir = $new_publisher->config_data('--bindir');

SUGGESTION (let's not mention lots of different numbers; just refer to 17)
This function will be executed when the old cluster version is prior to PG17.

~~

7.
+ # Actual run, successful upgrade is expected
+ command_ok(
+ [
+ 'pg_upgrade', '--no-sync',
+ '-d', $old_publisher->data_dir,
+ '-D', $new_publisher->data_dir,
+ '-b', $oldbindir,
+ '-B', $newbindir,
+ '-s', $new_publisher->host,
+ '-p', $old_publisher->port,
+ '-P', $new_publisher->port,
+ $mode,
+ ],
+ 'run of pg_upgrade of old cluster');
+
+ ok( !-d $new_publisher->data_dir . "/pg_upgrade_output.d",
+ "pg_upgrade_output.d/ removed after pg_upgrade success");

7a.
The comment is wrong?

SUGGESTION
# pg_upgrade should NOT be successful

~

7b.
There is a blank line here before the ok() function, but in the other
tests, there was none. Better to be consistent.

~~~

8.
+ # Clean up
+ $new_publisher->stop();

Should this also drop the 'test_slot'?

~~~

9.
+# The TAP Cluster.pm assigns default 'max_wal_senders' and 'max_connections' to
+# the same value (10) but PG12 and prior considered max_walsenders as a subset
+# of max_connections, so setting the same value will fail.
+if ($old_publisher->pg_version->major < 12)
+{
+ $old_publisher->append_conf(
+ 'postgresql.conf', qq[
+ max_wal_senders = 5
+ max_connections = 10
+ ]);
+}

If the comment is correct, then PG12 *and* prior, should be testing
"<= 12", not "< 12". right?

~~~

10.
+# Test according to the major version of the old cluster.
+# Upgrading logical replication slots has been supported only since PG17.
+if ($old_publisher->pg_version->major >= 17)

This comment seems wrong IMO. I think we always running the latest
version of pg_upgrade so slot migration is always "supported" from now
on. IIUC you intended this comment to be saying something about the
old_publisher slots.

BEFORE
Upgrading logical replication slots has been supported only since PG17.

SUGGESTION
Upgrading logical replication slots from versions older than PG17 is
not supported.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: run pgindent on a regular basis / scripted manner
Next
From: Andres Freund
Date:
Subject: Re: run pgindent on a regular basis / scripted manner