Re: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Peter Smith
Subject Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id CAHut+Ps0t7tA+APT5e7jBG+wMFv2ic+a4rayhfOsFV40B=1=0A@mail.gmail.com
Whole thread Raw
In response to RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: [PoC] pg_upgrade: allow to upgrade publisher node
RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Here are some review comments for v51-0001

======
src/bin/pg_upgrade/check.c

0.
+check_old_cluster_for_valid_slots(bool live_check)
+{
+ char output_path[MAXPGPATH];
+ FILE    *script = NULL;
+
+ prep_status("Checking for valid logical replication slots");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "invalid_logical_relication_slots.txt");

0a
typo /invalid_logical_relication_slots/invalid_logical_replication_slots/

~

0b.
Since the non-upgradable slots are not strictly "invalid", is this an
appropriate filename for the bad ones?

But I don't have very good alternatives. Maybe:
- non_upgradable_logical_replication_slots.txt
- problem_logical_replication_slots.txt

======
src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl

1.
+# ------------------------------
+# TEST: Confirm pg_upgrade fails when wrong GUC is set on new cluster
+#
+# There are two requirements for GUCs - wal_level and max_replication_slots,
+# but only max_replication_slots will be tested here. This is because to
+# reduce the execution time of the test.


SUGGESTION
# TEST: Confirm pg_upgrade fails when the new cluster has wrong GUC values.
#
# Two GUCs are required - 'wal_level' and 'max_replication_slots' - but to
# reduce the test execution time, only 'max_replication_slots' is tested here.

~~~

2.
+# Preparations for the subsequent test:
+# 1. Create two slots on the old cluster
+$old_publisher->start;
+$old_publisher->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot('test_slot1',
'test_decoding', false, true);"
+);
+$old_publisher->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot('test_slot2',
'test_decoding', false, true);"
+);


Can't you combine those SQL in the same $old_publisher->safe_psql.

~~~

3.
+# Clean up
+rmtree($new_publisher->data_dir . "/pg_upgrade_output.d");
+# Set max_replication_slots to the same value as the number of slots. Both of
+# slots will be used for subsequent tests.
+$new_publisher->append_conf('postgresql.conf', "max_replication_slots = 1");

The code doesn't seem to match the comment - is this correct? The
old_publisher created 2 slots, so why are you setting new_publisher
"max_replication_slots = 1" again?

~~~

4.
+# Preparations for the subsequent test:
+# 1. Generate extra WAL records. Because these WAL records do not get consumed
+# it will cause the upcoming pg_upgrade test to fail.
+$old_publisher->start;
+$old_publisher->safe_psql('postgres',
+ "CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;");
+
+# 2. Advance the slot test_slot2 up to the current WAL location
+$old_publisher->safe_psql('postgres',
+ "SELECT pg_replication_slot_advance('test_slot2', NULL);");
+
+# 3. Emit a non-transactional message. test_slot2 detects the message so that
+# this slot will be also reported by upcoming pg_upgrade.
+$old_publisher->safe_psql('postgres',
+ "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
'This is a non-transactional message');"
+);


I felt this test would be clearer if you emphasised the state of the
test_slot1 also. e.g.

4a.
BEFORE
+# 1. Generate extra WAL records. Because these WAL records do not get consumed
+# it will cause the upcoming pg_upgrade test to fail.

SUGGESTION
# 1. Generate extra WAL records. At this point neither test_slot1 nor test_slot2
#    has consumed them.

~

4b.
BEFORE
+# 2. Advance the slot test_slot2 up to the current WAL location

SUGGESTION
# 2. Advance the slot test_slot2 up to the current WAL location, but test_slot2
#    still has unconsumed WAL records.

~~~

5.
+# pg_upgrade will fail because the slot still has unconsumed WAL records
+command_checks_all(

/because the slot still has/because there are slots still having/

~~~

6.
+ [qr//],
+ 'run of pg_upgrade of old cluster with slot having unconsumed WAL records'
+);

/slot/slots/

~~~

7.
+# And check the content. Both of slots must be reported that they have
+# unconsumed WALs after confirmed_flush_lsn.

SUGGESTION
# Check the file content. Both slots should be reporting that they have
# unconsumed WAL records.


~~~

8.
+# Preparations for the subsequent test:
+# 1. Setup logical replication
+my $old_connstr = $old_publisher->connstr . ' dbname=postgres';
+
+$old_publisher->start;
+
+$old_publisher->safe_psql('postgres',
+ "SELECT * FROM pg_drop_replication_slot('test_slot1');");
+$old_publisher->safe_psql('postgres',
+ "SELECT * FROM pg_drop_replication_slot('test_slot2');");
+
+$old_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION regress_pub FOR ALL TABLES;");


8a.
/Setup logical replication/Setup logical replication (first, cleanup
slots from the previous tests)/

~

8b.
Can't you combine all those SQL in the same $old_publisher->safe_psql.

~~~

9.
+
+# Actual run, successful upgrade is expected
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',
+ '-d', $old_publisher->data_dir,
+ '-D', $new_publisher->data_dir,
+ '-b', $bindir,
+ '-B', $bindir,
+ '-s', $new_publisher->host,
+ '-p', $old_publisher->port,
+ '-P', $new_publisher->port,
+ $mode,
+ ],
+ 'run of pg_upgrade of old cluster');

Now that the "Dry run" part is removed, it seems unnecessary to say
"Actual run" for this part.


SUGGESTION
# pg_upgrade should be successful.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: New WAL record to detect the checkpoint redo location
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: The danger of deleting backup_label