RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id TYCPR01MB587007EA2F9AB92F0E1F5957F5D4A@TYCPR01MB5870.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Responses RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
Dear Peter,

Thanks for reviewing! PSA new version.

> ======
> src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl
> 
> 1.
> + # 2. max_replication_slots is set to smaller than the number of slots (2)
> + # present on the old cluster
> 
> SUGGESTION
> 2. Set 'max_replication_slots' to be less than the number of slots (2)
> present on the old cluster.

Fixed.

> 2.
> + # Set max_replication_slots to the same value as the number of slots. Both
> + # of slots will be used for subsequent tests.
> 
> SUGGESTION
> Set 'max_replication_slots' to match the number of slots (2) present
> on the old cluster.
> Both slots will be used for subsequent tests.

Fixed.

> 
> 3.
> + # 3. Emit a non-transactional message. test_slot2 detects the message so
> + # that this slot will be also reported by upcoming pg_upgrade.
> + $old_publisher->safe_psql('postgres',
> + "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');"
> + );
> 
> SUGGESTION
> 3. Emit a non-transactional message. This will cause test_slot2 to
> detect the unconsumed WAL record.

Fixed.

> 
> 4.
> + # Preparations for the subsequent test:
> + # 1. Generate extra WAL records. At this point neither test_slot1 nor
> + # test_slot2 has consumed them.
> + $old_publisher->start;
> + $old_publisher->safe_psql('postgres',
> + "CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;");
> +
> + # 2. Advance the slot test_slot2 up to the current WAL location, but
> + # test_slot1 still has unconsumed WAL records.
> + $old_publisher->safe_psql('postgres',
> + "SELECT pg_replication_slot_advance('test_slot2', NULL);");
> +
> + # 3. Emit a non-transactional message. test_slot2 detects the message so
> + # that this slot will be also reported by upcoming pg_upgrade.
> + $old_publisher->safe_psql('postgres',
> + "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');"
> + );
> +
> + $old_publisher->stop;
> 
> All of the above are sequentially executed on the
> old_publisher->safe_psql, so consider if it is worth combining them
> all in a single call (keeping the comments 1,2,3 separate still)
> 
> For example,
> 
> $old_publisher->start;
> $old_publisher->safe_psql('postgres', qq[
>   CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
>   SELECT pg_replication_slot_advance('test_slot2', NULL);
>   SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');
> ]);
> $old_publisher->stop;

Fixed.

> 
> 5.
> + # Clean up
> + $subscriber->stop();
> + $new_publisher->stop();
> 
> Should this also drop the 'test_slot1' and 'test_slot2'?

'test_slot1' and 'test_slot2' have already been removed while preparing in
"Successful upgrade" case. Also, I don't think objects have to be removed at the
end. It is tested by other parts, and it may make the test more difficult to
debug, if there are some failures.

> 6.
> +# Verify that logical replication slots cannot be migrated.  This function
> +# will be executed when the old cluster is PG16 and prior.
> +sub test_upgrade_from_pre_PG17
> +{
> + my ($old_publisher, $new_publisher, $mode) = @_;
> +
> + my $oldbindir = $old_publisher->config_data('--bindir');
> + my $newbindir = $new_publisher->config_data('--bindir');
> 
> SUGGESTION (let's not mention lots of different numbers; just refer to 17)
> This function will be executed when the old cluster version is prior to PG17.

Fixed.


> 7.
> + # Actual run, successful upgrade is expected
> + command_ok(
> + [
> + 'pg_upgrade', '--no-sync',
> + '-d', $old_publisher->data_dir,
> + '-D', $new_publisher->data_dir,
> + '-b', $oldbindir,
> + '-B', $newbindir,
> + '-s', $new_publisher->host,
> + '-p', $old_publisher->port,
> + '-P', $new_publisher->port,
> + $mode,
> + ],
> + 'run of pg_upgrade of old cluster');
> +
> + ok( !-d $new_publisher->data_dir . "/pg_upgrade_output.d",
> + "pg_upgrade_output.d/ removed after pg_upgrade success");
> 
> 7a.
> The comment is wrong?
> 
> SUGGESTION
> # pg_upgrade should NOT be successful

No, pg_uprade will success but no logical replication slots are migrated.
Comments docs were added.

> 7b.
> There is a blank line here before the ok() function, but in the other
> tests, there was none. Better to be consistent.

Removed.

> 8.
> + # Clean up
> + $new_publisher->stop();
> 
> Should this also drop the 'test_slot'?

I don't think so. Please see above.

> 
> 9.
> +# The TAP Cluster.pm assigns default 'max_wal_senders' and 'max_connections'
> to
> +# the same value (10) but PG12 and prior considered max_walsenders as a
> subset
> +# of max_connections, so setting the same value will fail.
> +if ($old_publisher->pg_version->major < 12)
> +{
> + $old_publisher->append_conf(
> + 'postgresql.conf', qq[
> + max_wal_senders = 5
> + max_connections = 10
> + ]);
> +}
> 
> If the comment is correct, then PG12 *and* prior, should be testing
> "<= 12", not "< 12". right?

I analyzed more and I was wrong - we must set GUCs here only for PG9.6-.
Regarding PG11 and PG10, the corresponding constructor will be chosen in new() [a],
and these instance will set max_wal_senders to 5 [b]. 
As for PG9.6-, the related package has not been defined yet so that such a
workaround will not be used. So we must set manually.

Actually, the part will be not needed when Cluster.pm supports PG9.6-. If needed
we can start another thread and support them. For now the case is handled ad-hoc.

> 10.
> +# Test according to the major version of the old cluster.
> +# Upgrading logical replication slots has been supported only since PG17.
> +if ($old_publisher->pg_version->major >= 17)
> 
> This comment seems wrong IMO. I think we always running the latest
> version of pg_upgrade so slot migration is always "supported" from now
> on. IIUC you intended this comment to be saying something about the
> old_publisher slots.
> 
> BEFORE
> Upgrading logical replication slots has been supported only since PG17.
> 
> SUGGESTION
> Upgrading logical replication slots from versions older than PG17 is
> not supported.

Fixed.

[a]:
```
    # Use a subclass as defined below (or elsewhere) if this version
    # isn't fully compatible. Warn if the version is too old and thus we don't
    # have a subclass of this class.
    if (ref $ver && $ver < $min_compat)
    {
        my $maj = $ver->major(separator => '_');
        my $subclass = $class . "::V_$maj";
        if ($subclass->isa($class))
        {
            bless $node, $subclass;
        }
```

[b]:
```
sub init
{
    my ($self, %params) = @_;
    $self->SUPER::init(%params);
    $self->adjust_conf('postgresql.conf', 'max_wal_senders',
        $params{allows_streaming} ? 5 : 0);
}
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Special-case executor expression steps for common combinations
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node