Re: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id CAA4eK1Jy_5JcRt5acCx4WVoU=UYDTHx7dRXJ8+=FZ5A-AqMUGw@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Responses RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
On Wed, Aug 30, 2023 at 7:55 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some minor review comments for patch v28-0002
>
> ======
> src/sgml/ref/pgupgrade.sgml
>
> 1.
> -       with the primary.)  Replication slots are not copied and must
> -       be recreated.
> +       with the primary.)  Replication slots on old standby are not copied.
> +       Only logical slots on the primary are migrated to the new standby,
> +       and other slots must be recreated.
>        </para>
>
> /on old standby/on the old standby/
>

Fixed.

> ======
> src/bin/pg_upgrade/info.c
>
> 2. get_old_cluster_logical_slot_infos
>
> +void
> +get_old_cluster_logical_slot_infos(void)
> +{
> + int dbnum;
> +
> + /* Logical slots can be migrated since PG17. */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> + return;
> +
> + pg_log(PG_VERBOSE, "\nsource databases:");
> +
> + for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
> + {
> + DbInfo    *pDbInfo = &old_cluster.dbarr.dbs[dbnum];
> +
> + get_old_cluster_logical_slot_infos_per_db(pDbInfo);
> +
> + if (log_opts.verbose)
> + {
> + pg_log(PG_VERBOSE, "Database: \"%s\"", pDbInfo->db_name);
> + print_slot_infos(&pDbInfo->slot_arr);
> + }
> + }
> +}
>
> It might be worth putting an Assert before calling the
> get_old_cluster_logical_slot_infos_per_db(...) just as a sanity check:
> Assert(pDbInfo->slot_arr.nslots == 0);
>
> This also helps to better document the "Note" of the
> count_old_cluster_logical_slots() function comment.
>

I have changed the comments atop count_old_cluster_logical_slots() and
also I don't see the need for this Assert.

> ~~~
>
> 3. count_old_cluster_logical_slots
>
> +/*
> + * count_old_cluster_logical_slots()
> + *
> + * Sum up and return the number of logical replication slots for all databases.
> + *
> + * Note: this function always returns 0 if the old_cluster is PG16 and prior
> + * because old_cluster.dbarr.dbs[dbnum].slot_arr is set only for PG17 and
> + * later.
> + */
> +int
> +count_old_cluster_logical_slots(void)
>
> Maybe that "Note" should be expanded a bit to say who does this:
>
> SUGGESTION
>
> Note: This function always returns 0 if the old_cluster is PG16 and
> prior because old_cluster.dbarr.dbs[dbnum].slot_arr is set only for
> PG17 and later. See where get_old_cluster_logical_slot_infos_per_db()
> is called.
>

Changed, but written differently because saying in terms of variable
name doesn't sound good to me.

> ======
> src/bin/pg_upgrade/pg_upgrade.c
>
> 4.
> + /*
> + * Logical replication slot upgrade only supported for old_cluster >=
> + * PG17.
> + *
> + * Note: This must be done after doing the pg_resetwal command because
> + * pg_resetwal would remove required WALs.
> + */
> + if (count_old_cluster_logical_slots())
> + {
> + start_postmaster(&new_cluster, true);
> + create_logical_replication_slots();
> + stop_postmaster(false);
> + }
> +
>
> 4a.
> I felt this comment needs a bit more detail otherwise you can't tell
> how the >= PG17 version check works.
>
> 4b.
> /slot upgrade only supported/slot upgrade is only supported/
>
> ~
>
> SUGGESTION
>
> Logical replication slot upgrade is only supported for old_cluster >=
> PG17. An explicit version check is not necessary here because function
> count_old_cluster_logical_slots() will always return 0 for old_cluster
> <= PG16.
>

I don't see the need to explain anything about version check here, so
removed that part of the comment.

Apart from this, I have addressed some of the comments raised by you
for the 0003 patch. Please find the diff patch attached. I think we
should combine 0002 and 0003 patches.

I have another comment on the patch:
+ /* Check there are no logical replication slots with a 'lost' state. */
+ res = executeQueryOrDie(conn,
+ "SELECT slot_name FROM pg_catalog.pg_replication_slots "
+ "WHERE wal_status = 'lost' AND "
+ "temporary IS FALSE;");

In this place, shouldn't we explicitly check for slot_type as logical?
I think we should consistently check for slot_type in all the queries
used in this patch.

--
With Regards,
Amit Kapila.

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: cataloguing NOT NULL constraints
Next
From: Amit Kapila
Date:
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node