Re: Documentation to upgrade logical replication cluster - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Documentation to upgrade logical replication cluster
Date
Msg-id CALj2ACWm6_6nesFkNvpxMKfwED5v6HJ1z8becjY4+hrRbbjm-A@mail.gmail.com
Whole thread Raw
In response to Re: Documentation to upgrade logical replication cluster  (vignesh C <vignesh21@gmail.com>)
Responses Re: Documentation to upgrade logical replication cluster
List pgsql-hackers
On Mon, Jan 29, 2024 at 10:10 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v5 version patch has the changes
> for the same.

Thanks for working on this. Here are some comments on the v5 patch:

1.
+  <para>
+   Migration of logical replication clusters is possible only when all the
+   members of the old logical replication clusters are version 17.0 or later.

Perhaps define what logical replication cluster is either in glossary
or within a parenthesis next to the first use in the docs? This will
help developers understand it better and will not confuse them with
postgres cluster. I see it being used for the first time in code
comments 9a17be1e2, but this patch uses it for the first time in the
docs.

2.
+   Before reading this section, refer <xref linkend="pgupgrade"/> page for
+   more details about pg_upgrade.
+  </para>

This looks extraneous, we can just link to pg_upgrade on the first use
of pg_upgrade, change the following

+   <para>
+    <application>pg_upgrade</application> attempts to migrate logical
+    slots. This helps avoid the need for manually defining the same

to

+   <para>
+    <xref linkend="pgupgrade"/> attempts to migrate logical
+    slots. This helps avoid the need for manually defining the same

3.
+     transactional, the user is advised to take backups. Backups can be taken
+     as described in <xref linkend="backup-base-backup"/>.
+    </para>

How about simplifying the above to "the user is advised to take
backups as described in <xref linkend="backup-base-backup"/>" instead
of two statements?

4.
  subscription is temporarily disabled, by executing
+    <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION
... DISABLE</command></link>.
+    Re-enable the subscription after the upgrade.
+   </para>

Is it to avoid repeated failures of logical replication apply workers
on the subscribers? Isn't it good to say why subscription needs to be
disabled?

5.
+   <para>
+    There are some prerequisites for <application>pg_upgrade</application> to
+    be able to upgrade the logical slots. If these are not met an error
+    will be reported.
+   </para>

I think it's better to be "Following are prerequisites for
<application>pg_upgrade</application> to.."?

6.
+    <listitem>
+     <para>
+      The old cluster has replicated all the transactions and logical decoding
+      messages to subscribers.
+     </para>

I think it's better to be "The old cluster must have replicated all
the transactions and ...."?

7.
+     <para>
+      The new cluster must not have permanent logical slots, i.e.,
+      there must be no slots where
+      <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>temporary</structfield>
+      is <literal>false</literal>.

I think we better specify a full SQL query as opposed to just
specifying one output column and the view name.

<para>
 The new cluster must not have permanent logical slots, i.e., a query like:
<programlisting>
SELECT count(*) FROM pg_replication_slots WHERE slot_type = 'logical'
AND temporary IS false;
</programlisting>
    must return 0.
   </para>

8.
+       If the old cluster is prior to 17.0, then no slots on the primary are
+       copied to the new standby, so all the slots must be recreated manually.
+       If the old cluster is 17.0 or later, then only logical slots on the

I think it's better to say "version 17.0" instead of just "17.0".

9.
+       primary are copied to the new standby, but other slots on the
old standby

"but other slots on the old standby" - is it slots on the old standby
or old cluster?

I think it's the other way around: the old cluster needs to be
replaced with the old standby in the newly added paragraph.

10.
Change
+       primary are copied to the new standby, but other slots on the
old standby
+       are not copied so must be recreated manually.

to

+       primary are copied to the new standby, but other slots on the
old standby
+       are not copied, so must be recreated manually.

11.
+   <note>
+    <para>
+     The logical replication restrictions apply to logical replication cluster
+     upgrades also. See <xref linkend="logical-replication-restrictions"/> for
+     the details of logical replication restrictions.
+    </para>

How about just say "See <xref
linkend="logical-replication-restrictions"/> for details." instead of
using logical replication restrictions more than once in the same
para?

12.
+    <para>
+     The prerequisites of publisher upgrade apply to logical replication
+     cluster upgrades also. See <xref linkend="prepare-publisher-upgrades"/>
+     for the details of publisher upgrade prerequisites.

How about just say "See <xref linkend="prepare-publisher-upgrades"/>
for details." instead of using publisher upgrade prerequisites more
than once in the same para?

13.
+    <para>
+     The prerequisites of subscriber upgrade apply to logical replication
+     cluster upgrades also. See <xref linkend="prepare-subscriber-upgrades"/>
+     for the details of subscriber upgrade prerequisites.
+    </para>

How about just say "See <xref linkend="prepare-subscriber-upgrades"/>
for details." instead of using subscriber upgrade prerequisites more
than once in the same para?

14.
+     Upgrading logical replication cluster requires multiple steps to be
+     performed on various nodes. Because not all operations are

Per comment #1, defining logical replication clusters and nodes helps
clearly distinguish. For instance, one can get confused with the
various terms in hand - postgres cluster, logical replication cluster,
node etc.

15.
+      two subscriptions <literal>sub1_node1_node2</literal> and
+      <literal>sub2_node1_node2</literal> which are subscribing the changes
+      from <literal>node1</literal>.

Why confluse with subsription names by including node1 and node2 in
it? We are not creating subscriptions from node1 to node2, are we? I'd
recommend using simplified names like mysub1, mysub2 like elsewhere in
the documentation.

16.
+      Let's say publisher is in <literal>node1</literal> and subscriber is
+      in <literal>node2</literal>.

How about saying "publisher is in a database cluster named
<literal>node1</literal> and subscriber is in database cluster named
<literal>node2</literal>"? I think using this terminology helps.

17.
+     refer to <xref linkend="logical-replication-upgrade"/> for details.
+    </para>
+   </note>

IMHO, it could have been better if steps to upgrade the logical
replication cluster is specified in pgupgrade.sgml as opposed to
logical-replication.sgml. Because, upgrading logical replication
cluster is a sub-section for pg_upgrade.

18.
+   <para>
+    The steps to upgrade the following logical replication clusters are
+    detailed below:
+    <itemizedlist>
+     <listitem>
+      <para>
+       Follow the steps specified in

I think we can talk about what advantages upgrading logical
replication clusters brings in. We can say that the pg_upgrade makes
it possible 1) to re-use the logical replication slots post-upgrade,
2) to re-use the subscribers i.e. now it's not required to re-create
all the logical subscribers after the upgrade, so no initial table
sync, no creation of new clusters for subscribers etc.

19. I think we can talk about the possible gotchas i.e. the things
that can go wrong while performing any of the prescribed steps. What
happens if the slot the pg_upgrade interrupts after it upgraded a few
of the replication slots or if some of the prerequisites are not met
etc.?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: MERGE ... WHEN NOT MATCHED BY SOURCE
Next
From: Amit Kapila
Date:
Subject: Re: Synchronizing slots from primary to standby