Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm0kGfCS5OY7x=JzzAMbOcwh+ykpGtV8JOfN4gG131gAUw@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Fri, 31 Oct 2025 at 11:26, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Vignesh,
>
> For later.... here are some review comments for the documentation
> patch v20251030-0002
>
> ======
> doc/src/sgml/config.sgml
>
> wal_retrieve_retry_interval:
>
> 1.
>         <para>
> -        In logical replication, this parameter also limits how often a failing
> -        replication apply worker or table synchronization worker will be
> -        respawned.
> +        In logical replication, this parameter also limits how quickly a
> +        failing replication apply worker, table synchronization worker, or
> +        sequence synchronization worker will be respawned.
>         </para>
>
> I think you can simplify that.
>
> SUGGESTION
> In logical replication, this parameter also limits how quickly a
> failing replication apply worker, or table/sequence synchronization
> worker will be respawned.

Modified

> ~~~
>
> max_logical_replication_workers:
>
> 2.
>         <para>
>          Specifies maximum number of logical replication workers. This includes
> -        leader apply workers, parallel apply workers, and table synchronization
> -        workers.
> +        leader apply workers, parallel apply workers, table synchronization
> +        workers and a sequence synchronization worker.
>         </para>
>
> I think you can simplify that.
>
> SUGGESTION
> This includes leader apply workers, parallel apply workers, and
> table/sequence synchronization workers.

Modified

> ~~~
>
> max_sync_workers_per_subscription:
>
> 3.
>         <para>
>          Maximum number of synchronization workers per subscription. This
>          parameter controls the amount of parallelism of the initial data copy
> -        during the subscription initialization or when new tables are added.
> +        during the subscription initialization or when new tables or sequences
> +        are added.
>         </para>
>
> But, there is no parallelism at all for sequence copies, because there
> is only one sequencesync worker (as the following docs paragraph
> says), so maybe we do not need this docs change.
>
> NOTE -- see the comment #12 below, and maybe use wording like that.

Modified similarly

> ======
> doc/src/sgml/logical-replication.sgml
>
> Section 29.1 Publication:
>
> 4.
>     Publications may currently only contain tables or sequences. Objects must be
>     added explicitly, except when a publication is created using
>     <literal>FOR TABLES IN SCHEMA</literal>, <literal>FOR ALL TABLES</literal>,
> -   or <literal>FOR ALL SEQUENCES</literal>.
> +   or <literal>FOR ALL SEQUENCES</literal>. Unlike tables, the state of
> +   sequences can be synchronized at any time. For more information, see
> +   <xref linkend="logical-replication-sequences"/>.
>
> Not sure about the wording "the state of". Maybe it can be simplified?
>
> SUGGESTION
> Unlike tables, sequences can be synchronized at any time.

Modified

> ~~~
>
> 5.
> +    <listitem>
> +     <para>
> +      use <link linkend="sql-altersubscription-params-refresh-sequences">
> +      <command>ALTER SUBSCRIPTION ... REFRESH SEQUENCES</command></link>
> +      to re-synchronize all sequences.
> +     </para>
> +    </listitem>
>
> AFAIK it's not going to get any newly added sequences so it is not
> really "all sequences" so this seems misleading. I thought it should
> be like below.
>
> SUGGESTION
> use ALTER SUBSCRIPTION ... REFRESH SEQUENCES to re-synchronize all
> sequences currently known to the subscription.

Modified

> ~~~
>
> Section 29.7.1. Sequence Definition Mismatches:
>
> 6.
> +   <para>
> +    During sequence synchronization, the sequence definitions of the publisher
> +    and the subscriber are compared. An error is logged listing all differing
> +    sequences before the process exits. The apply worker detects this failure
> +    and repeatedly respawns the sequence synchronization worker to retry until
> +    all differences are resolved. See also
> +    <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>.
> +   </para>
>
> It seems a bit misleading. e.g. AFAIK the "The apply worker detects
> this failure" is not true. IIUC, the apply worker simply finds some
> sequences that still have INIT state, so really it has no knowledge of
> failure at all, right?
>
> Consider rewording this part.
>
> SUGGESTION
> The sequence synchronization worker validates that sequence
> definitions match between publisher and subscriber. If mismatches
> exist, the worker logs an error identifying them and exits. The apply
> worker continues respawning the sequence synchronization worker until
> synchronization succeeds.

Modified

> ~~~
>
> Section 29.7.2. Refreshing Stale Sequences:
>
> 7.
> +   <para>
> +    Subscriber side sequence values may frequently become out of sync due to
> +    updates on the publisher.
> +   </para>
> +   <para>
> +    To verify, compare the sequence values between the publisher and
> +    subscriber, and if necessary, execute
> +    <link linkend="sql-altersubscription-params-refresh-sequences">
> +    <command>ALTER SUBSCRIPTION ... REFRESH SEQUENCES</command></link>.
> +   </para>
>
> I didn't see why the wording "To verify" was needed here. Below is a
> slightly simpler alternative for these paragraphs.
>
> SUGGESTION
> Subscriber sequence values drift out of sync as the publisher advances
> them.  Compare values between publisher and subscriber, then run ALTER
> SUBSCRIPTION ... REFRESH SEQUENCES to resynchronize if necessary.

Modified

> ~~~
>
> Section 29.7.3. Examples.
>
> 8. GENERAL. Prompts in examples
>
> I think using prompts like "test_pub#" in the examples is frowned upon
> because it makes cutting directly from the examples more difficult.
> Similarly, the result of the commands is not shown.
>
> See other PG18 logical replication examples for why the current style
> is.... e.g. more like this:
> /* pub # */ CREATE TABLE t1(a int, b int, c text, PRIMARY KEY(a,c));
> /* pub # */ CREATE TABLE t2(d int, e int, f int, PRIMARY KEY(d));

Modified

> ~~~
>
> 9.
> +    Re-synchronize all the sequences on the subscriber using
> +    <link linkend="sql-altersubscription-params-refresh-sequences">
> +    <command>ALTER SUBSCRIPTION ... REFRESH SEQUENCES</command></link>.
>
> SUGGESTION
> Re-synchronize all sequences known to the subscriber using...

Modified

> ~~~
>
> Section 29.9. Restrictions #
>
> 10.
> +     then this should typically not be a problem.  If, however, some kind of
> +     switchover or failover to the subscriber database is intended, then the
> +     sequences would need to be updated to the latest values, either by
> +     executing <link linkend="sql-altersubscription-params-refresh-sequences">
> +     <command>ALTER SUBSCRIPTION ... REFRESH SEQUENCES</command></link>
> +     or by copying the current data from the publisher (perhaps using
> +     <command>pg_dump</command>) or by determining a sufficiently high value
> +     from the tables themselves.
>
> IIUC the "ALTER SUBSCRIPTION ... REFRESH SEQUENCES" is only going to
> resync the sequences that the subscriber already knew about. So, if
> you really wanted to get the latest of everything won't the user need
> to execute double-commands just in case there are some new sequences
> at the publisher?
>
> e.g.
> First, ALTER SUBSCRIPTION REFRESH PUBLICATION
> Then, ALTER SUBSCRIPTION REFRESH SEQUENCES

I felt we don't support DDL, so new one's created should be copied
using pg_dump. I felt existing is ok.

> ~~~
>
> Section 29.13.2. Subscribers #
>
> 11.
> -    workers), plus some reserve for the table synchronization workers and
> -    parallel apply workers.
> +    workers), plus some reserve for the parallel apply workers, table
> synchronization workers, and a sequence
> +    synchronization worker.
>
> I think this can be worded similar to the config.sgml
>
> SUGGESTION
> ... plus some reserve for the parallel apply workers, and
> table/sequence synchronization workers.

Modified

> ~~
>
> 12.
>     <para>
>      <link
linkend="guc-max-sync-workers-per-subscription"><varname>max_sync_workers_per_subscription</varname></link>
> -     controls the amount of parallelism of the initial data copy during the
> -     subscription initialization or when new tables are added.
> +     controls how many tables can be synchronized in parallel during
> +     subscription initialization or when new tables are added. One additional
> +     worker is also needed for sequence synchronization.
>     </para>
>
> Oh, perhaps this is the wording that should have been used in
> config.sgml (review comment #3) to avoid implying about sequencesync
> workers helping with parallelism.

Used the sequence synchronization doc similar to here in #3

> ======
> doc/src/sgml/monitoring.sgml
>
> 13.
>        <para>
>         Type of the subscription worker process.  Possible types are
> -       <literal>apply</literal>, <literal>parallel apply</literal>, and
> -       <literal>table synchronization</literal>.
> +       <literal>apply</literal>, <literal>parallel apply</literal>,
> +       <literal>table synchronization</literal>, and
> +       <literal>sequence synchronization</literal>.
>        </para></entry>
>
> This docs fragment probably belongs with the pg_stats patch, not here.

This is ok here as we display this for running process, the other
stats patch is mainly for errors.

> ======
> doc/src/sgml/ref/create_subscription.sgml
>
> 14.
>            (see <xref linkend="sql-createtype"/> for more about send/receive
> -          functions).
> +          functions). This parameter is not applicable for sequences.
>           </para>
>
> In many places for this page the patch says "This parameter is not
> applicable for sequences."
>
> IMO that is ambiguous. It is not clear if the parameter is silently
> ignored, if it will give an error, or what?
>
> Maybe you can clarify by saying "is ignored" or "has no effect",
> instead of "is not applicable"

Modified

The attached v20251102 version patch has the changes for the same.

Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] Add a guc parameter to control limit clause adjust path cost.
Next
From: Sami Imseih
Date:
Subject: Re: Improve LWLock tranche name visibility across backends