RE: Logical Replication of sequences - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Logical Replication of sequences
Date
Msg-id TY4PR01MB16907FBD3F6EF850014EB5B7B94F5A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers
On Monday, October 20, 2025 10:53 AM Chao Li <li.evan.chao@gmail.com> wrote:
> 
> > On Oct 17, 2025, at 17:34, Chao Li <li.evan.chao@gmail.com> wrote:
> >
> > I may find some time to review 0002 and 0003 next week.

Thanks for the comments.

> > 
> > 2 - 0001  - pg_subscription.c
> > ```
> > @@ -542,12 +560,21 @@ HasSubscriptionTables(Oid subid)
> > + * get_tables: get relations for tables of the subscription.
> > + *
> > + * get_sequences: get relations for sequences of the subscription.
> > + *
> > + * not_ready:
> > + * If getting tables and not_ready is false, retrieve all tables;
> > + * otherwise, retrieve only tables that have not reached the READY state.
> > + * If getting sequences and not_ready is false, retrieve all sequences;
> > + * otherwise, retrieve only sequences that have not reached the READY state.
> > + *
> > ```
> > 
> > This function comment sounds a bit verbose and repetitive. Suggested
> > revision:
> > ```
> > * get_tables: if true, include tables in the returned list.
> >  * get_sequences: if true, include sequences in the returned list.
> >  * not_ready: if true, include only objects that have not reached the READY
> > state;
> >  *            if false, include all objects of the requested type(s).
> > ```

I have reverted this change according to Amit's suggestion because
existing comments is good enough.

> > 
> > 3 - 0001 - subscriptioncmds.c
> > ```
> > +             * Currently, a replication slot is created for all
> > subscriptions,
> > +             * including those for empty or sequence-only
> > publications. While
> > +             * this is unnecessary, optimizing this behavior would
> > ```
> > 
> > Minor tweaks:
> > * "optimizing this behavior” -> “optimizing it”
> > * “doing anything about it” -> “addressing it"

I have re-rewritten this part as well based on some off-list
discussions.

> > 
> > 4 - 0001 - subscriptioncmds.c
> > ```
> >  * 3) For ALTER SUBSCRIPTION ... REFRESH SEQUENCE statements with
> > "copy_data =
> > * true" and "origin = none":
> > * - Warn the user that sequence data from another origin might have been
> > * copied.
> > ```
> > 
> > “Warn the user” -> “Warn users"

I prefer to keep current word as that's the existing style used atop
of check_publications_origin().

> 
> I just reviewed 0002 and 0003. Got some comments for 0002, and no comment
> for 0003.
> 
> 4 - 0002 - launcher.c
> ```
> -    worker = logicalrep_worker_find(subid, relid, true);
> +    worker = logicalrep_worker_find(subid, relid, WORKERTYPE_APPLY,
> true);
> ```
> 
> Based the comment you added to “logicalrep_worker_find()”, for apply worker,
> relid should be set to InvalidOid. (See comment 2)
> 
> Then if you change this function to only work for WORKERTYPE_APPLY, relid
> should be hard code to InvalidOid, so that “relid” can be removed from
> parameter list of logicalrep_worker_wakeup().

I chose to pass different worker type based on the passed OID in this version.
If we decided to change the function interface to allow only apply worker, then
we can change in later version.

> 6 - 0002 - sequencesync.c
> ```
> +report_error_sequences(StringInfo insuffperm_seqs, StringInfo
> mismatched_seqs)
> +{
> +    StringInfo    combined_error_detail = makeStringInfo();
> +    StringInfo    combined_error_hint = makeStringInfo();
> ```
> 
> By the end of this function, I think we need to destroyStringInfo() these two
> strings.

I think we do not need to free these strings as the worker will stop soon after
executing this function. Similarly, some other string/hash table memory free
in copy_sequences() and LogicalRepSyncSequences() also look unnecessary to me,
and I will consider removing them in later version.

> 
> 7 - 0002 - sequencesync.c
> ```
> +static void
> +append_sequence_name(StringInfo buf, const char *nspname, const char
> *seqname,
> +                     int *count)
> +{
> +    if (buf->len > 0)
> +        appendStringInfoString(buf, ", “);
> ```
> 
> Why this “if” check is needed?

I think we want to append comma only when one sequence name
has already been appended to the buf.

> 
> 10 - 0002 - syncutils.c
> ```
>  /*
> - * Common code to fetch the up-to-date sync state info into the static lists.
> + * Common code to fetch the up-to-date sync state info for tables and
> sequences.
>   *
> - * Returns true if subscription has 1 or more tables, else false.
> + * The pg_subscription_rel catalog is shared by tables and sequences.
> Changes
> + * to either sequences or tables can affect the validity of relation states, so
> + * we identify non-ready tables and non-ready sequences together to ensure
> + * consistency.
>   *
> - * Note: If this function started the transaction (indicated by the parameter)
> - * then it is the caller's responsibility to commit it.
> + * Returns true if subscription has 1 or more tables, else false.
>   */
>  bool
> -FetchRelationStates(bool *started_tx)
> +FetchRelationStates(bool *has_pending_sequences, bool *started_tx)
> ```
> has_pending_sequences is not explained in the function comment.
> 
> 11 - 0002 - syncutils.c
> ```
>     /*
> * has_subtables and has_subsequences is declared as static, since the
> * same value can be used until the system table is invalidated.
> */
> static bool has_subtables = false;
> static bool has_subsequences_non_ready = false;
> ```
> 
> The comment says “has_subsequences”, but the real var name is
> “has_subsequences_non_ready".
> 
> 12 - 0002 - syncutils.c
> ```
>  bool
> -FetchRelationStates(bool *started_tx)
> +FetchRelationStates(bool *has_pending_sequences, bool *started_tx)
> ```
> 
> I searched over the code, this function has 3 callers, none of them want results
> for both table and sequence, so that a caller, for example:
> 
> ```
> bool
> HasSubscriptionTablesCached(void)
> {
> bool started_tx;
> bool has_subrels;
> bool has_pending_sequences;
> 
> /* We need up-to-date subscription tables info here */
> has_subrels = FetchRelationStates(&has_pending_sequences, &started_tx);
> 
> if (started_tx)
> {
> CommitTransactionCommand();
> pgstat_report_stat(true);
> }
> 
> return has_subrels;
> }
> ```
> 
> Looks confused because it defines has_pending_sequences but not using it at
> all.
> 
> So, I think FetchRelationStates() can be refactored to return a result for either
> table or sequence, because on a type parameter.

I will think more for this function and change in next version.

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Logical Replication of sequences
Next
From: Chao Li
Date:
Subject: Re: Use CompactAttribute more often, when possible