RE: Logical Replication of sequences - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Logical Replication of sequences
Date
Msg-id TY4PR01MB169078D0BB792F1EE438A2EE294FCA@TY4PR01MB16907.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (vignesh C <vignesh21@gmail.com>)
Responses Re: Logical Replication of sequences
Re: Logical Replication of sequences
List pgsql-hackers
On Friday, October 24, 2025 11:22 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> On Thu, 23 Oct 2025 at 16:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > The attached patch has the changes for the same.
> > >
> >
> > I have pushed 0001 and the following are comments on 0002.
> 
> The attached v20251024 version patch has the changes for the same.
> The comments from [1] have also been addressed in this version.

Thanks for updating the patch.

I was reviewing 0003 and have some thoughts for simplifying the codes related to
sequence state invalidations and hash tables:

1.  I'm considering whether we could lock sequences at the start and maintain
    these locks until the copy process finishes, allowing us to remove
    invalidation codes.

   I understand that the current process is:

   1. start a transaction to fetch namespace/seqname for all the sequences in
      the pg_subscription_rel
   2. start multiple transation and handle a batch of in each transaction

   So if there are sequence is altered between step 1 and 2, then we need to
   skip the renamed or dropped sequences in step 2 and invalidates the hash
   entry which looks inelegant.

   To improve this, my proposal is to postpone the namespace/seqname fetch logic
   until the second step. Initially, we would fetch just the sequence OIDs.
   Then, in step 2, we would fetch the namespace/seqname after locking the
   sequence. This approach ensures that any concurrent RENAME operations between
   steps are irrelevant, as we will use the latest sequence names to query the
   publisher, preventing any RENAME during step 2. This logic is also consistent
   with tablesync process where we lock the table first and get nspname/relname
   after that.

2. We currently use a hash table to map remote sequence information to local
   sequence data. I'm exploring the possibility of using a List instead. By
   passing the sequence's index in the List to the query:

   The idea is to pass the index of the sequence in the List to the query like:

   "FROM ( VALUES %s ) AS s (schname, seqname, seqidx)"

   Upon receiving the results, we can directly map remote sequences to local
   ones using:: "list_nth(seqinfos, seqidx);"

Here is a patch atop of 0003 that implements above ideas. Please take a
look at this and see if it makes the code look better.

Best Regards,
Hou zj

Attachment

pgsql-hackers by date:

Previous
From: Steven Niu
Date:
Subject: remove obsolete comment in AtEOXact_Inval
Next
From: David Rowley
Date:
Subject: Re: Use BumpContext contexts for TupleHashTables' tablecxt