Home > mailing lists

Re: Logical Replication of sequences - Mailing list pgsql-hackers

From	shveta malik
Subject	Re: Logical Replication of sequences
Date	October 27, 2025 11:29:41
Msg-id	CAJpy0uB7JcrDntEm+nvQ8k74dMzYcDEQZBS70mOcY0N3UcczFg@mail.gmail.com Whole thread
In response to	RE: Logical Replication of sequences ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List	pgsql-hackers

Tree view

On Mon, Oct 27, 2025 at 8:23 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, October 24, 2025 11:22 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 23 Oct 2025 at 16:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > The attached patch has the changes for the same.
> > > >
> > >
> > > I have pushed 0001 and the following are comments on 0002.
> >
> > The attached v20251024 version patch has the changes for the same.
> > The comments from [1] have also been addressed in this version.
>
> Thanks for updating the patch.
>
> I was reviewing 0003 and have some thoughts for simplifying the codes related to
> sequence state invalidations and hash tables:
>
> 1.  I'm considering whether we could lock sequences at the start and maintain
>     these locks until the copy process finishes, allowing us to remove
>     invalidation codes.
>
>    I understand that the current process is:
>
>    1. start a transaction to fetch namespace/seqname for all the sequences in
>       the pg_subscription_rel
>    2. start multiple transation and handle a batch of in each transaction
>
>    So if there are sequence is altered between step 1 and 2, then we need to
>    skip the renamed or dropped sequences in step 2 and invalidates the hash
>    entry which looks inelegant.
>
>    To improve this, my proposal is to postpone the namespace/seqname fetch logic
>    until the second step. Initially, we would fetch just the sequence OIDs.
>    Then, in step 2, we would fetch the namespace/seqname after locking the
>    sequence. This approach ensures that any concurrent RENAME operations between
>    steps are irrelevant, as we will use the latest sequence names to query the
>    publisher, preventing any RENAME during step 2. This logic is also consistent
>    with tablesync process where we lock the table first and get nspname/relname
>    after that.
>
> 2. We currently use a hash table to map remote sequence information to local
>    sequence data. I'm exploring the possibility of using a List instead. By
>    passing the sequence's index in the List to the query:
>
>    The idea is to pass the index of the sequence in the List to the query like:
>
>    "FROM ( VALUES %s ) AS s (schname, seqname, seqidx)"
>
>    Upon receiving the results, we can directly map remote sequences to local
>    ones using:: "list_nth(seqinfos, seqidx);"
>
> Here is a patch atop of 0003 that implements above ideas. Please take a
> look at this and see if it makes the code look better.
>

I like the overall idea here. With this approach, since we first fetch
the relids and then retrieve the sequence names later (after taking
the exclusive lock), our remote query will always use the latest
names, whether they’re the original or altered ones, it doesn’t
matter. The sequence name itself can’t change during this step, so
we’re safe on that front. As a result, the race conditions I mentioned
in [1] and [2] are no longer applicable. I’ll still go through the
patch in more detail to verify and review it.

[1]: https://www.postgresql.org/message-id/CAJpy0uC-Jx2L6tOTnDQ_Zwz99X3HQDik6tG%3D%2B1a71SxZFiy12w%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uAQ43WjvuBi9F_hOJwsa1veGCJJs0ogH1o_o9AAv0jTfg%40mail.gmail.com

thanks
Shveta

pgsql-hackers by date:

From: jian he
Date: 27 October 2025, 11:20:37
Subject: COPY WHERE clause generated/system column reference

From: shveta malik
Date: 27 October 2025, 11:42:33
Subject: Re: Logical Replication of sequences

Re: Logical Replication of sequences - Mailing list pgsql-hackers

Previous

Next