On Mon, Oct 27, 2025 at 8:23 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, October 24, 2025 11:22 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 23 Oct 2025 at 16:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > The attached patch has the changes for the same.
> > > >
> > >
> > > I have pushed 0001 and the following are comments on 0002.
> >
> > The attached v20251024 version patch has the changes for the same.
> > The comments from [1] have also been addressed in this version.
>
> Thanks for updating the patch.
>
> I was reviewing 0003 and have some thoughts for simplifying the codes related to
> sequence state invalidations and hash tables:
>
> 1. I'm considering whether we could lock sequences at the start and maintain
> these locks until the copy process finishes, allowing us to remove
> invalidation codes.
>
> I understand that the current process is:
>
> 1. start a transaction to fetch namespace/seqname for all the sequences in
> the pg_subscription_rel
> 2. start multiple transation and handle a batch of in each transaction
>
> So if there are sequence is altered between step 1 and 2, then we need to
> skip the renamed or dropped sequences in step 2 and invalidates the hash
> entry which looks inelegant.
>
> To improve this, my proposal is to postpone the namespace/seqname fetch logic
> until the second step. Initially, we would fetch just the sequence OIDs.
> Then, in step 2, we would fetch the namespace/seqname after locking the
> sequence. This approach ensures that any concurrent RENAME operations between
> steps are irrelevant, as we will use the latest sequence names to query the
> publisher, preventing any RENAME during step 2.
>
I think this can lead to undetected deadlock for operations across
nodes. Consider the following example: Say on each node, we have an
AlterSequence operation being performed by a concurrent backend in the
form below.
On Node-1:
----------------
Begin
step-1
sequence sync worker: copy_sequences, locked sequence (say seq-1) in
RowExclusive mode;
Begin;
step-2
Alter Sequence seq-1... --step-2 wait on step-1
step-3
Query on pg_get_sequence_data (from Node-2) will wait for Alter
Sequence. --step-3 wait on step-2
On Node-2:
----------------
Begin;
step-1
sequence sync worker: copy_sequences, locked sequence (say seq-1) in
RowExclusive mode;
Begin
step-2
Alter Sequence seq-1 ... -- step-2 wait on step-1
step-3
Query on pg_get_sequence_data (from Node-1) will wait for Alter
Sequence. --step-3 wait on step-2
If the above scenario is possible then the two nodes will create a
deadlock which can't be detected.
--
With Regards,
Amit Kapila.