Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm2-6cz2L4=DMab7Zv2USLDpwcXzSWzxZY8kNyK2ommc7Q@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Logical Replication of sequences
List pgsql-hackers
On Wed, 12 Jun 2024 at 17:09, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Jun 12, 2024 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Amit and I engaged in an offline discussion regarding the design and
> > > > contemplated that it could be like below:
> > >
> > > If I understand correctly, does this require the sequences to already
> > > exist on the subscribing node before creating the subscription, or
> > > will it also copy any non-existing sequences?
> >
> > Sequences must exist in the subscriber; we'll synchronize only their
> > values. Any sequences that are not present in the subscriber will
> > trigger an error.
>
> Okay, that makes sense.
>
> >
> > > > 3) Refreshing the sequence can be achieved through the existing
> > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > > here).
> > > > The subscriber identifies stale sequences, meaning sequences present
> > > > in pg_subscription_rel but absent from the publication, and removes
> > > > them from the pg_subscription_rel system table. The subscriber also
> > > > checks for newly added sequences in the publisher and synchronizes
> > > > their values from the publisher using the steps outlined in the
> > > > subscription creation process. It's worth noting that previously
> > > > synchronized sequences won't be synchronized again; the sequence sync
> > > > will occur solely for the newly added sequences.
> > >
> > > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > > The subscriber will remove stale sequences and add newly added
> > > > sequences from the publisher. Following this, it will re-synchronize
> > > > the sequence values for all sequences in the updated list from the
> > > > publisher, following the steps outlined in the subscription creation
> > > > process.
> > >
> > > Okay, this answers my first question: we will remove the sequences
> > > that are removed from the publisher and add the new sequences. I don't
> > > see any problem with this, but doesn't it seem like we are effectively
> > > doing DDL replication only for sequences without having a
> > > comprehensive plan for overall DDL replication?
> >
> > What I intended to convey is that we'll eliminate the sequences from
> > pg_subscription_rel. We won't facilitate the DDL replication of
> > sequences; instead, we anticipate users to create the sequences
> > themselves.
>
> hmm okay.
>
> > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > value from the publisher, along with the page LSN. Incorporate
> > > > SetSequence function, which will procure a new relfilenode for the
> > > > sequence and set the new relfilenode with the specified value. This
> > > > will facilitate rollback in case of any failures.
> > >
> > > I do not understand this point, you mean whenever we are fetching the
> > > sequence value from the publisher we need to create a new relfilenode
> > > on the subscriber?  Why not just update the catalog tuple is
> > > sufficient?  Or this is for handling the ALTER SEQUENCE case?
> >
> > Sequences operate distinctively from tables. Alterations to sequences
> > reflect instantly in another session, even before committing the
> > transaction. To ensure the synchronization of sequence value and state
> > updates in pg_subscription_rel, we assign it a new relfilenode. This
> > strategy ensures that any potential errors allow for the rollback of
> > both the sequence state in pg_subscription_rel and the sequence values
> > simultaneously.
>
> So, you're saying that when we synchronize the sequence values on the
> subscriber side, we will create a new relfilenode to allow reverting
> to the old state of the sequence in case of an error or transaction
> rollback? But why would we want to do that? Generally, even if you
> call nextval() on a sequence and then roll back the transaction, the
> sequence value doesn't revert to the old value. So, what specific
> problem on the subscriber side are we trying to avoid by operating on
> a new relfilenode?

Let's consider a situation where we have two sequences: seq1 with a
value of 100 and seq2 with a value of 200. Now, let's say seq1 is
synced and updated to 100, then we attempt to synchronize seq2,
there's a failure due to the sequence not existing or encountering
some other issue. In this scenario, we don't want to halt operations
where seq1 is synchronized, but the sequence state for sequence isn't
changed to "ready" in pg_subscription_rel.
Updating the sequence data directly reflects the sequence change
immediately. However, if we assign a new relfile node for the sequence
and update the sequence value for the new relfile node, until the
transaction is committed, other concurrent users will still be
utilizing the old relfile node for the sequence, and only the old data
will be visible. Once all sequences are synchronized, and the sequence
state is updated in pg_subscription_rel, the transaction will either
be committed or aborted. If committed, users will be able to observe
the new sequence values because the sequences will be updated with the
new relfile node containing the updated sequence value.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: Remove dependence on integer wrapping
Next
From: Dilip Kumar
Date:
Subject: Re: Conflict Detection and Resolution