Re: Logical Replication of sequences - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Logical Replication of sequences
Date
Msg-id CAA4eK1Kr1Jd+5NFS6gh5Pe2wQkSi2i01XdxsrJdy3YYNAZyFPg@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Logical Replication of sequences
Re: Logical Replication of sequences
List pgsql-hackers
On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Yeah, starting with a single worker sounds good for now. Do you think
> > we should sync all the sequences in a single transaction or have some
> > threshold value above which a different transaction would be required
> > or maybe a different sequence sync worker altogether? Now, having
> > multiple sequence-sync workers requires some synchronization so that
> > only a single worker is allocated for one sequence.
> >
> > The simplest thing is to use a single sequence sync worker that syncs
> > all sequences in one transaction but with a large number of sequences,
> > it could be inefficient. OTOH, I am not sure if it would be a problem
> > in reality.
>
> I think that we can start with using a single worker and one
> transaction, and measure the performance with a large number of
> sequences.
>

Fair enough. However, this raises the question Dilip and Vignesh are
discussing whether we need a new relfilenode for sequence update even
during initial sync? As per my understanding, the idea is that similar
to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
will create the new sequence entries in pg_subscription_rel with the
state as 'i'. Then the sequence-sync worker would start a transaction
and one-by-one copy the latest sequence values for each sequence (that
has state as 'i' in pg_subscription_rel) and mark its state as ready
'r' and commit the transaction. Now if there is an error during this
operation it will restart the entire operation. The idea of creating a
new relfilenode is to handle the error so that if there is a rollback,
the sequence state will be rolled back to 'i' and the sequence value
will also be rolled back. The other option could be that we update the
sequence value without a new relfilenode and if the transaction rolled
back then only the sequence's state will be rolled back to 'i'. This
would work with a minor inconsistency that sequence values will be
up-to-date even when the sequence state is 'i' in pg_subscription_rel.
I am not sure if that matters because anyway, they can quickly be
out-of-sync with the publisher again.

Now, say we don't want to maintain the state of sequences for initial
sync at all then after the error how will we detect if there are any
pending sequences to be synced? One possibility is that we maintain a
subscription level flag 'subsequencesync' in 'pg_subscription' to
indicate whether sequences need sync. This flag would indicate whether
to sync all the sequences in pg_susbcription_rel. This would mean that
if there is an error while syncing the sequences we will resync all
the sequences again. This could be acceptable considering the chances
of error during sequence sync are low. The benefit is that both the
REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
idea and sync all sequences without needing a new relfilenode. Users
can always refer 'subsequencesync' flag in 'pg_subscription' to see if
all the sequences are synced after executing the command.

> > > Or yet another idea I came up with is that a tablesync worker will
> > > synchronize both the table and sequences owned by the table. That is,
> > > after the tablesync worker caught up with the apply worker, the
> > > tablesync worker synchronizes sequences associated with the target
> > > table as well. One benefit would be that at the time of initial table
> > > sync being completed, the table and its sequence data are consistent.
>
> Correction; it's not guaranteed that the sequence data and table data
> are consistent even in this case since the tablesync worker could get
> on-disk sequence data that might have already been updated.
>

The benefit of this approach is not clear to me. Our aim is to sync
all sequences before the upgrade, so not sure if this helps because
anyway both table values and corresponding sequences can again be
out-of-sync very quickly.

> >
> > > >
> > > > 3) Refreshing the sequence can be achieved through the existing
> > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > > here).
> > > > The subscriber identifies stale sequences, meaning sequences present
> > > > in pg_subscription_rel but absent from the publication, and removes
> > > > them from the pg_subscription_rel system table. The subscriber also
> > > > checks for newly added sequences in the publisher and synchronizes
> > > > their values from the publisher using the steps outlined in the
> > > > subscription creation process. It's worth noting that previously
> > > > synchronized sequences won't be synchronized again; the sequence sync
> > > > will occur solely for the newly added sequences.
> > > >
> > > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > > The subscriber will remove stale sequences and add newly added
> > > > sequences from the publisher. Following this, it will re-synchronize
> > > > the sequence values for all sequences in the updated list from the
> > > > publisher, following the steps outlined in the subscription creation
> > > > process.
> > >
> > > The difference between 3) and 4) is whether or not to re-synchronize
> > > the previously synchronized sequences. Do we really want to introduce
> > > a new command for 4)? I felt that we can invent an option say
> > > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4)
> > > case.
> > >
> >
> > Yeah, that is also an option but it could confuse along with copy_data
> > option. Say the user has selected copy_data = false but
> > copy_all_sequences = true then the first option indicates to *not*
> > copy the data of table and sequences and the second option indicates
> > to copy the sequences data which sounds contradictory. The other idea
> > is to have an option copy_existing_sequences (which indicates to copy
> > existing sequence values) but that also has somewhat the same drawback
> > as copy_all_sequences but to a lesser degree.
>
> Good point. And I understood that the REFRESH PUBLICATION SEQUENCES
> command would be helpful when users want to synchronize sequences
> between two nodes before upgrading.
>

Right.

> >
> > > >
> > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > value from the publisher, along with the page LSN. Incorporate
> > > > SetSequence function, which will procure a new relfilenode for the
> > > > sequence and set the new relfilenode with the specified value. This
> > > > will facilitate rollback in case of any failures.
> > >
> > > Does it mean that we create a new relfilenode for every update of the value?
> > >
> >
> > We need it for initial sync so that if there is an error both the
> > sequence state in pg_subscription_rel and sequence values can be
> > rolled back together.
>
> Agreed.
>
> > However, it is unclear whether we need to create
> > a new relfilenode while copying existing sequences (say during ALTER
> > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
> > decide)? Probably the answer lies in how we want to implement this
> > command. If we want to copy all sequence values during the command
> > itself then it is probably okay but if we want to handover this task
> > to the sequence-sync worker then we need some state management and a
> > new relfilenode so that on error both state and sequence values are
> > rolled back.
>
> What state transition of pg_subscription_rel entries for sequences do
> we need while copying sequences values? For example, we insert an
> entry with 'init' state at CREATE SUBSCRIPTION and then the
> sequence-sync worker updates to 'ready' and copies the sequence data.
> And at REFRESH PUBLICATION SEQUENCES, we update the state back to
> 'init' again so that the sequence-sync worker can process it? Given
> REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it
> might be acceptable to transactionally update sequence values.
>

Do you mean that sync the sequences during the REFRESH PUBLICATION
SEQUENCES command itself? If so, there is an argument that we can do
the same during CREATE SUBSCRIPTION. It would be beneficial to keep
the method to sync the sequences same for both the CREATE and REFRESH
commands. I have speculated on one idea above and would be happy to
see your thoughts.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: Re: Show WAL write and fsync stats in pg_stat_io
Next
From: Daniel Gustafsson
Date:
Subject: Re: RFC: adding pytest as a supported test framework