Re: Logical Replication of sequences - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Logical Replication of sequences
Date
Msg-id CAD21AoD-0WH5EK3ivOrbE==YLQzD=Qm31X4xySPawNevucj6dw@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Logical Replication of sequences
List pgsql-hackers
On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Yeah, starting with a single worker sounds good for now. Do you think
> > > we should sync all the sequences in a single transaction or have some
> > > threshold value above which a different transaction would be required
> > > or maybe a different sequence sync worker altogether? Now, having
> > > multiple sequence-sync workers requires some synchronization so that
> > > only a single worker is allocated for one sequence.
> > >
> > > The simplest thing is to use a single sequence sync worker that syncs
> > > all sequences in one transaction but with a large number of sequences,
> > > it could be inefficient. OTOH, I am not sure if it would be a problem
> > > in reality.
> >
> > I think that we can start with using a single worker and one
> > transaction, and measure the performance with a large number of
> > sequences.
> >
>
> Fair enough. However, this raises the question Dilip and Vignesh are
> discussing whether we need a new relfilenode for sequence update even
> during initial sync? As per my understanding, the idea is that similar
> to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> will create the new sequence entries in pg_subscription_rel with the
> state as 'i'. Then the sequence-sync worker would start a transaction
> and one-by-one copy the latest sequence values for each sequence (that
> has state as 'i' in pg_subscription_rel) and mark its state as ready
> 'r' and commit the transaction. Now if there is an error during this
> operation it will restart the entire operation. The idea of creating a
> new relfilenode is to handle the error so that if there is a rollback,
> the sequence state will be rolled back to 'i' and the sequence value
> will also be rolled back. The other option could be that we update the
> sequence value without a new relfilenode and if the transaction rolled
> back then only the sequence's state will be rolled back to 'i'. This
> would work with a minor inconsistency that sequence values will be
> up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> I am not sure if that matters because anyway, they can quickly be
> out-of-sync with the publisher again.

I think it would be fine in many cases even if the sequence value is
up-to-date even when the sequence state is 'i' in pg_subscription_rel.
But the case we would like to avoid is where suppose the sequence-sync
worker does both synchronizing sequence values and updating the
sequence states for all sequences in one transaction, and if there is
an error we end up retrying the synchronization for all sequences.

>
> Now, say we don't want to maintain the state of sequences for initial
> sync at all then after the error how will we detect if there are any
> pending sequences to be synced? One possibility is that we maintain a
> subscription level flag 'subsequencesync' in 'pg_subscription' to
> indicate whether sequences need sync. This flag would indicate whether
> to sync all the sequences in pg_susbcription_rel. This would mean that
> if there is an error while syncing the sequences we will resync all
> the sequences again. This could be acceptable considering the chances
> of error during sequence sync are low. The benefit is that both the
> REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> idea and sync all sequences without needing a new relfilenode. Users
> can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> all the sequences are synced after executing the command.

I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
while the sequence-sync worker is synchronizing sequences. In this
case, the worker might not see new sequences added by the concurrent
REFRESH PUBLICATION {SEQUENCES} command since it's already running.
The worker could end up marking the subsequencesync as completed while
not synchronizing these new sequences.

>
> > > > Or yet another idea I came up with is that a tablesync worker will
> > > > synchronize both the table and sequences owned by the table. That is,
> > > > after the tablesync worker caught up with the apply worker, the
> > > > tablesync worker synchronizes sequences associated with the target
> > > > table as well. One benefit would be that at the time of initial table
> > > > sync being completed, the table and its sequence data are consistent.
> >
> > Correction; it's not guaranteed that the sequence data and table data
> > are consistent even in this case since the tablesync worker could get
> > on-disk sequence data that might have already been updated.
> >
>
> The benefit of this approach is not clear to me. Our aim is to sync
> all sequences before the upgrade, so not sure if this helps because
> anyway both table values and corresponding sequences can again be
> out-of-sync very quickly.

Right.

Given that our aim is to sync all sequences before the upgrade, do we
need to synchronize sequences even at CREATE SUBSCRIPTION time? In
cases where there are a large number of sequences, synchronizing
sequences in addition to tables could be overhead and make less sense,
because sequences can again be out-of-sync quickly and typically
CREATE SUBSCRIPTION is not created just before the upgrade.

>
> > >
> > > > >
> > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > > value from the publisher, along with the page LSN. Incorporate
> > > > > SetSequence function, which will procure a new relfilenode for the
> > > > > sequence and set the new relfilenode with the specified value. This
> > > > > will facilitate rollback in case of any failures.
> > > >
> > > > Does it mean that we create a new relfilenode for every update of the value?
> > > >
> > >
> > > We need it for initial sync so that if there is an error both the
> > > sequence state in pg_subscription_rel and sequence values can be
> > > rolled back together.
> >
> > Agreed.
> >
> > > However, it is unclear whether we need to create
> > > a new relfilenode while copying existing sequences (say during ALTER
> > > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
> > > decide)? Probably the answer lies in how we want to implement this
> > > command. If we want to copy all sequence values during the command
> > > itself then it is probably okay but if we want to handover this task
> > > to the sequence-sync worker then we need some state management and a
> > > new relfilenode so that on error both state and sequence values are
> > > rolled back.
> >
> > What state transition of pg_subscription_rel entries for sequences do
> > we need while copying sequences values? For example, we insert an
> > entry with 'init' state at CREATE SUBSCRIPTION and then the
> > sequence-sync worker updates to 'ready' and copies the sequence data.
> > And at REFRESH PUBLICATION SEQUENCES, we update the state back to
> > 'init' again so that the sequence-sync worker can process it? Given
> > REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it
> > might be acceptable to transactionally update sequence values.
> >
>
> Do you mean that sync the sequences during the REFRESH PUBLICATION
> SEQUENCES command itself? If so, there is an argument that we can do
> the same during CREATE SUBSCRIPTION. It would be beneficial to keep
> the method to sync the sequences same for both the CREATE and REFRESH
> commands. I have speculated on one idea above and would be happy to
> see your thoughts.

I meant that the REFRESH PUBLICATION SEQUENCES command updates all
sequence states in pg_subscription_rel to 'init' state, and the
sequence-sync worker can do the synchronization work. We use the same
method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION
{SEQUENCES} commands.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: MultiXactMemberFreezeThreshold can make autovacuum *less* aggressive
Next
From: Peter Eisentraut
Date:
Subject: Re: Conflict Detection and Resolution