Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Msg-id CAA4eK1KNRpQVxALa-h17XNnF2y5Ew=Ga=gTVZpr+CJa-o+xg-A@mail.gmail.com
In response to Re: Synchronizing slots from primary to standby  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Fri, Feb 2, 2024 at 6:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Feb 1, 2024 at 12:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > > BTW I've tested the following switch/fail-back scenario but it seems
> > > not to work fine. Am I missing something?
> > >
> > > Setup:
> > > node1 is the primary, node2 is the physical standby for node1, and
> > > node3 is the subscriber connecting to node1.
> > >
> > > Steps:
> > > 1. [node1]: create a table and a publication for the table.
> > > 2. [node2]: set enable_syncslot = on and start (to receive WALs from node1).
> > > 3. [node3]: create a subscription with failover = true for the publication.
> > > 4. [node2]: promote to the new primary.
> > > 5. [node3]: alter subscription to connect to the new primary, node2.
> > > 6. [node1]: stop, set enable_syncslot = on (and other required
> > > parameters), then start as a new standby.
> > >
> > > Then I got the error "exiting from slot synchronization because same
> > > name slot "test_sub" already exists on the standby".
> > >
> > > The logical replication slot that was created on the old primary
> > > (node1) has been synchronized to the old standby (node2). Therefore on
> > > node2, the slot's "synced" field is true. However, once node1 starts
> > > as the new standby with slot synchronization, the slotsync worker
> > > cannot synchronize the slot because the slot's "synced" field on the
> > > primary is false.
> > >
> >
> > Yeah, we avoided doing anything in this case because the user could
> > have manually created another slot with the same name on standby.
> > Unlike WAL, slots can be modified on standby as we allow decoding on
> > standby, so we can't allow overwriting the existing slots. We won't
> > be able to distinguish whether the existing slot was a slot that the
> > user wants to sync with primary or a slot created on standby to
> > perform decoding. I think in this case the user first needs to drop
> > the slot on the new standby.
>
> Yes, but if we do a further switch-back (i.e. in the above case, node1
> goes back to being the primary and node2 becomes the standby again),
> the user doesn't need to remove failover slots since they are already
> marked as "synced".

But I think in this case node-2's timeline will be ahead of node-1's,
so will we be able to make node-2 follow node-1 again without any
additional steps? One thing is not clear to me: after promotion, the
timeline changes in WAL, so the locations in slots will be as per the
new timeline. After that, will it be safe to sync slots from the new
primary to the old primary?

In general, I think after failover we recommend running pg_rewind if
the old primary has to follow the new primary, to account for
divergence in WAL. So I am not sure we can safely start syncing slots
on the old primary from the new primary; consider that on the new
primary, a slot with the same name may have been dropped and
re-created multiple times. We could probably reset all the fields of
the existing slot the first time we sync it, or do something like
that, but I think it would be better to just re-create the slot.

> I wonder if we could do something automatically to
> reduce the user's operation.

One possibility is that we forcefully drop and re-create the slot, or
directly overwrite the slot contents, but that would probably be better
done via some GUC or slot-level parameter. I feel we should leave this
for another day. For the first version, we can document that an error
will occur if a slot with the same name already exists on the standby,
so users need to ensure that no such slot exists on the standby before
sync.
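
To make the first-version behaviour discussed in this mail concrete,
here is a toy Python model of it. This is only a sketch and not
PostgreSQL code: the names Slot, sync_slot, drop_slot and SlotSyncError
are made up for illustration, and restart_lsn is a simplified stand-in
for the slot's position. It assumes the rule described here: a local
slot that is not marked "synced" is never overwritten.

```python
from dataclasses import dataclass


class SlotSyncError(Exception):
    """Raised when a same-name local slot blocks synchronization."""


@dataclass
class Slot:
    name: str
    synced: bool          # True only if the slot was created by slot sync
    restart_lsn: int = 0  # simplified stand-in for the slot's position


def sync_slot(local_slots: dict, remote: Slot) -> str:
    """Sync one slot from the primary into the standby's slot table."""
    local = local_slots.get(remote.name)
    if local is None:
        # No local slot with this name: create it and mark it as synced.
        local_slots[remote.name] = Slot(
            remote.name, synced=True, restart_lsn=remote.restart_lsn
        )
        return "created"
    if not local.synced:
        # We cannot tell a leftover slot from one the user created on the
        # standby for decoding, so refuse to touch it.
        raise SlotSyncError(
            f'same name slot "{remote.name}" already exists on the standby'
        )
    # The slot was created by an earlier sync, so it is safe to update.
    local.restart_lsn = remote.restart_lsn
    return "updated"


def drop_slot(local_slots: dict, name: str) -> None:
    """Model the manual step: the user drops the slot on the standby."""
    local_slots.pop(name, None)
```

In the scenario above, node1 restarts as a standby still holding
"test_sub" with synced = false, so sync_slot() raises. Once the user
calls drop_slot() for it, the next sync_slot() call creates the slot
afresh and marks it as synced.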

--
With Regards,
Amit Kapila.


