Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAA4eK1JVey4DRfSAEHfF1kgdfY4hbb1LEhCPGexKwYe2Sm1zVQ@mail.gmail.com
In response to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Wed, Oct 4, 2023 at 5:34 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 10/4/23 1:50 PM, shveta malik wrote:
> > On Wed, Oct 4, 2023 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> On Wed, Oct 4, 2023 at 11:55 AM Drouvot, Bertrand
> >> <bertranddrouvot.pg@gmail.com> wrote:
> >>>
> >>> On 10/4/23 6:26 AM, shveta malik wrote:
> >>>> On Wed, Oct 4, 2023 at 5:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>> How about an alternate scheme where we define sync_slot_names on
> >>>>> standby but then store the physical_slot_name in the corresponding
> >>>>> logical slot (ReplicationSlotPersistentData) to be synced? So, the
> >>>>> standby will send the list of 'sync_slot_names' and the primary will
> >>>>> add the physical standby's slot_name in each of the corresponding
> >>>>> sync_slot. Now, if we do this then even after restart, we should be
> >>>>> able to know for which physical slot each logical slot needs to wait.
> >>>>> We can even provide an SQL API to reset the value of
> >>>>> standby_slot_names in logical slots as a way to unblock decoding in
> >>>>> case of emergency (for example, when the corresponding physical
> >>>>> standby never comes up).
> >>>>>
> >>>>
> >>>>
> >>>> Looks like a better approach to me. It solves most of the pain points like:
> >>>> 1) Avoids the need of multiple GUCs
> >>>> 2) Primary and standby need not to worry to be in sync if we maintain
> >>>> sync-slot-names GUC on both
> >>
> >> As per my understanding of this approach, we don't want
> >> 'sync-slot-names' to be set on the primary. Do you have a different
> >> understanding?
> >>
> >
> > Same understanding. We do not need it to be set on primary by user. It
> > will be GUC on standby and standby will convey it to primary.
>
> +1, same understanding here.
>

At PGConf NYC, I had a brief discussion on this topic with Andres,
where yet another approach came up: have a parameter like
enable_failover at the slot level (this will be persistent
information). Users can set it during CREATE/ALTER SUBSCRIPTION or
via pg_create_logical_replication_slot(). Also, on the physical
standby, there will be a parameter like enable_syncslot. Every
physical standby that has enable_syncslot set will receive all the
logical slots that are marked with enable_failover. To me, whether to sync a
particular slot is a slot-level property, so defining it in this new
way seems reasonable.
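
To make the selection rule concrete, here is a rough sketch in Python.
It is only a toy model of the proposal, not PostgreSQL code: the
LogicalSlot and Standby classes and the slots_to_sync() helper are
made up for illustration, and only enable_failover and enable_syncslot
come from the idea described above.

```python
from dataclasses import dataclass


@dataclass
class LogicalSlot:
    name: str
    # Proposed persistent, slot-level flag (name taken from the proposal).
    enable_failover: bool = False


@dataclass
class Standby:
    name: str
    # Proposed standby-side parameter (name taken from the proposal).
    enable_syncslot: bool = False


def slots_to_sync(primary_slots, standby):
    """Return the names of the logical slots this standby would receive.

    Hypothetical helper: a standby without enable_syncslot syncs nothing;
    otherwise it receives every slot marked enable_failover on the primary.
    """
    if not standby.enable_syncslot:
        return []
    return [slot.name for slot in primary_slots if slot.enable_failover]


# Example data (slot and standby names are invented).
primary_slots = [
    LogicalSlot("sub_a", enable_failover=True),
    LogicalSlot("sub_b"),
    LogicalSlot("sub_c", enable_failover=True),
]
syncing_standby = Standby("standby1", enable_syncslot=True)
plain_standby = Standby("standby2")
```

In this model the standby never names individual slots; the choice is
made per slot on the primary, which is the point of treating it as a
slot-level property.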

I think this will simplify the scheme a bit, but the list of
physical standbys for which logical slots wait during decoding still
needs to be maintained, as we thought. But how about, along with the
above two parameters (enable_failover and enable_syncslot), we have
standby_slot_names defined on the primary? That avoids the need to
store the list of standby_slot_names in logical slots and simplifies
the implementation quite a bit, right? Now, one may ask why we need
enable_syncslot on the physical standby if we have a parameter like
'standby_slot_names'. It is still required in order to start the sync
worker, which pulls the logical slots' information. The
advantage of having standby_slot_names defined on the primary is that
we can selectively wait on the subset of physical standbys where we
are syncing the slots. I think this will be something similar to
'synchronous_standby_names' in the sense that the physical standbys
mentioned in standby_slot_names will behave as synchronous copies with
respect to slots, and after failover the user can switch to one of
these physical standbys and the others can start following the new
master/publisher.
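
As a rough illustration of the waiting rule, here is a Python sketch.
It assumes that "wait" means a failover-enabled logical slot does not
send changes beyond the position confirmed by every physical slot
listed in standby_slot_names. That reading, the integer LSNs, and the
max_decodable_lsn() helper are assumptions made for the example, not a
description of an existing implementation.

```python
def max_decodable_lsn(standby_slot_names, confirmed_lsn, wal_end):
    """Highest LSN a failover-enabled logical slot may send (toy model).

    standby_slot_names: physical slot names listed on the primary.
    confirmed_lsn: mapping of physical slot name -> LSN confirmed by it.
    wal_end: current end of WAL on the primary.
    LSNs are plain integers here purely for simplicity.
    """
    if not standby_slot_names:
        # Nothing is listed, so decoding does not wait for any standby.
        return wal_end
    # A listed slot with no confirmed position yet holds decoding at 0.
    slowest = min(confirmed_lsn.get(name, 0) for name in standby_slot_names)
    return min(slowest, wal_end)


# Example data (slot names and positions are invented).
confirmed = {"sb1": 120, "sb2": 90, "sb3": 40}
limit_subset = max_decodable_lsn(["sb1", "sb2"], confirmed, wal_end=150)
limit_none = max_decodable_lsn([], confirmed, wal_end=150)
```

Note that sb3 lags furthest behind but does not hold back decoding
because it is not listed, which is the "selectively wait on a subset"
behaviour described above.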

Thoughts?

--
With Regards,
Amit Kapila.


