Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAA4eK1KAC1TPZStUXjMi0gd-OCRTY04JgY3cyKUXyq3Ndt32CQ@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Sun, Jul 9, 2023 at 1:01 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > 1. Can you please try to explain the functionality of the overall
> > patch somewhere in the form of comments and or commit message?
>
> IIUC, there are 2 core ideas of the feature:
>
> 1) It will never let the logical replication subscribers go ahead of
> physical replication standbys specified in standby_slot_names. It
> implements this by delaying decoding of commit records on the
> walsenders corresponding to logical replication subscribers on the
> primary until all the specified standbys confirm receiving the commit
> LSN.
>
> 2) The physical replication standbys will synchronize data of the
> specified logical replication slots (in synchronize_slot_names) from
> the primary, creating the logical replication slots if necessary.
> Since the logical replication subscribers will never go out of
> physical replication standbys, the standbys can safely synchronize the
> slots and keep the data necessary for subscribers to connect to it and
> work seamlessly even after a failover.
>
> If my understanding is right,
>

This matches my understanding as well.

> I have few thoughts here:
>
> 1. All the logical walsenders are delayed on the primary - per
> wait_for_standby_confirmation() despite the user being interested in
> only a few of them via synchronize_slot_names. Shouldn't the delay be
> for just the slots specified in synchronize_slot_names?
> 2. I think we can split the patch like this - 0001 can be the logical
> walsenders delaying decoding on the primary unless standbys confirm,
> 0002 standby synchronizing the logical slots.
>

Agreed with the above two points.

> 3. I think we need to change the GUC standby_slot_names to better
> reflect what it is used for - wait_for_replication_slot_names or
> wait_for_
>

I feel at this stage we can focus on getting the design and
implementation correct. We can improve GUC names later once we are
confident that the functionality is correct.

> 4. It allows specifying logical slots in standby_slot_names, meaning,
> it can disallow logical slots getting ahead of other logical slots
> specified in standby_slot_names. Should we allow this case with the
> thinking that if there's anyone using logical replication for failover
> (well, will anybody do that in production?).
>

I think on the contrary we should prohibit this case. We can always
extend this functionality later.

> 5. Similar to above, it allows specifying physical slots in
> synchronize_slot_names. Should we disallow?
>

We should prohibit that as well.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Next
From: Michael Paquier
Date:
Subject: Re: Add more sanity checks around callers of changeDependencyFor()