Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAJpy0uDEMo33g2cRJ1RhN-=U8jP7Jkh+k4Y9sCiADGfQ6m_EyQ@mail.gmail.com |
In response to | Re: Synchronizing slots from primary to standby (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
On Mon, Dec 11, 2023 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 8, 2023 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 6, 2023 at 4:53 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > PFA v43, changes are:
> > >
> >
> > I wanted to discuss 0003 patch about cascading standby's. It is not
> > clear to me whether we want to allow physical standbys to further wait
> > for cascading standby to sync their slots. If we allow such a feature
> > one may expect even primary to wait for all the cascading standby's
> > because otherwise still logical subscriber can be ahead of one of the
> > cascading standby. I feel even if we want to allow such a behaviour we
> > can do it later once the main feature is committed. I think it would
> > be good to just allow logical walsenders on primary to wait for
> > physical standbys represented by GUC 'standby_slot_names'. If we agree
> > on that then it would be good to prohibit setting this GUC on standby
> > or at least it should be a no-op even if this GUC should be set on
> > physical standby.
> >
> > Thoughts?
>
> IMHO, why not keep the behavior consistent across primary and standby?
> I mean if it doesn't require a lot of new code/design addition then
> it should be the user's responsibility. I mean if the user has set
> 'standby_slot_names' on standby then let standby also wait for
> cascading standby to sync their slots? Is there any issue with that
> behavior?
>

Unless the primary also waits for the cascading standbys, waiting on
the standby alone does not help. Currently, logical walsenders on the
primary wait for physical standbys to receive the changes before they
update their own logical slots. But they wait only for their immediate
standbys, not for cascading standbys. On the first standby, we do have
logic where the slot-sync workers wait for cascading standbys before
they update their own slots (the synced ones, see patch 0003).
But this does not guarantee that logical subscribers of the primary
will never be ahead of the cascading standbys. Let us consider this
timeline:

t1: logical walsender on primary is waiting for standby1 (first standby).
t2: physical walsender on standby1 is stuck, so there is a delay in
    sending these changes to standby2 (cascading standby).
t3: standby1 has received the changes and sends confirmation to primary.
t4: logical walsender on primary receives the confirmation from standby1
    and updates its slot; logical subscribers of primary also receive
    the changes.
t5: standby2 has not received the changes yet, as the physical walsender
    on standby1 is still stuck; the slot-sync worker on standby1 is still
    waiting for standby2 (cascading) before it updates its own slots
    (the synced ones).
t6: standby2 is promoted to become primary.

Now we are in a state where the primary, the logical subscriber and the
first standby have some changes but the cascading standby does not. And
the logical slots on the primary were updated without confirming whether
the cascading standby had received the changes. This is a problem and
we do not have a simple solution for this yet.

thanks
Shveta
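
PS: The timeline above can be replayed with a toy model. This is only a sketch in Python, not PostgreSQL code: the names Node, run_timeline, flushed_lsn and the integer LSNs are made up for illustration, and the "wait" is reduced to a simple comparison. It assumes the behaviour described above, i.e. the logical walsender on the primary waits only for standby1, while the slot-sync worker on standby1 waits for standby2.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a server; tracks the last WAL position received."""
    name: str
    flushed_lsn: int = 0


def run_timeline(change_lsn: int = 100, standby1_walsender_stuck: bool = True) -> dict:
    primary = Node("primary", flushed_lsn=change_lsn)
    standby1 = Node("standby1")
    standby2 = Node("standby2")  # cascading standby, fed by standby1
    logical_slot_lsn = 0         # logical slot position on the primary
    subscriber_lsn = 0           # what the logical subscriber has received
    synced_slot_lsn = 0          # synced copy of the slot on standby1

    # t1: the logical walsender on the primary waits for standby1 only.
    # t2: standby1's physical walsender may be stuck (the flag above).

    # t3: standby1 receives the changes and confirms to the primary.
    standby1.flushed_lsn = primary.flushed_lsn

    # t4: the primary sees standby1's confirmation, advances its logical
    #     slot and sends the changes to the subscriber. standby2 is not
    #     consulted at any point here.
    if standby1.flushed_lsn >= change_lsn:
        logical_slot_lsn = change_lsn
        subscriber_lsn = change_lsn

    # t5: standby2 only gets the changes if standby1's walsender is not
    #     stuck; the slot-sync worker on standby1 advances the synced
    #     slot only after standby2 has them.
    if not standby1_walsender_stuck:
        standby2.flushed_lsn = standby1.flushed_lsn
    if standby2.flushed_lsn >= change_lsn:
        synced_slot_lsn = change_lsn

    # t6: standby2 is promoted; compare it with the subscriber.
    return {
        "logical_slot_lsn": logical_slot_lsn,
        "subscriber_lsn": subscriber_lsn,
        "synced_slot_lsn": synced_slot_lsn,
        "standby2_lsn": standby2.flushed_lsn,
        "subscriber_ahead": subscriber_lsn > standby2.flushed_lsn,
    }


if __name__ == "__main__":
    print(run_timeline())
```

In the stuck case the wait on standby1 does hold back the synced slot, but the subscriber is already ahead of standby2, which is the gap described in the timeline.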