Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAFiTN-v4atS42nB8fkYBLwbBnL59eazkO+ZtVggEHUEoSRL4ZA@mail.gmail.com
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Mon, Dec 11, 2023 at 2:21 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Dec 11, 2023 at 1:47 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Dec 8, 2023 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 6, 2023 at 4:53 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > PFA v43, changes are:
> > > >
> > >
> > > I wanted to discuss the 0003 patch about cascading standbys. It is not
> > > clear to me whether we want to allow physical standbys to further wait
> > > for cascading standbys to sync their slots. If we allow such a feature,
> > > one may expect even the primary to wait for all the cascading standbys,
> > > because otherwise a logical subscriber can still be ahead of one of the
> > > cascading standbys. I feel that even if we want to allow such behaviour,
> > > we can do it later, once the main feature is committed. I think it would
> > > be good to just allow logical walsenders on the primary to wait for the
> > > physical standbys represented by the GUC 'standby_slot_names'. If we
> > > agree on that, then it would be good to prohibit setting this GUC on a
> > > standby, or at least to make it a no-op when it is set on a physical
> > > standby.
> > >
> > > Thoughts?
> >
> > IMHO, why not keep the behavior consistent across primary and standby?
> > If it doesn't require a lot of new code/design addition, then it should
> > be the user's responsibility. That is, if the user has set
> > 'standby_slot_names' on a standby, then let the standby also wait for
> > its cascading standbys to sync their slots. Is there any issue with that
> > behavior?
> >
>
> Unless the primary also waits for the cascading standbys, it won't be
> helpful to wait only on the standby.
>
> Currently, logical walsenders on the primary wait for physical standbys
> to take changes before they update their own logical slots. But they wait
> only for their immediate standbys and not for cascading standbys.
> On the first standby, we do have logic where slot-sync workers
> wait for cascading standbys before they update their own slots (the synced
> ones, see patch3). But this does not guarantee that logical
> subscribers of the primary will never be ahead of the cascading standbys.
> Let us consider this timeline:
>
> t1: logical walsender on primary is waiting for standby1 (first standby).
> t2: physical walsender on standby1 is stuck, and thus there is a delay in
> sending these changes to standby2 (cascading standby).
> t3: standby1 has taken the changes and sends confirmation to primary.
> t4: logical walsender on primary receives confirmation from standby1
> and updates its slot; logical subscribers of primary also receive the
> changes.
> t5: standby2 has not received the changes yet, as the physical walsender
> on standby1 is still stuck; the slotsync worker is still waiting for
> standby2 (cascading) before it updates its own slots (synced ones).
> t6: standby2 is promoted to become primary.
>
> Now we are in a state wherein the primary, the logical subscriber and the
> first standby have some changes but the cascading standby does not. And the
> logical slots on the primary were updated without confirming whether the
> cascading standby has taken the changes. This is a problem and we do not
> have a simple solution for this yet.

Okay, I think that makes sense.
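
To make the t1..t6 timeline above concrete, here is a rough toy model in
Python. It is only a sketch, not code from the patch: the names Node,
flushed_lsn, logical_slot_can_advance and simulate are hypothetical, and
the LSN value 100 is arbitrary. It assumes the logical walsender waits
only for the standbys it is told about, which is the behaviour described
in the quoted mail.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a server; flushed_lsn is the WAL position it has received."""
    name: str
    flushed_lsn: int = 0


def logical_slot_can_advance(target_lsn, waited_standbys):
    """True once every standby the walsender waits for has reached target_lsn."""
    return all(s.flushed_lsn >= target_lsn for s in waited_standbys)


def simulate():
    primary = Node("primary", flushed_lsn=100)  # arbitrary illustrative LSN
    standby1 = Node("standby1")                 # immediate standby
    standby2 = Node("standby2")                 # cascading standby
    target = primary.flushed_lsn
    subscriber_lsn = 0

    # t1: logical walsender on primary waits for standby1 only.
    assert not logical_slot_can_advance(target, [standby1])

    # t2: physical walsender on standby1 is stuck, so standby2 receives nothing.

    # t3: standby1 takes the changes and confirms to primary.
    standby1.flushed_lsn = target

    # t4: primary sees standby1 caught up, advances the logical slot,
    #     and the logical subscriber receives the changes.
    if logical_slot_can_advance(target, [standby1]):
        subscriber_lsn = target

    # t5: standby2 is still behind; waiting on it as well would have blocked.
    assert not logical_slot_can_advance(target, [standby1, standby2])

    # t6: standby2 is promoted while still missing the changes.
    new_primary = standby2
    return subscriber_lsn, new_primary.flushed_lsn


if __name__ == "__main__":
    sub_lsn, promoted_lsn = simulate()
    print(f"subscriber at {sub_lsn}, promoted standby2 at {promoted_lsn}")
```

In this model the subscriber ends up ahead of the promoted node because the
wait at t4 never considers standby2, no matter what standby1 itself waits
for.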


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


