Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From shveta malik
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAJpy0uAFQ0oaBMSXquYE_PTsH+pLM8cBdT8SswjG=hqa8XTcqw@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
On Tue, Nov 14, 2023 at 7:56 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On 11/13/23 2:57 PM, Zhijie Hou (Fujitsu) wrote:
> > On Friday, November 10, 2023 4:16 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote:
> >> Yeah good point, agree to just error out in all the case then (if we discard the
> >> sync_ reserved wording proposal, which seems to be the case as probably not
> >> worth the extra work).
> >
> > Thanks for the discussion!
> >
> > Here is the V33 patch set which includes the following changes:
>
> Thanks for working on it!
>
> >
> > 1) Drop slots with state 'i' in promotion flow after we shut down WalReceiver.
>
> @@ -3557,10 +3558,15 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
>                       * this only after failure, so when you promote, we still
>                       * finish replaying as much as we can from archive and
>                       * pg_wal before failover.
> +                    *
> +                    * Drop the slots for which sync is initiated but not yet
> +                    * completed i.e. they are still waiting for the primary
> +                    * server to catch up.
>                       */
>                      if (StandbyMode && CheckForStandbyTrigger())
>                      {
>                          XLogShutdownWalRcv();
> +                       slotsync_drop_initiated_slots();
>                          return XLREAD_FAIL;
>                      }
>
> I had a closer look and it seems this is not located at the right place.
>
> Indeed, it's added here:
>
> switch (currentSource)
> {
>         case XLOG_FROM_ARCHIVE:
>         case XLOG_FROM_PG_WAL:
>
> While in our case we are in
>
>         case XLOG_FROM_STREAM:
>
> So I think we should move slotsync_drop_initiated_slots() in the
> XLOG_FROM_STREAM case. Maybe before shutting down the sync slot worker?
> (the TODO item number 2 you mentioned up-thread)
>
> BTW in order to prevent any corner case, would'nt also be better to
>
> replace:
>
> +   /*
> +    * Do not allow consumption of a "synchronized" slot until the standby
> +    * gets promoted.
> +    */
> +   if (RecoveryInProgress() && (slot->data.sync_state != SYNCSLOT_STATE_NONE))
>
> with something like:
>
> if ((RecoveryInProgress() && (slot->data.sync_state != SYNCSLOT_STATE_NONE)) || slot->data.sync_state ==
SYNCSLOT_STATE_INITIATED)
>
> to ensure slots in 'i' case can never be used?
>

Yes, it makes sense. WIll do it.

> Regards,
>
> --
> Bertrand Drouvot
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Random pg_upgrade test failure on drongo
Next
From: Michael Paquier
Date:
Subject: Re: Adding facility for injection points (or probe points?) for more advanced tests