Re: Allow logical failover slots to wait on synchronous replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Allow logical failover slots to wait on synchronous replication
Date
Msg-id CAJpy0uBYfVtJEGDM2EtDWUO=zWetUWh-vAO5_D4NJLP67P2X8Q@mail.gmail.com
Whole thread Raw
In response to Re: Allow logical failover slots to wait on synchronous replication  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Allow logical failover slots to wait on synchronous replication
List pgsql-hackers
On Thu, Sep 12, 2024 at 3:04 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Sep 11, 2024 at 2:40 AM John H <johnhyvr@gmail.com> wrote:
> >
> > Hi Shveta,
> >
> > On Sun, Sep 8, 2024 at 11:16 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > >
> > > I was trying to have a look at the patch again, it doesn't apply on
> > > the head, needs rebase.
> > >
> >
> > Rebased with the latest changes.
> >
> > > Regarding 'mode = SyncRepWaitMode', FWIW, SyncRepWaitForLSN() also
> > > does in a similar way. It gets mode in local var initially and uses it
> > > later. See [1]. So isn't there a chance too that 'SyncRepWaitMode' can
> > > change in between.
> > >
> > > [1]:
> > > mode = SyncRepWaitMode;
> > > .....
> > > ....
> > >         if (!WalSndCtl->sync_standbys_defined ||
> > >                 lsn <= WalSndCtl->lsn[mode])
> > >         {
> > >                 LWLockRelease(SyncRepLock);
> > >                 return;
> > >         }
> >
> > You are right, thanks for the correction. I tried reproducing with GDB
> > where SyncRepWaitMode
> > changes due to pg_ctl reload but was unable to do so. It seems like
> > SIGHUP only sets ConfigReloadPending = true,
> > which gets processed in the next loop in WalSndLoop() and that's
> > probably where I was getting confused.
>
> yes, SIGHUP will be processed in the caller of
> StandbySlotsHaveCaughtup() (see ProcessConfigFile() in
> WaitForStandbyConfirmation()). So we can use 'SyncRepWaitMode'
> directly as it is not going to change in StandbySlotsHaveCaughtup()
> even if user triggers the change. And thus it was okay to use it even
> in the local variable too similar to how SyncRepWaitForLSN() does it.
>
> > In the latest patch, I've added:
> >
> > Assert(SyncRepWaitMode >= 0);
> >
> > which should be true since we call SyncRepConfigured() at the
> > beginning of StandbySlotsHaveCaughtup(),
> > and used SyncRepWaitMode directly.
>
> Yes, it should be okay I think. As SyncRepRequested() in the beginning
> makes sure synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH and
> thus SyncRepWaitMode should be mapped to either of
> WAIT_WRITE/FLUSH/APPLY etc. Will review further.
>

I was wondering if we need somethign similar to SyncRepWaitForLSN() here:

        /* Cap the level for anything other than commit to remote flush only. */
        if (commit)
                mode = SyncRepWaitMode;
        else
                mode = Min(SyncRepWaitMode, SYNC_REP_WAIT_FLUSH);

The header comment says:
 * 'lsn' represents the LSN to wait for.  'commit' indicates whether this LSN
 * represents a commit record.  If it doesn't, then we wait only for the WAL
 * to be flushed if synchronous_commit is set to the higher level of
 * remote_apply, because only commit records provide apply feedback.

If we don't do something similar, then aren't there chances that we
keep on waiting on the wrong lsn[mode] for the case when mode =
SYNC_REP_WAIT_APPLY while sync-rep-wait infrastructure is updating
different mode's lsn. Is my understanding correct?

thanks
Shveta



pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN
Next
From: Ashutosh Bapat
Date:
Subject: Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN