Re: Allow logical failover slots to wait on synchronous replication - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Allow logical failover slots to wait on synchronous replication |
Date | |
Msg-id | CAJpy0uAexFn9=W2dTssxrHmAHu2V08Jq6BfFpjDxKUpuMdYKDw@mail.gmail.com Whole thread Raw |
In response to | Re: Allow logical failover slots to wait on synchronous replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Allow logical failover slots to wait on synchronous replication
Re: Allow logical failover slots to wait on synchronous replication |
List | pgsql-hackers |
On Mon, Sep 16, 2024 at 11:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 13, 2024 at 3:13 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Thu, Sep 12, 2024 at 3:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, Sep 11, 2024 at 2:40 AM John H <johnhyvr@gmail.com> wrote: > > > > > > > > Hi Shveta, > > > > > > > > On Sun, Sep 8, 2024 at 11:16 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > > > I was trying to have a look at the patch again, it doesn't apply on > > > > > the head, needs rebase. > > > > > > > > > > > > > Rebased with the latest changes. > > > > > > > > > Regarding 'mode = SyncRepWaitMode', FWIW, SyncRepWaitForLSN() also > > > > > does in a similar way. It gets mode in local var initially and uses it > > > > > later. See [1]. So isn't there a chance too that 'SyncRepWaitMode' can > > > > > change in between. > > > > > > > > > > [1]: > > > > > mode = SyncRepWaitMode; > > > > > ..... > > > > > .... > > > > > if (!WalSndCtl->sync_standbys_defined || > > > > > lsn <= WalSndCtl->lsn[mode]) > > > > > { > > > > > LWLockRelease(SyncRepLock); > > > > > return; > > > > > } > > > > > > > > You are right, thanks for the correction. I tried reproducing with GDB > > > > where SyncRepWaitMode > > > > changes due to pg_ctl reload but was unable to do so. It seems like > > > > SIGHUP only sets ConfigReloadPending = true, > > > > which gets processed in the next loop in WalSndLoop() and that's > > > > probably where I was getting confused. > > > > > > yes, SIGHUP will be processed in the caller of > > > StandbySlotsHaveCaughtup() (see ProcessConfigFile() in > > > WaitForStandbyConfirmation()). So we can use 'SyncRepWaitMode' > > > directly as it is not going to change in StandbySlotsHaveCaughtup() > > > even if user triggers the change. And thus it was okay to use it even > > > in the local variable too similar to how SyncRepWaitForLSN() does it. > > > > > > > In the latest patch, I've added: > > > > > > > > Assert(SyncRepWaitMode >= 0); > > > > > > > > which should be true since we call SyncRepConfigured() at the > > > > beginning of StandbySlotsHaveCaughtup(), > > > > and used SyncRepWaitMode directly. > > > > > > Yes, it should be okay I think. As SyncRepRequested() in the beginning > > > makes sure synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH and > > > thus SyncRepWaitMode should be mapped to either of > > > WAIT_WRITE/FLUSH/APPLY etc. Will review further. > > > > > > > I was wondering if we need somethign similar to SyncRepWaitForLSN() here: > > > > /* Cap the level for anything other than commit to remote flush only. */ > > if (commit) > > mode = SyncRepWaitMode; > > else > > mode = Min(SyncRepWaitMode, SYNC_REP_WAIT_FLUSH); > > > > The header comment says: > > * 'lsn' represents the LSN to wait for. 'commit' indicates whether this LSN > > * represents a commit record. If it doesn't, then we wait only for the WAL > > * to be flushed if synchronous_commit is set to the higher level of > > * remote_apply, because only commit records provide apply feedback. > > > > If we don't do something similar, then aren't there chances that we > > keep on waiting on the wrong lsn[mode] for the case when mode = > > SYNC_REP_WAIT_APPLY while sync-rep-wait infrastructure is updating > > different mode's lsn. Is my understanding correct? > > > > I think here we always need the lsn values corresponding to > SYNC_REP_WAIT_FLUSH as we want to ensure that the WAL has to be > flushed on physical standby before sending it to the logical > subscriber. See ProcessStandbyReplyMessage() where we always use > flushPtr to advance the physical_slot via > PhysicalConfirmReceivedLocation(). I agree. So even if the mode is SYNC_REP_WAIT_WRITE (lower one) or SYNC_REP_WAIT_APPLY (higher one), we need to wait for lsn[SYNC_REP_WAIT_FLUSH]. > Another question aside from the above point, what if someone has > specified logical subscribers in synchronous_standby_names? In the > case of synchronized_standby_slots, we won't proceed with such slots. > Yes, it is a possibility. I have missed this point earlier. Now I tried a case where I give a mix of logical subscriber and physical standby in 'synchronous_standby_names' on pgHead, it even takes that 'mix' configuration and starts waiting accordingly. synchronous_standby_names = 'FIRST 2(logicalsub_1, phy_standby_1, phy_standby_2)'; thanks Shveta
pgsql-hackers by date: