Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers) - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)
Msg-id CALj2ACUEQPjDCoEm1NgDxPYqeKcfoAHe86ZWUaD2pjMjdxsWHw@mail.gmail.com
In response to Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Sat, Feb 26, 2022 at 9:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Sat, Feb 26, 2022 at 02:17:50PM +0530, Bharath Rupireddy wrote:
> > A global min LSN of SendRqstPtr of all the sync standbys can be
> > calculated and the async standbys can send WAL up to global min LSN.
> > This is unlike what the v1 patch does i.e. async standbys will wait
> > until the sync standbys report flush LSN back to the primary. Problem
> > with the global min LSN approach is that there can still be a small
> > window where async standbys can get ahead of sync standbys. Imagine
> > async standbys being closer to the primary than sync standbys and if
> > the failover has to happen while the WAL at SendRqstPtr isn't received
> > by the sync standbys, but the async standbys can receive them as they
> > are closer. We hit the same problem that we are trying to solve with
> > this patch. This is the reason, we are waiting till the sync flush LSN
> > as it guarantees more transactional protection.
>
> Do you mean that the application of WAL gets ahead on your async standbys
> or that the writing/flushing of WAL gets ahead?  If synchronous_commit is
> set to 'remote_write' or 'on', I think either approach can lead to
> situations where the async standbys are ahead of the sync standbys with WAL
> application.  For example, a conflict between WAL replay and a query on
> your sync standby could delay WAL replay, but the primary will not wait for
> this conflict to resolve before considering a transaction synchronously
> replicated and sending it to the async standbys.
>
> If writing/flushing WAL gets ahead on async standbys, I think something is
> wrong with the patch.  If you aren't sending WAL to async standbys until
> it is synchronously replicated to the sync standbys, it should by
> definition be impossible for this to happen.

With the v1 patch [1], the async standbys will never get WAL ahead of
the sync standbys. That is guaranteed because the walsenders serving
async standbys are allowed to send WAL only up to the flush LSN that
the sync standbys have reported back to the primary.

> > Do you think allowing async standbys optionally wait for either remote
> > write or flush or apply or global min LSN of SendRqstPtr so that users
> > can choose what they want?
>
> I'm not sure I follow the difference between "global min LSN of
> SendRqstPtr" and remote write/flush/apply.  IIUC you are saying that we
> could use the LSN of what is being sent to sync standbys instead of the LSN
> of what the primary considers synchronously replicated.  I don't think we
> should do that because it provides no guarantee that the WAL has even been
> sent to the sync standbys before it is sent to the async standbys.

Correct.

> For
> this feature, I think we always need to consider what the primary considers
> synchronously replicated.  My suggested approach doesn't change that.  I'm
> saying that instead of spinning in a loop waiting for the WAL to be
> synchronously replicated, we just immediately send WAL up to the LSN that
> is presently known to be synchronously replicated.

As I said above, the v1 patch does that, i.e. async standbys wait until
the sync standbys update their flush LSN.

The flush LSN here is flushLSN = walsndctl->lsn[SYNC_REP_WAIT_FLUSH],
which gets updated in SyncRepReleaseWaiters.

The walsenders serving async standbys wait in XLogSendPhysical or
XLogSendLogical until their SendRqstPtr <= flushLSN.
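
To make the condition concrete, here is a minimal standalone sketch in
C. It is not code from the v1 patch: the function names
async_can_send and async_send_upto are hypothetical, XLogRecPtr is
redefined locally as a plain 64-bit integer, and the sync flush LSN is
passed in as a parameter rather than read from shared memory. The
first function models the check described above (send only once
SendRqstPtr <= flushLSN); the second models the alternative raised in
the quoted text, which sends immediately but only up to the LSN
currently known to be synchronously replicated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's XLogRecPtr (assumption for this sketch). */
typedef uint64_t XLogRecPtr;

/*
 * Wait-style check: an async walsender may send only when the position it
 * wants to send up to has already been flushed by the sync standbys.
 */
bool
async_can_send(XLogRecPtr send_rqst_ptr, XLogRecPtr sync_flush_lsn)
{
    return send_rqst_ptr <= sync_flush_lsn;
}

/*
 * Cap-style alternative: never wait, but clamp the send position to the
 * LSN currently known to be flushed by the sync standbys.
 */
XLogRecPtr
async_send_upto(XLogRecPtr send_rqst_ptr, XLogRecPtr sync_flush_lsn)
{
    return (send_rqst_ptr <= sync_flush_lsn) ? send_rqst_ptr : sync_flush_lsn;
}
```

In both variants the async standby can never be handed WAL beyond the
sync flush LSN; they differ only in whether the walsender waits for the
full range or sends the part that is already safe.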

I will address the review comments raised by Hsu, John and send an
updated patch for further review. Thanks.

[1] https://www.postgresql.org/message-id/CALj2ACVUa8WddVDS20QmVKNwTbeOQqy4zy59NPzh8NnLipYZGw%40mail.gmail.com

Regards,
Bharath Rupireddy.


