On Tue, Mar 01, 2022 at 11:10:09AM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 1, 2022 at 12:27 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> My feedback is specifically about this behavior. I don't think we should
>> spin in XLogSend*() waiting for an LSN to be synchronously replicated. I
>> think we should just choose the SendRqstPtr based on what is currently
>> synchronously replicated.
>
> Do you mean something like the following?
>
> /* Main loop of walsender process that streams the WAL over Copy messages. */
> static void
> WalSndLoop(WalSndSendDataCallback send_data)
> {
>     /*
>      * Loop until we reach the end of this timeline or the client requests to
>      * stop streaming.
>      */
>     for (;;)
>     {
>         if (am_async_walsender && there_are_sync_standbys)
>         {
>             XLogRecPtr SendRqstLSN;
>             XLogRecPtr SyncFlushLSN;
>
>             SendRqstLSN = GetFlushRecPtr(NULL);
>             LWLockAcquire(SyncRepLock, LW_SHARED);
>             SyncFlushLSN = walsndctl->lsn[SYNC_REP_WAIT_FLUSH];
>             LWLockRelease(SyncRepLock);
>
>             if (SendRqstLSN > SyncFlushLSN)
>                 continue;
>         }

Not quite. Instead of "continue", I would set SendRqstLSN to SyncFlushLSN
so that the WAL sender only sends up to the current synchronously
replicated LSN. TBH there are probably other things that need to be
considered (e.g., how do we ensure that the WAL sender sends the rest once
it is replicated?), but I still think we should avoid spinning in the WAL
sender waiting for WAL to be replicated.
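
To make that a bit more concrete, here is a rough standalone sketch of the
clamping I have in mind. It is not a patch: ChooseSendRqstPtr() and its
parameters are hypothetical names for illustration only, and XLogRecPtr is
typedef'd locally so the snippet compiles outside the tree. It assumes the
caller has already read the flush LSN and the synchronously flushed LSN
(under SyncRepLock, as in your example).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper, not an existing PostgreSQL function.
 *
 * Decide how far the WAL sender may send.  An async WAL sender running
 * alongside sync standbys is capped at the LSN that has already been
 * synchronously flushed; everyone else may send up to the local flush LSN.
 * Nothing here waits or loops: the caller simply sends less this time.
 */
static XLogRecPtr
ChooseSendRqstPtr(XLogRecPtr flush_lsn, XLogRecPtr sync_flush_lsn,
                  bool am_async_walsender, bool there_are_sync_standbys)
{
    if (am_async_walsender && there_are_sync_standbys &&
        flush_lsn > sync_flush_lsn)
        return sync_flush_lsn;

    return flush_lsn;
}
```

With flush_lsn = 200 and sync_flush_lsn = 150, an async WAL sender would send
only up to 150 and return to its normal loop. The sketch deliberately leaves
open the question above of what wakes the WAL sender to send the remainder.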

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com