Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers) - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)
Date
Msg-id 20220301060528.GA1026683@nathanxps13
In response to Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Allow async standbys wait for sync replication
List pgsql-hackers
On Tue, Mar 01, 2022 at 11:10:09AM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 1, 2022 at 12:27 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> My feedback is specifically about this behavior.  I don't think we should
>> spin in XLogSend*() waiting for an LSN to be synchronously replicated.  I
>> think we should just choose the SendRqstPtr based on what is currently
>> synchronously replicated.
> 
> Do you mean something like the following?
> 
> /* Main loop of walsender process that streams the WAL over Copy messages. */
> static void
> WalSndLoop(WalSndSendDataCallback send_data)
> {
>     /*
>      * Loop until we reach the end of this timeline or the client requests to
>      * stop streaming.
>      */
>     for (;;)
>     {
>         if (am_async_walsender && there_are_sync_standbys)
>         {
>             XLogRecPtr SendRqstLSN;
>             XLogRecPtr SyncFlushLSN;
> 
>             SendRqstLSN = GetFlushRecPtr(NULL);
>             LWLockAcquire(SyncRepLock, LW_SHARED);
>             SyncFlushLSN = walsndctl->lsn[SYNC_REP_WAIT_FLUSH];
>             LWLockRelease(SyncRepLock);
> 
>             if (SendRqstLSN > SyncFlushLSN)
>                 continue;
>         }

Not quite.  Instead of "continue", I would set SendRqstLSN to SyncFlushLSN
so that the WAL sender only sends up to the current synchronously
replicated LSN.  TBH there are probably other things that need to be
considered (e.g., how do we ensure that the WAL sender sends the rest once
it is replicated?), but I still think we should avoid spinning in the WAL
sender waiting for WAL to be replicated.
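
As a rough sketch of the "clamp instead of continue" idea, here is a
standalone model rather than actual walsender code.  The helper name
choose_send_rqst_lsn() and its parameters are hypothetical, and XLogRecPtr
is stubbed as a plain 64-bit integer so the snippet compiles on its own.  It
deliberately does not address the open question of how the remaining WAL
gets sent once it has been replicated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper: decide how far a WAL sender may send.
 *
 * flush_lsn      - WAL position flushed locally
 * sync_flush_lsn - WAL position already flushed by the synchronous standbys
 *
 * An async WAL sender is capped at sync_flush_lsn when synchronous standbys
 * exist; it never waits or spins, it simply sends less for now.
 */
static XLogRecPtr
choose_send_rqst_lsn(XLogRecPtr flush_lsn, XLogRecPtr sync_flush_lsn,
                     bool am_async_walsender, bool there_are_sync_standbys)
{
    XLogRecPtr send_rqst_lsn = flush_lsn;

    if (am_async_walsender && there_are_sync_standbys &&
        send_rqst_lsn > sync_flush_lsn)
        send_rqst_lsn = sync_flush_lsn;     /* clamp instead of "continue" */

    /* The result can only move backwards from the local flush position. */
    assert(send_rqst_lsn <= flush_lsn);
    return send_rqst_lsn;
}
```

With a local flush position of 100 and a synchronously replicated position
of 60, an async WAL sender would be limited to 60, while a sync WAL sender
(or any sender when no sync standbys are configured) would still send up to
100.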

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


