Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
Date
Msg-id CALj2ACV1WBqtdwMXV7TCfPiAVdHNF=8n1tiCWpv0yS-mbqeWZg@mail.gmail.com
In response to Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication
List pgsql-hackers
On Mon, Jul 25, 2022 at 4:20 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 25 July 2022, at 14:29, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hm, after thinking for a while, I tend to agree with the above
> > approach - meaning, query cancel interrupt processing can be
> > disabled completely in SyncRepWaitForLSN() while the proc die
> > interrupt is still processed immediately. This approach requires no
> > GUC, as opposed to the proposed v1 patch upthread.
> GUC was proposed here[0] to maintain compatibility with previous behaviour. But I think that having no GUC here is
> fine too. If we do not allow cancelation of unreplicated backends, of course.
>
> >>
> >> And yes, we need additional complexity - but in some other place. A transaction can also be locally committed in
> >> the presence of a server crash. But this is another difficult problem. A crashed server must not allow data queries
> >> until the LSN of the timeline end is successfully replicated to synchronous_standby_names.
> >
> > Hm, that needs to be done anyways. How about doing as proposed
> > initially upthread [1]? Also, quoting the idea here [2].
> >
> > Thoughts?
> >
> > [1] https://www.postgresql.org/message-id/CALj2ACUrOB59QaE6=jF2cFAyv1MR7fzD8tr4YM5+OwEYG1SNzA@mail.gmail.com
> > [2] 2) Wait for sync standbys to catch up upon restart after the crash or
> > in the next txn after the old locally committed txn was canceled. One
> > way to achieve this is to let the backend, that's making the first
> > connection, wait for sync standbys to catch up in ClientAuthentication
> > right after successful authentication. However, I'm not sure this is
> > the best way to do it at this point.
>
>
> I think ideally the startup process should not allow read-only connections in CheckRecoveryConsistency() until WAL is
> replicated to a quorum at least up to the new timeline LSN.

Unless I'm missing something, we can't do it in
CheckRecoveryConsistency(), because the walsenders (required for
sending the remaining WAL to sync standbys to achieve quorum) can only
be started after the server reaches a consistent state. After all,
walsenders are specialized backends.
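
To make the ordering problem concrete, here is a toy model in Python. It
is only a sketch of the argument above, not PostgreSQL code: the class
ToyServer, its fields, the LSN value 100, and the startup() loop are all
hypothetical, and it treats "reaching a consistent state" and "being able
to start backends" as the same step.

```python
from dataclasses import dataclass


@dataclass
class ToyServer:
    """Hypothetical model of startup ordering; not PostgreSQL internals."""
    consistent: bool = False          # has recovery reached a consistent state?
    walsenders_running: bool = False  # can WAL be streamed to sync standbys?
    standby_lsn: int = 0              # LSN confirmed by the sync standby
    timeline_end_lsn: int = 100       # made-up LSN that must reach the standby

    def try_start_walsenders(self) -> None:
        # Walsenders are backends, so they start only once the server is consistent.
        if self.consistent:
            self.walsenders_running = True

    def stream_wal(self) -> None:
        # Remaining WAL reaches the standby only through a running walsender.
        if self.walsenders_running:
            self.standby_lsn = self.timeline_end_lsn

    def replicated(self) -> bool:
        return self.standby_lsn >= self.timeline_end_lsn


def startup(gate_consistency_on_replication: bool, max_rounds: int = 5) -> ToyServer:
    """Run a few startup rounds and return the resulting server state."""
    server = ToyServer()
    for _ in range(max_rounds):
        if not server.consistent:
            # With the gate on, consistency waits for replication to finish.
            if not gate_consistency_on_replication or server.replicated():
                server.consistent = True
        server.try_start_walsenders()
        server.stream_wal()
    return server
```

With the gate enabled, the model never gets out of the loop: consistency
waits for replication, and replication waits for walsenders that need
consistency. With the gate disabled, the server becomes consistent first
and the standby catches up afterwards.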



--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/


