Re: Disallow cancellation of waiting for synchronous replication - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: Disallow cancellation of waiting for synchronous replication
Date
Msg-id 54AA38F5-4C96-4D1A-8C9A-BC41B211AF8D@yandex-team.ru
In response to Re: Disallow cancellation of waiting for synchronous replication  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Automatic notification for top transaction IDs  (Gurjeet Singh <gurjeet@singh.im>)
List pgsql-hackers
Hi Aleksander!

Thanks for looking into this.

> On 23 Apr 2021, at 14:30, Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> Hi hackers,
>
>>>> After using a patch for a while it became obvious that PANICing during termination is not a good idea. Even when we wait for synchronous replication. It generates undesired coredumps.
>>>> I think in presence of SIGTERM it's reasonable to say that we cannot protect user anymore.
>>>> PFA v3.
>
> This patch, although solving a concrete and important problem, looks
> more like a quick workaround than an appropriate solution. Or is it
> just me?
>
> Ideally, the transaction should be committed only after getting a
> reply from the standby.
Getting a reply from the standby is part of the commit. The commit is complete only when the WAL has reached the standby, and it was certainly initiated before the standby replied. We cannot start committing only after the commit has already finished.

> If the user cancels the transaction, it
> doesn't get committed anywhere.
The problem is that the user tries to cancel a transaction after they have asked for a commit. We never promised to roll back a committed transaction.
When the user asks for a commit, we insert a commit record into the WAL and then wait until it is acknowledged by a quorum of standbys and by local storage.
We cannot discard this record on the standbys. Otherwise, at some point we would have to discard the "discard" records, and then discard the "discard discard" records, and so on.

> This is what people into distributed
> systems would expect unless stated otherwise, at least.
I think our transaction semantics are stated clearly in the documentation.

> Although I
> realize how complicated it is to implement, especially considering all
> the possible corner cases (netsplit right after getting a reply, etc).
> Maybe we could come up with a less than ideal, but still sound and
> easy-to-understand model, which, as soon as you learned it, doesn't
> bring unexpected surprises to the user.
The model proposed by my patch is as follows:
transaction effects should not be observable on the primary until the requirements of synchronous_commit are satisfied.

E.g. even if the user cancels a transaction that is already committed locally, we should not release the locks held by this transaction.
What unexpected surprises do you see in this model?
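
To make the model concrete, here is a toy sketch in Python. It is not PostgreSQL code and not the patch itself: the class ToyPrimary and all of its methods and fields are hypothetical names invented for illustration. It only models the rule stated above: once a commit record is in the WAL, a cancel request neither rolls the transaction back nor releases its locks, and the effects become observable only after a quorum of standbys has acknowledged.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Toy model of a primary with synchronous replication (illustration only)."""
    quorum: int                                  # standby acks required (assumed setting)
    locks: dict = field(default_factory=dict)    # row -> xid holding the lock
    wal: list = field(default_factory=list)      # xids that have a commit record
    acks: dict = field(default_factory=dict)     # xid -> number of standby acks so far
    visible: set = field(default_factory=set)    # xids whose effects are observable

    def lock(self, xid, row):
        # Fail if another transaction still holds the lock on this row.
        if self.locks.get(row, xid) != xid:
            raise RuntimeError(f"row {row!r} is locked by xid {self.locks[row]}")
        self.locks[row] = xid

    def commit(self, xid):
        # The commit record is written locally; the commit is not complete
        # until enough standbys acknowledge it.
        self.wal.append(xid)
        self.acks[xid] = 0

    def cancel(self, xid):
        """Return True if the transaction was rolled back, False otherwise."""
        if xid in self.wal:
            # Commit record already exists: it cannot be discarded, so the
            # cancel is refused and the locks stay held.
            return False
        self._release(xid)
        return True

    def standby_ack(self, xid):
        # Count one acknowledgement; on reaching the quorum the transaction
        # becomes observable and its locks are released.
        self.acks[xid] += 1
        if self.acks[xid] >= self.quorum:
            self.visible.add(xid)
            self._release(xid)

    def _release(self, xid):
        self.locks = {row: x for row, x in self.locks.items() if x != xid}
```

For example, with quorum=1: after lock(1, "a") and commit(1), cancel(1) returns False and row "a" stays locked, so lock(2, "a") raises. Only after standby_ack(1) does xid 1 become visible and the lock go away.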

Thanks for reviewing!

Best regards, Andrey Borodin.

