Re: Disallow cancellation of waiting for synchronous replication - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: Disallow cancellation of waiting for synchronous replication
Date
Msg-id 323FC6A9-2DDA-44EF-AAF5-B18A161E2735@yandex-team.ru
In response to Re: Disallow cancellation of waiting for synchronous replication  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Disallow cancellation of waiting for synchronous replication  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers

> On 2 Jan 2020, at 19:13, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> Not loosing data - is a nice property of the database either.
>
> Sure, but there's more than one way to fix that problem, as I pointed
> out in my first response.
Sorry, it took me several more readings of your message to understand the problem you are describing.

You proposed two solutions:
1. The client analyzes the warning and understands that the data is not actually committed. This, as you pointed out, does not solve the problem: the data is lost for another client, who never saw the warning.
Actually, the "client" is a stateless set of connections that cannot communicate with each other by any means besides the database. They cannot share information about transactions that were not actually committed (they would need a database for that, hence the chicken-and-egg problem).

2. Add another message, "CANCEL --force", to stop synchronous replication for a specific backend.
We already have a way to stop synchronous replication: "alter system set synchronous_standby_names to 'working.stand.by'; select pg_reload_conf();". This stops it for every backend, whereas "CANCEL --force" would be more selective.
A user can still lose data when they issue an idempotent query based on data committed via "CANCEL --force". Moreover, a user can lose data if their upsert is based on data committed by someone else with "set synchronous_commit to off".
We could fix upserts by making them wait for replication even if nothing was changed, but this would not cover the case where the user does a SELECT and decides not to insert anything.
We could fix SELECT: if the user asks for synchronous_commit=remote_write, give them a snapshot no newer than the synchronously committed data. ISTM this would solve all of the above problems, but I cannot foresee all the implications of this approach. We would have to add every XID to XIP if its commit LSN > sync rep LSN. But I'm not sure all the other transactional machinery would be OK with this.
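
To make the SELECT idea above concrete, here is a toy Python model of the rule "add an XID to XIP if its commit LSN > sync rep LSN". It is only a sketch: Snapshot, build_snapshot and is_visible are hypothetical names invented for illustration, not PostgreSQL functions, and real snapshot handling involves much more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    # XIDs the snapshot treats as still in progress (hypothetical stand-in for XIP)
    xip: set = field(default_factory=set)


def build_snapshot(in_progress, commit_lsn_by_xid, sync_rep_lsn):
    """Toy rule: an XID committed locally but not yet confirmed by the
    synchronous standby (commit LSN > sync rep LSN) is treated as running."""
    xip = set(in_progress)
    for xid, commit_lsn in commit_lsn_by_xid.items():
        if commit_lsn > sync_rep_lsn:
            xip.add(xid)
    return Snapshot(xip=xip)


def is_visible(xid, snapshot, commit_lsn_by_xid):
    """A transaction's changes are visible only if it committed and the
    snapshot does not list it as in progress."""
    return xid in commit_lsn_by_xid and xid not in snapshot.xip


# Assumed example values: XID 100 is replicated, XID 101 is committed only locally,
# XID 102 is genuinely still running.
commit_lsns = {100: 1000, 101: 2000}
snap = build_snapshot(in_progress={102}, commit_lsn_by_xid=commit_lsns, sync_rep_lsn=1500)
visible_100 = is_visible(100, snap, commit_lsns)
visible_101 = is_visible(101, snap, commit_lsns)
```

In this model a reader cannot base a decision on XID 101 until the standby has confirmed it, which is the property the paragraph above asks for. Whether the rest of the transactional machinery tolerates such a snapshot is exactly the open question.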

From a practical point of view, when all writing queries use the same synchronous_commit level, the easiest solution is to just disallow cancellation of sync replication. In psql we can just reset the connection on the second CTRL+C. That's more generic than "CANCEL --force".

When all queries run with the same synchronous_commit, there is no point in a protocol message for canceling sync rep for a single connection. Just drop that connection. Ignoring the cancel is the only way to satisfy the synchronous_commit level, which is constant for a transaction.
When queries run with various synchronous_commit levels, things are much more complicated. Adding a protocol message to change synchronous_commit for running queries does not seem to be a viable option.

> I continue to think that the root cause of this issue is that we can't
> distinguish between cancelling the query and cancelling the sync rep
> wait.
Yes, it is. But canceling the sync rep wait exists already: just change synchronous_standby_names. Canceling sync rep for one client is, effectively, changing the synchronous commit level for a running transaction. That opens the door to far more difficult complications.

> The client in this case is asking for both when it really only
> wants the former, and then ignoring the warning that the latter is
> what actually occurred.
The client is not ignoring warnings. Data is lost for the client that never received the warning. If we could just fix our code, I would not be making so much noise. There are workarounds, but they are not very pleasant to explain.
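
The scenario can be sketched as a toy Python model. Everything here is an assumption made for illustration: Node, commit, the keys and the message strings are hypothetical and do not reproduce PostgreSQL behavior or its actual warning text.

```python
class Node:
    """Hypothetical stand-in for a database server holding committed rows."""

    def __init__(self):
        self.rows = {}


primary, standby = Node(), Node()


def commit(key, value, wait_cancelled):
    """The local commit happens first; the row reaches the standby only if
    the sync rep wait is allowed to finish."""
    primary.rows[key] = value  # already visible to every other connection
    if wait_cancelled:
        # Placeholder message, seen only by the connection that cancelled.
        return "WARNING: commit may not have been replicated"
    standby.rows[key] = value
    return "COMMIT"


# Client A cancels the wait and receives the warning.
msg_a = commit("order-1", "paid", wait_cancelled=True)

# Client B only reads; no warning is ever delivered to it.
seen_by_b = primary.rows.get("order-1")

# Failover: the standby becomes the new primary without the row.
after_failover = standby.rows.get("order-1")
```

Client B acted on "paid" without any hint that the row was not replicated, and after failover that row is gone. No amount of care in handling warnings on client A's side helps client B.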


Best regards, Andrey Borodin.

