Re: Synchronous commit behavior during network outage - Mailing list pgsql-hackers

From Ondřej Žižka
Subject Re: Synchronous commit behavior during network outage
Date
Msg-id 70850d05-ec84-d6bc-10b5-3b23d3f36925@stratox.cz
In response to Re: Synchronous commit behavior during network outage  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On 06/05/2021 06:09, Andrey Borodin wrote:
> I could not understand your reasoning about 2 and 4 nodes. Can you please clarify a bit how 4 node setup can help
> prevent visibility of committed-locally-but-canceled transactions?
 
Hello Andrey,

The initial request (for us) was to have a geo cluster with 2 locations 
where it would be possible to have 2 sync replicas even if one location 
fails. This means having 2 nodes in each location (4 in total). If one 
location fails completely (broken network connection), Patroni will 
choose the working location (a 5-node etcd cluster spread over 3 
locations ensures this).
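
The etcd part can be checked with simple quorum arithmetic. This is only a sketch: the 2/2/1 split of the 5 etcd members and the location names are assumptions for illustration, since the message only says "5 node etcd in 3 locations".

```python
def has_quorum(members_per_location, failed_location):
    """Return True if the etcd cluster keeps a majority after one location is lost."""
    total = sum(members_per_location.values())
    alive = total - members_per_location[failed_location]
    majority = total // 2 + 1
    return alive >= majority

# Assumed layout (not stated in the message): 2 + 2 + 1 members.
etcd_layout = {"loc1": 2, "loc2": 2, "loc3": 1}

# Check every single-location failure.
survives = {loc: has_quorum(etcd_layout, loc) for loc in etcd_layout}
print(survives)
```

With that split, losing any single location still leaves at least 3 of 5 members, so the surviving side can keep or acquire the leader lock.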

In the initial state, each location has one sync node and one async 
replica that uses the sync node in its own location as its source.
Let's take the following initial situation:
1) Nodes pg11 and pg12 are in one location; nodes pg21 and pg22 are in 
another location.
2) Nodes pg11 and pg21 are in synchronous replication.
3) Node pg12 is an async replica of pg11.
4) Node pg22 is an async replica of pg21.
5) The master is pg11.
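
The layout above can be written down as a small data structure. This is a minimal sketch for illustration only: the location labels "A" and "B" and the `Node` type are hypothetical, and only the node names and replication links come from the list.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Node:
    name: str
    location: str            # "A" / "B" are placeholder labels
    role: str                # "primary", "sync", or "async"
    upstream: Optional[str]  # node this one replicates from; None for the primary

TOPOLOGY = [
    Node("pg11", "A", "primary", None),
    Node("pg12", "A", "async", "pg11"),
    Node("pg21", "B", "sync", "pg11"),
    Node("pg22", "B", "async", "pg21"),
]

def downstream_of(name: str) -> List[str]:
    """Names of the nodes that replicate directly from the given node."""
    return [n.name for n in TOPOLOGY if n.upstream == name]

print(downstream_of("pg11"))
print(downstream_of("pg21"))
```

Note that pg12 streams directly from pg11, so it does not depend on pg21 being healthy.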

When the committed-locally-but-canceled situation happens and the 
problem is only with node pg21 (not with the network between nodes), the 
async replica pg12 will receive the commit from pg11 right after the 
local commit on pg11, even if the cancellation happens. So the commit 
will be present on both pg11 and pg12. If pg11 then fails, the 
transaction already exists on pg12, and this node will be selected as 
the new leader (latest LSN).
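
The scenario can be sketched as a toy simulation. The LSN values below are made-up integers, and `pick_new_leader` is a hypothetical helper that models only the "latest LSN wins" rule mentioned above; it is not Patroni's actual election logic.

```python
def pick_new_leader(replayed_lsn, failed):
    """Pick the surviving node with the highest LSN (simplified rule)."""
    candidates = {n: lsn for n, lsn in replayed_lsn.items() if n not in failed}
    if not candidates:
        raise ValueError("no surviving node")
    return max(candidates, key=candidates.get)

# Hypothetical state: the canceled transaction committed locally at LSN 100.
# pg21 has a problem and is stuck at 90, so pg22 (fed by pg21) is stuck too.
# pg12 streams from pg11 and has already received LSN 100.
replayed_lsn = {"pg11": 100, "pg12": 100, "pg21": 90, "pg22": 90}

new_leader = pick_new_leader(replayed_lsn, failed={"pg11"})
print(new_leader)  # pg12
```

Under these assumptions the transaction survives the loss of pg11 because pg12 already holds it.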

There is a window between the local commit and the moment it has been 
sent to the async replica in which we can lose data, but I expect this 
window to be milliseconds (maybe less).

This will not prevent visibility, but it will ensure that the data is 
not lost. In that case, data can be visible on the leader even if it is 
not present on the sync replica, because the async replica ensures the 
continuity of data persistence.

I hope this explanation is clear.

Regards
Ondrej



