Re: Disallow cancellation of waiting for synchronous replication - Mailing list pgsql-hackers

From Maksim Milyutin
Subject Re: Disallow cancellation of waiting for synchronous replication
Msg-id f3ffc220-e601-cc43-3784-f9bba66dc382@gmail.com
In response to Re: Disallow cancellation of waiting for synchronous replication  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 21.12.2019 00:19, Tom Lane wrote:

>> There is still a problem when backend is not canceled, but terminated [2].
> Exactly.  If you don't have a fix that handles that case, you don't have
> anything.  In fact, you've arguably made things worse, by increasing the
> temptation to terminate or "kill -9" the nonresponsive session.


I assume that the case where terminating a backend brings down the whole 
PostgreSQL instance, as in Andrey's patch proposal, has to be handled by 
external HA agents. Such an agent, running as the parent process of the 
postmaster, could intercept these terminations and make the appropriate 
decision, e.g. restart the PostgreSQL node in a state closed to external 
users (via pg_hba.conf manipulation) until all sync replicas have 
synchronized the changes from the master. The Stolon HA tool implements 
this strategy [1]. This logic (waiting until all replicas declared in 
synchronous_standby_names have replicated all WAL from the master) could 
also be implemented inside the PostgreSQL core, after the recovery 
process starts and before the database is opened to users, and this can 
be done separately later.
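
To make the waiting condition concrete, here is a minimal sketch of the 
check an HA agent might run before reopening the node to users. It is 
only an illustration, not Stolon's code and not part of any patch: the 
function names, the argument layout, and the idea that the agent has 
already collected a flush LSN from each standby are my assumptions. The 
only PostgreSQL-specific fact used is the textual LSN format 
(two hexadecimal numbers separated by a slash).

```python
from typing import Mapping, Sequence


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def can_open_to_users(
    master_lsn: str,
    sync_standby_names: Sequence[str],
    flush_lsns: Mapping[str, str],
) -> bool:
    """Hypothetical helper: return True only if every standby listed in
    sync_standby_names has reported a flush LSN at or beyond master_lsn.

    flush_lsns maps a standby name to the last flush LSN the agent has
    seen for it; a standby that has not reported yet blocks reopening.
    """
    target = parse_lsn(master_lsn)
    for name in sync_standby_names:
        reported = flush_lsns.get(name)
        if reported is None or parse_lsn(reported) < target:
            return False
    return True
```

With such a check the agent would keep the node closed (for example via 
the pg_hba.conf manipulation mentioned above) while it returns False, 
and reopen access once it returns True.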

Another approach is to implement two-phase commit across the master and 
the sync replicas (as Oracle did in old versions [2]), where the risk of 
ending up with locally committed data after an instance restart or a 
query cancel is minimal (once the final commit phase has started). But 
this approach has a latency penalty, and it is complex to resolve 
partial (prepared but not committed) transactions automatically when the 
coordinator (in this case the master node) fails. It would be nice if 
this approach were implemented later as an option of synchronous commit.
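
For illustration, the two-phase idea can be sketched as a toy in-memory 
model. This is a simplified textbook form of the protocol, not Oracle's 
implementation and not a proposal for PostgreSQL internals; the Node 
class and the two_phase_commit function are hypothetical names of mine. 
It deliberately leaves out the hard part mentioned above: if the 
coordinator fails between the two phases, the participants are left 
with prepared but not committed transactions.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Toy participant (master or sync replica) in the commit protocol."""
    name: str
    reachable: bool = True
    prepared: Set[str] = field(default_factory=set)
    committed: Set[str] = field(default_factory=set)

    def prepare(self, txid: str) -> bool:
        # Phase 1: vote yes only if the node can durably prepare.
        if not self.reachable:
            return False
        self.prepared.add(txid)
        return True

    def commit(self, txid: str) -> None:
        # Phase 2: turn the prepared transaction into a committed one.
        self.prepared.discard(txid)
        self.committed.add(txid)

    def rollback(self, txid: str) -> None:
        self.prepared.discard(txid)


def two_phase_commit(txid: str, master: Node, replicas: List[Node]) -> bool:
    """Commit txid on all nodes, or on none if any node fails to prepare."""
    nodes = [master, *replicas]
    voted_yes: List[Node] = []
    for node in nodes:
        if node.prepare(txid):
            voted_yes.append(node)
        else:
            # Abort: undo the prepare on every node that already voted yes.
            for prepared_node in voted_yes:
                prepared_node.rollback(txid)
            return False
    # All nodes prepared; the final commit phase starts here.
    for node in nodes:
        node.commit(txid)
    return True
```

In this model nothing becomes locally committed on the master until 
every sync replica has prepared, which is where the extra round trip, 
and hence the latency penalty, comes from.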


1. https://github.com/sorintlab/stolon/blob/master/doc/syncrepl.md#handling-postgresql-sync-repl-limits-under-such-circumstances

2. https://docs.oracle.com/cd/B28359_01/server.111/b28326/repmaster.htm#i33607

-- 
Best regards,
Maksim Milyutin



