Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Support for N synchronous standby servers - take 2
Date
Msg-id 20150702194458.GH16267@alap3.anarazel.de
In response to Re: Support for N synchronous standby servers - take 2  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 2015-07-02 11:50:44 -0700, Josh Berkus wrote:
> So there's two parts to this:
> 
> 1. I need to ensure that data is replicated to X places.
> 
> 2. I need to *know* which places data was synchronously replicated to
> when the master goes down.
> 
> My entire point is that (1) alone is useless unless you also have (2).

I think there's a good set of use cases where that's really not the case.

> And do note that I'm talking about information on the replica, not on
> the master, since in any failure situation we don't have the old
> master around to check.

How would you, even theoretically, synchronize that knowledge to all the
replicas? Even when they're temporarily disconnected?

> Say you take this case:
> 
> "2" : { "local_replica", "london_server", "nyc_server" }
> 
> ... which should ensure that any data which is replicated is replicated
> to at least two places, so that even if you lose the entire local
> datacenter, you have the data on at least one remote data center.

> EXCEPT: say you lose both the local datacenter and communication with
> the london server at the same time (due to transatlantic cable issues, a
> huge DDOS, or whatever).  You'd like to promote the NYC server to be the
> new master, but only if it was in sync at the time its communication
> with the original master was lost ... except that you have no way of
> knowing that.

Pick up the phone, compare the LSNs, done.
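
A minimal sketch of that comparison, assuming the function names current
in 9.4/9.5 (later major releases renamed them to pg_last_wal_receive_lsn()
and pg_last_wal_replay_lsn()):

    -- run on each surviving standby
    SELECT pg_last_xlog_receive_location() AS received,
           pg_last_xlog_replay_location() AS replayed;

The candidate reporting the highest location has the most WAL.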

> Given that, we haven't really reduced our data loss potential or
> improved availability from the current 1-redundant synch rep.  We still
> need to wait to get the London server back to figure out if we want to
> promote or not.
> 
> Now, this configuration would reduce the data loss window:
> 
> "3" : { "local_replica", "london_server", "nyc_server" }
> 
> As would this one:
> 
> "2" : { "local_replica", "nyc_server" }
> 
> ... because we would know definitively which servers were in sync.  So
> maybe that's the use case we should be supporting?

If you want automated failover, you need a leader election amongst the
surviving nodes. The replay position is all they need to elect the node
that's furthest ahead, and that information exists today.
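
As a sketch of the comparison such an election needs, again assuming the
9.4/9.5 function names, pg_xlog_location_diff() is the existing helper for
ranking two positions:

    -- collect from each candidate standby
    SELECT pg_last_xlog_replay_location();

    -- positive result means the first position is further ahead
    SELECT pg_xlog_location_diff('0/3000060', '0/3000000');

Promote whichever surviving node reports the largest replay location.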

Greetings,

Andres Freund


