Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Support for N synchronous standby servers - take 2
Msg-id 55958804.9070002@agliodbs.com
In response to Re: Support for N synchronous standby servers - take 2  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 07/02/2015 11:31 AM, Andres Freund wrote:
> On 2015-07-02 11:10:27 -0700, Josh Berkus wrote:
>> If we're always going to be polling the replicas for furthest ahead,
>> then why bother implementing quorum synch at all? That's the basic
>> question I'm asking.  What does it buy us that we don't already have?
> 
> What do those topics have to do with each other? A standby fundamentally
> can be further ahead than what the primary knows about. So you can't do
> very much with that knowledge on the master anyway?
> 
>> I'm serious, here.  Without any additional information on synch state at
>> failure time, I would never use quorum synch.  If there's someone on
>> this thread who *would*, let's speak to their use case and then we can
>> actually get the feature right.  Anyone?
> 
> How would you otherwise ensure that your data is both on a second server
> in the same DC and in another DC? Which is a pretty darn common desire?

So there's two parts to this:

1. I need to ensure that data is replicated to X places.

2. I need to *know* which places data was synchronously replicated to
when the master goes down.

My entire point is that (1) alone is useless unless you also have (2).
And do note that I'm talking about information on the replica, not on
the master, since in any failure situation we don't have the old master
around to check.

Say you take this case:

"2" : { "local_replica", "london_server", "nyc_server" }

... which should ensure that any data which is replicated is replicated
to at least two places, so that even if you lose the entire local
datacenter, you have the data on at least one remote data center.

EXCEPT: say you lose both the local datacenter and communication with
the london server at the same time (due to transatlantic cable issues, a
huge DDoS, or whatever).  You'd like to promote the NYC server to be the
new master, but only if it was in sync at the time its communication
with the original master was lost ... except that you have no way of
knowing that.

Given that, we haven't really reduced our data loss potential or
improved availability from the current 1-redundant synch rep.  We still
need to wait to get the London server back to figure out if we want to
promote or not.

Now, this configuration would reduce the data loss window:

"3" : { "local_replica", "london_server", "nyc_server" }

As would this one:

"2" : { "local_replica", "nyc_server" }

... because we would know definitively which servers were in sync.  So
maybe that's the use case we should be supporting?
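
To make the comparison between the three configurations concrete, here is a
minimal Python sketch of the counting argument. It is an illustration only,
not PostgreSQL code: the function name and its arguments are hypothetical, and
it assumes that every acknowledged commit was confirmed by at least `quorum`
of the listed standbys. Under that assumption, a surviving standby is
guaranteed to hold all acknowledged data only when fewer than `quorum`
standbys are unreachable.

```python
def survivor_guaranteed_in_sync(quorum, standbys, reachable):
    """Return True if at least one reachable standby must hold every
    acknowledged commit (hypothetical helper, for illustration only).

    Assumes each acknowledged commit was confirmed by at least `quorum`
    of `standbys` before the master failed.
    """
    all_standbys = set(standbys)
    if not 1 <= quorum <= len(all_standbys):
        raise ValueError("quorum must be between 1 and the number of standbys")
    # Standbys we cannot contact might be exactly the ones that confirmed.
    unreachable = all_standbys - set(reachable)
    # If fewer than `quorum` standbys are missing, at least one confirming
    # standby is still among the reachable ones.
    return len(unreachable) < quorum


three = ["local_replica", "london_server", "nyc_server"]
two = ["local_replica", "nyc_server"]

# Local datacenter and London both lost; only NYC can be contacted.
print(survivor_guaranteed_in_sync(2, three, ["nyc_server"]))  # False
print(survivor_guaranteed_in_sync(3, three, ["nyc_server"]))  # True
print(survivor_guaranteed_in_sync(2, two, ["nyc_server"]))    # True
```

The first call is the problem case described above: with 2-of-3, the two
confirming servers may have been exactly the two that are now unreachable, so
nothing can be concluded about NYC. In the other two configurations every
listed standby had to confirm, so NYC is known to be in sync.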

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
