Re: Synch failover WAS: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Synch failover WAS: Support for N synchronous standby servers - take 2
Date
Msg-id 5595B30B.9030605@agliodbs.com
In response to Re: Support for N synchronous standby servers - take 2  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 07/02/2015 12:44 PM, Andres Freund wrote:
> On 2015-07-02 11:50:44 -0700, Josh Berkus wrote:
>> So there's two parts to this:
>>
>> 1. I need to ensure that data is replicated to X places.
>>
>> 2. I need to *know* which places data was synchronously replicated to
>> when the master goes down.
>>
>> My entire point is that (1) alone is useless unless you also have (2).
> 
> I think there's a good set of usecases where that's really not the case.

Please share!  My plea for use cases was sincere.  I can't think of any.

>> And do note that I'm talking about information on the replica, not on
>> the master, since in any failure situation we don't have the old
>> master around to check.
> 
> How would you, even theoretically, synchronize that knowledge to all the
> replicas? Even when they're temporarily disconnected?

You can't, which is why what we need to know, from the replica side, is
when the replica believes it was last in sync.  That is, a sync timestamp
and LSN from the last time the replica successfully ack'd a sync commit
back to the master.  Based on that information, I can make an informed
decision, even if I'm down to one replica.
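
To make that concrete, here is a minimal sketch in Python of the kind of
replica-side record I mean.  Everything in it is hypothetical: PostgreSQL
does not store a last-sync-ack timestamp or LSN on the replica today, and
the names SyncAck, parse_lsn and best_sync_candidate exist only for
illustration.  The one thing taken from PostgreSQL is the textual LSN
format, two hexadecimal numbers separated by a slash.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class SyncAck:
    """Hypothetical record a replica would keep about its last sync ack."""
    replica: str
    last_sync_ack_lsn: Optional[str]        # None: never ack'd a sync commit
    last_sync_ack_time: Optional[datetime]  # None: never ack'd a sync commit


def best_sync_candidate(acks: List[SyncAck]) -> Optional[SyncAck]:
    """Return the replica with the highest ack'd sync LSN, or None if no
    surviving replica ever ack'd a sync commit."""
    confirmed = [a for a in acks if a.last_sync_ack_lsn is not None]
    if not confirmed:
        return None
    return max(confirmed, key=lambda a: parse_lsn(a.last_sync_ack_lsn))
```

With a record like this, a replica that never ack'd a sync commit is
excluded outright, and the timestamp tells whoever is doing the failover
how stale the surviving replica's sync state might be.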

>> ... because we would know definitively which servers were in sync.  So
>> maybe that's the use case we should be supporting?
> 
> If you want automated failover you need a leader election amongst the
> surviving nodes. The replay position is all they need to elect the node
> that's furthest ahead, and that information exists today.

I can do that already.  If quorum sync commit doesn't help us minimize
data loss any better than async replication or the current 1-redundant
setup, why would we want it?  If it does help us minimize data loss, how?
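
For reference, the election by replay position described above amounts to
something like the following Python sketch.  It assumes the replay
positions have already been collected from each surviving node as LSN
text; the function names and the tie-breaking rule (lowest node name
wins) are my own assumptions, not anything PostgreSQL provides.

```python
from typing import Dict


def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def elect_furthest_ahead(replay_positions: Dict[str, str]) -> str:
    """Pick the node with the highest replay LSN.

    replay_positions maps a node name to its replay LSN as text.
    Ties go to the alphabetically first node name so that every
    participant running this function reaches the same answer.
    """
    if not replay_positions:
        raise ValueError("no surviving nodes to elect from")
    # max() returns the first maximal element, so iterating the names
    # in sorted order makes the tie-break deterministic.
    return max(sorted(replay_positions),
               key=lambda node: parse_lsn(replay_positions[node]))
```

Note that the LSNs have to be compared numerically: as plain strings,
"0/A000000" sorts after "0/10000000" even though it is the earlier
position.  And nothing in this comparison says whether the winner had
every commit the master acknowledged, which is the gap I keep asking
about.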

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


