Re: Synchronization levels in SR - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Synchronization levels in SR
Date
Msg-id 4BFD6225.70009@enterprisedb.com
In response to Re: Synchronization levels in SR  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 26/05/10 20:40, Simon Riggs wrote:
> On Wed, 2010-05-26 at 19:55 +0300, Heikki Linnakangas wrote:
>> If you set quorum to 1, it also becomes critical
>> infrastructure, because it's possible that a transaction has been
>> replicated to the test server but not the real production standby, and
>> a meteor strikes.
>
> Why would you not want to use the test server?

Because your failover procedures know nothing about the test server. 
Even if the data is there in theory, it'd be completely impractical to 
fetch it from there.

> If its the only thing
> left protecting you, and you wish to be protected, then it sounds very
> cool to me.  In my proposal this test server only gets data ahead of
> other things if the "real production standby" responds too slowly.

There are many reasons why a test server could respond faster than the 
production standby. Maybe the standby is on a different continent. Maybe 
you have fsync=off on the test server because it's just a test server. 
Either way, you want the master to ignore it for the purpose of 
determining if a commit is safe.
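
To make that concrete, here is a rough sketch of the check I have in mind. 
It is illustration only, not a proposed implementation: the Standby class, 
the counts_for_sync flag and commit_is_safe() are names made up for this 
example, and the LSNs are plain integers.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    name: str
    counts_for_sync: bool  # hypothetical per-standby flag, set by the DBA
    flushed_lsn: int       # last WAL position the standby reports as flushed


def commit_is_safe(commit_lsn, standbys, quorum=1):
    """Return True once enough *eligible* standbys have flushed the commit.

    Standbys with counts_for_sync=False (e.g. a test server) are ignored,
    no matter how quickly they acknowledge.
    """
    acks = sum(
        1
        for s in standbys
        if s.counts_for_sync and s.flushed_lsn >= commit_lsn
    )
    return acks >= quorum


test_server = Standby("test", counts_for_sync=False, flushed_lsn=200)
production = Standby("production", counts_for_sync=True, flushed_lsn=100)

# The test server is ahead, but it does not make the commit safe.
print(commit_is_safe(150, [test_server, production]))
```

With quorum=1 the commit only becomes safe once the production standby has 
flushed past it, regardless of what the test server reports.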

> It scares the **** out of people that a DBA can take down a server and
> suddenly the sync protection you thought you had is turned off.

Yeah, it depends on what you're trying to accomplish. If durability is 
absolutely critical to you (as opposed to availability), you don't want 
the commit to ever be acknowledged to the client until it's safely 
flushed to disk on the standby, even if that means refusing any further 
commits on the master until the standby reconnects and catches up.

OTOH, if you're not that worried about durability, but you're load 
balancing queries to the standby, you want to ensure that when you run a 
query against the standby, a transaction that committed on the master is 
also visible in the standby. In that scenario, if a standby can't be 
reached, it is simply pronounced dead, and the master can just ignore it 
until it reconnects.
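
The difference between the two scenarios boils down to what the master does 
when the standby is unreachable. A minimal sketch, with hypothetical names 
(Policy, may_acknowledge) that don't correspond to any existing code or 
setting, and with "has the commit" standing for whatever level the scenario 
needs (flushed to disk in the first case, visible to queries in the second):

```python
from enum import Enum


class Policy(Enum):
    DURABILITY_FIRST = "durability"      # never ack without the standby
    AVAILABILITY_FIRST = "availability"  # drop a dead standby and carry on


def may_acknowledge(policy, standby_connected, standby_has_commit):
    """Decide whether the master may report the commit to the client."""
    if standby_connected:
        # A live standby must have the commit in both scenarios.
        return standby_has_commit
    # Standby unreachable: durability-first keeps waiting (refusing the
    # commit), availability-first pronounces the standby dead and ignores it.
    return policy is Policy.AVAILABILITY_FIRST


print(may_acknowledge(Policy.DURABILITY_FIRST, False, False))
print(may_acknowledge(Policy.AVAILABILITY_FIRST, False, False))
```

While the standby is connected the two policies behave the same; they only 
diverge once it goes away.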

> That way
> of doing things means an application never knows the protection level
> any piece of data has had. App designers want to be able to marks things
> "handle with care" or "just do it quick, don't care much".

Yeah, that's useful too.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

