Re: Sync Rep Design - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Sync Rep Design
Msg-id 1293899759.1892.60040.camel@ebony
In response to Re: Sync Rep Design  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Sync Rep Design  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Sat, 2011-01-01 at 05:13 -0800, Jeff Janes wrote:
> On 12/31/10, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Fri, 2010-12-31 at 09:27 +0100, Stefan Kaltenbrunner wrote:
> >
> >> Maybe it has been discussed but I still don't see why it makes any
> >> sense. If I declare a standby a sync standby I want it to be sync, not
> >> "maybe sync". Consider the case of one master and two identical sync
> >> standbys: one sync standby is in the same datacenter, the other is in a
> >> backup location, say 15km away.
> >> Given there is a small constant latency to the second box (even if you
> >> have fast networks) the end effect is that the second standby will NEVER
> >> be sync (because the local one will always be faster) and you end up
> >> with an async slave that cannot be used per your business rules?
> >
> > Your picture above is a common misconception. I will add something to
> > the docs to explain this.
> >
> > 1. "sync" is a guarantee about how we respond to the client when we
> > commit. If we wait for more than one response that slows things down,
> > makes the cluster more fragile, complicates the code and doesn't
> > appreciably improve the guarantee.
> 
> Whether it is more fragile depends on whether you look at up-time
> fragility or durability fragility.  I think it can appreciably improve
> the guarantee.

Yes, agreed. That is why I proposed quorum commit earlier in 2010, as a
way to improve the durability guarantee. That was bogged down by the
requirement for named servers, which I see as unnecessary.
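
As a rough illustration only (this is the general idea of quorum commit,
not the design proposed in 2010), the rule can be sketched as: acknowledge
the commit to the client once at least `quorum` standbys report having
received the commit record, without caring which standbys they are. The
function name, the dictionary of acknowledgements and the pair
representation of WAL positions are all assumptions made for this example.

```python
def commit_may_return(commit_pos, standby_acks, quorum=1):
    """Decide whether a commit can be acknowledged to the client.

    commit_pos   -- WAL position of the commit record, as an (id, offset) pair
    standby_acks -- hypothetical map of standby name -> last position that
                    standby has acknowledged, in the same pair form
    quorum       -- number of standbys that must have reached commit_pos
    """
    # Count the standbys whose acknowledged position covers the commit record.
    reached = sum(1 for pos in standby_acks.values() if pos >= commit_pos)
    return reached >= quorum


# Made-up positions: only the local standby has reached the commit record.
acks = {"local": (0, 0x3000158), "remote": (0, 0x30000D8)}
print(commit_may_return((0, 0x3000100), acks, quorum=1))
print(commit_may_return((0, 0x3000100), acks, quorum=2))
```

With quorum=1 this matches "first response wins"; raising the quorum
trades commit latency for durability, and no server has to be named.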

> > 2. "sync" does not guarantee that the updates to the standbys are in any
> > way coordinated. You can run a query on one standby and get one answer
> > and at the exact same time run the same query on another standby and get
> > a different answer (slightly ahead/behind). That also means that if the
> > master crashes one of the servers will be ahead or behind. You can use
> > pg_last_xlog_receive_location() to check which one that is.
> 
> If at least one of the standbys is in the same smoking crater as the
> primary, then pg_last_xlog_receive_location on it is unlikely to
> respond.
> 
> The guarantee goes away precisely when it is needed.

Fairly obviously, I would not be advocating anything that forced you to
use a server in the "same smoking crater". I can't see any guarantee
that goes away precisely when it is needed.
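
For reference, the check mentioned in point 2 could look something like
the sketch below. It assumes you have already run
SELECT pg_last_xlog_receive_location() on each surviving standby and
collected the text results; how they are collected is left out, and the
helper names and sample values are made up for the example. The two
halves of the location are hexadecimal, so they need to be compared as
numbers rather than as strings.

```python
def parse_xlog_location(text):
    """Split an 'X/Y' location string into a pair of integers (both hex)."""
    hi, lo = text.split("/")
    return int(hi, 16), int(lo, 16)


def most_advanced(locations):
    """Return the name of the standby that has received the most WAL.

    locations -- map of standby name -> location text as returned by
                 pg_last_xlog_receive_location() on that standby
    """
    # Pairs compare element by element, so the first half dominates.
    return max(locations, key=lambda name: parse_xlog_location(locations[name]))


# Made-up values for two standbys after the master has gone away.
reported = {"local": "0/3000158", "remote": "0/30000D8"}
print(most_advanced(reported))
```

Only the standbys that are still reachable need to answer; the one
returned is the candidate to promote.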

Perhaps you could explain the issue you see, because your comments seem
unrelated to my point above.

-- 
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services


