Re: Sync Rep Design - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Sync Rep Design
Msg-id 1293882875.1892.56143.camel@ebony
In response to Re: Sync Rep Design  (Hannu Krosing <hannu@2ndquadrant.com>)
Responses Re: Sync Rep Design  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, 2010-12-31 at 22:18 +0100, Hannu Krosing wrote:
> On 31.12.2010 13:40, Heikki Linnakangas wrote:
> >
> > Sounds good.
> >
> > I still don't like the synchronous_standbys='' and 
> > synchronous_replication=on combination, though. IMHO that still 
> > amounts to letting the standby control the behavior on master, and it 
> > makes it impossible to temporarily add an asynchronous standby to the mix.
> A sync standby _will_have_ the ability to control the master anyway by 
> simply being there or not.
> 
> What is currently proposed is having dual power lines / dual UPSes and
> working happily on when one of them fails.
> Requiring both of them to be present defeats the original purpose of
> doubling them.
> 
> So following Simon's design of 2 standbys and only one required to ACK to
> commit you get 2X the reliability of a single standby.
...

Yes, working out the math is a good idea. Things are much clearer if we
do that.

Let's assume we have 98% availability on any single server.

1. With one primary and 2 standbys, either of which can acknowledge,
and no lock-up if both standbys fail, we will have 99.9992% server
availability. (So PostgreSQL hits "5 Nines", with data guarantees).
("Maximised availability")

2. With one primary and 2 standbys, either of which can acknowledge,
and a lock-up if both standbys fail, to protect the data, we will have
99.996% availability. Slightly less availability, but we don't put
data at risk at any time, since any commit is always covered by at least
2 servers. ("Maximised protection")

3. If we have a primary and a single standby which must acknowledge, and
we choose to lock up if the standby fails, then we will have only 96.04%
availability.

4. If we have a primary and two standbys (named or otherwise), both of
which must acknowledge or we lock up the master, then we have an awesome
94.12% availability.

On the last two, there is also an increased likelihood of administrative
cock-ups because of more specific and complex config requirements.
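
The arithmetic behind these cases can be sketched in a few lines. This
is not part of the original message: it assumes that server failures are
independent, that each server is up 98% of the time, and that failover
to a surviving server is instantaneous. The variable names are made up
for illustration. Case 2 depends on what counts as "available", so the
sketch prints two possible readings and does not claim that either is
the derivation used above.

```python
# Sketch only: assumes independent failures and instant failover.
A = 0.98      # availability of a single server (figure from the message)
F = 1 - A     # probability that a single server is down


def pct(p):
    """Express a probability as a percentage, rounded to 4 decimals."""
    return round(100 * p, 4)


# Case 1: the service is lost only if all three servers are down.
case1 = 1 - F ** 3

# Case 2, reading (a): commits proceed while at least one of the
# two standbys is up (primary assumed up).
case2_one_standby_up = 1 - F ** 2

# Case 2, reading (b): at least two of the three servers are up,
# so every commit is covered by two servers.
case2_two_of_three = A ** 3 + 3 * A ** 2 * F

# Case 3: the primary and its single standby must both be up.
case3 = A ** 2

# Case 4: the primary and both standbys must all be up.
case4 = A ** 3

results = {
    "1. any one of three up": case1,
    "2a. at least one standby up": case2_one_standby_up,
    "2b. at least two of three up": case2_two_of_three,
    "3. primary and one standby up": case3,
    "4. primary and both standbys up": case4,
}

for label, p in results.items():
    print(f"{label}: {pct(p)}%")
```

Changing A shows how quickly cases 3 and 4 degrade as more servers are
required to be up at the same time.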

-- 
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services


