On Sat, Jan 1, 2011 at 6:54 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Yes, working out the math is a good idea. Things are much clearer if we
> do that.
>
> Let's assume we have 98% availability on any single server.
>
> 1. Having one primary and 2 standbys, either of which can acknowledge,
> and we never lock up if both standbys fail, then we will have 99.9992%
> server availability. (So PostgreSQL hits "5 Nines", with data
> guarantees). ("Maximised availability")

I don't agree with this math. If the master and one standby fail
simultaneously, the other standby is useless, because it may or may
not be caught up with the master. You know that the last transaction
acknowledged as committed by the master is on at least one of the two
standbys, but you don't know which one, and so you can't safely
promote the surviving standby.

(If you are working in an environment where promoting the surviving
standby when it's possibly not caught up is OK, then you don't need
sync rep in the first place: you can just run async rep and get much
better performance.)

So the availability is 98% (you are up when the master is up) + 98%^2
* 2% (you are up when both standbys are up and the master is down) =
99.92%. If you had only a single standby, then you could be certain
that any commit acknowledged by the master was on that standby. Thus
your availability would be 98% (up when master is up) + 98% * 2% (you
are up when the master is down and the standby is up) = 99.96%.
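
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the three figures discussed above. It assumes server failures
are independent and that every server has the same availability p; the
function names are made up for illustration and are not part of any
PostgreSQL tooling.

```python
# Rough availability arithmetic, assuming independent failures and the
# same availability p for every server. All names are hypothetical.

def claimed_two_standbys(p):
    # The quoted figure: up unless all three servers are down at once.
    return 1 - (1 - p) ** 3

def safe_two_standbys(p):
    # Up when the master is up, or when the master is down and BOTH
    # standbys are up (a lone surviving standby cannot be safely promoted).
    return p + (1 - p) * p ** 2

def one_standby(p):
    # Up when the master is up, or when the master is down and the
    # single standby is up.
    return p + (1 - p) * p

p = 0.98
for label, value in [
    ("claimed, two standbys", claimed_two_standbys(p)),
    ("safe promotion, two standbys", safe_two_standbys(p)),
    ("one standby", one_standby(p)),
]:
    print(f"{label}: {value:.4%}")
```

With p = 0.98 this reproduces roughly 99.9992%, 99.92% and 99.96%
respectively.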
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company