Re: Issues with Quorum Commit - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Issues with Quorum Commit
Msg-id 4CAE0870.7010307@bluegap.ch
In response to Re: Issues with Quorum Commit  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-hackers
On 10/07/2010 06:41 PM, Greg Smith wrote:
> The cost of hardware capable of running a database server is a large
> multiple of what you can build an alerting machine for.

You realize you don't need lots of disks nor RAM for a box that only
ACKs? A box with two SAS disks and a BBU isn't that expensive anymore.

> I do not disagree with your theory or reasoning.  But as a practical
> matter, I'm afraid the true cost of the better guarantee you're
> suggesting here is additional code complexity that will likely cause
> this feature to miss 9.1 altogether.  As far as I'm concerned, this
> whole diversion into the topic of quorum commit is only consuming
> resources away from targeting something achievable in the time frame of
> a single release.

So far I've been under the impression that Simon already has the code
for quorum_commit k = 1.

What I'm opposed to is the timeout "feature", which I consider to be
additional code, unneeded complexity and a foot-gun.
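
To make the distinction concrete, here is a minimal sketch of the two
behaviours as I understand them. It is not PostgreSQL code; the function
name, its parameters and the returned strings are all hypothetical and
only illustrate what the client gets told in each case.

```python
def commit_outcome(acks_received: int, k: int,
                   timed_out: bool, timeout_enabled: bool) -> str:
    """Hypothetical model of what the master reports for one commit.

    acks_received   -- number of standbys that have ACKed the commit
    k               -- quorum size (k = 1 means one standby ACK suffices)
    timed_out       -- whether the (assumed) ACK wait timeout has expired
    timeout_enabled -- whether the timeout "feature" is switched on
    """
    if acks_received >= k:
        # Quorum reached: the commit is confirmed by at least k standbys.
        return "committed"
    if timeout_enabled and timed_out:
        # The foot-gun: success is reported although fewer than k
        # standbys have confirmed the commit.
        return "committed-without-quorum"
    # Without a timeout the commit keeps waiting for the missing ACKs.
    return "waiting"
```

With k = 1 and no ACK, the timeout variant reports success for a commit
that exists on the master only, while the variant without a timeout keeps
waiting.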

> You cannot make a single server reliable enough to survive all of
> the things that Murphy's Law will inflict upon it, at any price.

That's exactly my point, and it applies to two servers as well. It is
also why a timeout is a bad thing here: there is a real chance that the
second node fails as well (and it is higher than you think, according to
Murphy).

> For
> most of the businesses I work with who want sync rep, data is not
> considered safe until the second copy is on storage miles away from the
> original, because they know this too.

Now, those are the people who really need sync rep, yes. How happy do
you think those businesses would be to find out that Postgres is
cheating on them in case of a network outage, for example? Do they
really value (write!) availability more than data safety?

> Silly
> me, I'd only spread them across two adjacent states with different power
> providers!  Not nearly good enough to avoid a correlated failure.

Thanks for sharing this. I hope you didn't lose data.

Regards

Markus Wanner

