Re: Issues with Quorum Commit - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Issues with Quorum Commit
Date
Msg-id 4CAC91F4.5020903@bluegap.ch
In response to Re: Issues with Quorum Commit  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 10/06/2010 04:20 PM, Simon Riggs wrote:
> Ending the wait state does not cause data loss. It puts you at *risk* of
> data loss, which is a different thing entirely.

These kinds of risk scenarios are what sync replication is all about. A
minimum guarantee that doesn't hold in the face of the first few failures
(see Jeff's argument) isn't worth a dime.

Keep in mind that upon failure, the other nodes presumably get more
load. As has been seen with RAID, that easily leads to subsequent
failures. Sync rep needs to be able to protect against that *as well*.

> If you want to avoid data loss you use N+k redundancy and get on with
> life, rather than sitting around waiting.

With that notion, I'd argue that quorum_commit needs to be set to
exactly k, because any higher value would only cost performance without
any useful benefit.

But if I want at least k ACKs and if I think it's worth the performance
penalty that brings during normal operation, I want that guarantee to
hold true *especially* in case of an emergency.

If availability is more important, you need to increase N and make sure
enough of these (asynchronously) replicated nodes stay up. Increase k
(and thus quorum_commit) for a stronger durability guarantee.
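
To make the rule argued for above concrete, here is a minimal sketch in
Python. It is not PostgreSQL code: the function names and the idea of
counting "standbys up" are assumptions made purely for illustration, and
quorum_commit is treated simply as the number k of ACKs a commit waits for.

```python
def commit_may_return(acks_received: int, quorum_commit: int) -> bool:
    """Hypothetical rule: a commit returns only once at least
    quorum_commit standbys have acknowledged it. The wait is never
    ended early just because standbys have failed."""
    if acks_received < 0 or quorum_commit < 0:
        raise ValueError("counts must be non-negative")
    return acks_received >= quorum_commit


def spare_standbys(standbys_up: int, quorum_commit: int) -> int:
    """How many more standbys may fail before commits have to block,
    assuming every standby that is up eventually ACKs."""
    if standbys_up < 0 or quorum_commit < 0:
        raise ValueError("counts must be non-negative")
    return max(standbys_up - quorum_commit, 0)


if __name__ == "__main__":
    # k = 2 with three standbys up: one more failure is tolerable.
    print(spare_standbys(standbys_up=3, quorum_commit=2))
    # Only one ACK so far with k = 2: the commit keeps waiting.
    print(commit_may_return(acks_received=1, quorum_commit=2))
    # k = 0: the commit never waits, i.e. fully async behaviour.
    print(commit_may_return(acks_received=0, quorum_commit=0))
```

Under these assumptions, adding standbys raises the number of failures
that can be absorbed before commits block (availability), while raising
quorum_commit raises the number of copies every returned commit is known
to have (durability).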

> Putting in a feature for people that choose k=0 seems wasteful to me,
> since they knowingly put themselves at risk in the first place.

Given the above logic, k=0 amounts to completely async replication. Not
sure what's wrong with that.

Regards

Markus Wanner

