Re: Issues with Quorum Commit - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Issues with Quorum Commit
Msg-id 1286318440.2025.3353.camel@ebony
In response to Re: Issues with Quorum Commit  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Issues with Quorum Commit
List pgsql-hackers
On Tue, 2010-10-05 at 15:14 -0700, Josh Berkus wrote:

> > I can only presume that Josh wants to prevent us from adopting a
> > design that allows sync against multiple standbys.
> 
> Quorum commit == "X servers need to ack for commit", where X > 1.
> Usually done as "X out of Y servers must ack", but it's not a given that
> the master needs to know how many servers there are, just how many ack'ed.
> 
> And I'm not against it; I'm just pointing out that it gives us some
> issues which we don't have with a single standby, and thus quorum commit
> ought to be treated as a separate feature in 9.1 development.

OK, so I did understand you correctly.

Heikki had argued that a use case existed where Y out of Y (i.e. all)
nodes must acknowledge before we commit. That was the use case that
required us to have standby registration. It was optional in all other
cases.

We should note that Oracle only allows X=1, i.e. first acknowledgement
releases waiter. My patch provides X=1 only and takes advantage of the
simpler in-memory data structures as a result.
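To make the X-of-Y distinction concrete, here is a minimal sketch of the release rule. It is not code from either patch; the function name, the dictionary of per-standby acknowledged WAL positions, and the integer positions are all assumptions made for illustration. With X=1 the answer is simply the highest position any standby has acknowledged, which is why nothing beyond a single value needs to be tracked in that case. Note also that the function never needs to know Y, only which acks have arrived.

```python
def released_lsn(ack_lsns, x=1):
    """Return the highest WAL position acknowledged by at least x standbys.

    ack_lsns is a hypothetical mapping of standby name -> last acknowledged
    WAL position (plain integers here). Returns None if fewer than x
    standbys have acknowledged anything, i.e. no waiter can be released.
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    if len(ack_lsns) < x:
        return None
    # The x-th largest acknowledged position is the furthest point that
    # at least x standbys have confirmed.
    return sorted(ack_lsns.values(), reverse=True)[x - 1]


acks = {"standby_a": 100, "standby_b": 250, "standby_c": 180}
first_ack = released_lsn(acks, x=1)   # first acknowledgement releases waiter
two_acks = released_lsn(acks, x=2)    # quorum of two
```

A commit whose WAL position is at or below the returned value could be released; anything above it keeps waiting.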

> >> The master can not roll back or cancel the transaction. That's 
> >> completely infeasible, the WAL record has been written to local disk 
> >> already. The best it can do is halt and wait for enough standbys to 
> >> appear to fulfill the quorum. The client will hang waiting for the 
> >> COMMIT to finish, and the transaction will appear as in-progress to 
> >> other transactions.
> > 
> > Yes, that point has long been understood. Neither patch does this, and
> > in fact the issue is a completely general one.
> 
> So, in that case, if it's been 10 minutes, and we're still not getting
> ack from standbys, what's the exit strategy for the hapless DBA?
> Practically speaking?  Without restarting the master?
> 
> Last I checked, our goal with synch standby was to increase availability,
> not decrease it.  This is, however, not an issue with quorum commit, but
> an issue with sync rep in general.

Completely agree. When we had that discussion some months/weeks back, we
spoke about having a timeout. My patch has implemented a timeout,
followed by a COMMIT. That allows increased availability, as you say.

You would also be able to specifically release all/some transactions
from wait state with a simple function pg_cancel_sync_wait() (or similar
name).
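As a rough illustration of the behaviour described above, the sketch below models a single waiting commit that can be released by a standby ack, by a timeout, or by an explicit cancel. It is only a model under stated assumptions: the SyncWaiter class and its reason strings are hypothetical, and nothing here reflects how the patch is actually implemented or how a function such as pg_cancel_sync_wait() would really work.

```python
import threading


class SyncWaiter:
    """Hypothetical model of one backend waiting for a sync rep ack."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = threading.Event()
        self.reason = None  # "ack", "timeout" or "cancelled" once released

    def release(self, reason):
        # The first release wins; later calls leave the reason unchanged.
        with self._lock:
            if self.reason is None:
                self.reason = reason
                self._done.set()

    def wait(self, timeout_s):
        # Block until released or until the timeout expires. On timeout the
        # commit goes ahead anyway, which is what preserves availability.
        if not self._done.wait(timeout_s):
            self.release("timeout")
        return self.reason


waiter = SyncWaiter()
outcome = waiter.wait(timeout_s=0.01)   # no standby answers in time
```

In this model an administrator-driven release would just be a call such as waiter.release("cancelled") from another thread, and the waiting commit returns to the client immediately.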

> > Could the person that wrote that actually explain what a "specific
> > window of synchronicity" is? I'm not sure whether to agree, or disagree.
> 
> A specific amount of time within which all nodes will be consistent
> regarding that specific transaction.

Certainly no patch offers that. I'm not sure such a possibility exists.
Asking for higher X does make that situation worse.

> >> You start a new one from the latest base backup and let it catch up? 
> >> Possibly modifying the config file in the master to let it know about 
> >> the new standby, if we go down that path. This part doesn't seem 
> >> particularly hard to me.
> > 
> > Agreed, not sure of the issue there.
> 
> See previous post.  The critical phrase is *without restarting the
> master*.  AFAICT, no patch has addressed the need to change the master's
> synch configuration without restarting it.  It's possible that I'm not
> following something, in which case I'd love to have it pointed out.

My patch does not require a restart of the master to add/remove sync rep
nodes. They just come and go as needed. 

I don't think Fujii's patch would have a great problem with that either,
but I can't speak for that with precision.

-- 
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


