Re: Synchronous replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Synchronous replication
Date
Msg-id AANLkTimwA0_RphV-_mNsr3+FN==sb68dtYtAcBLS0+bT@mail.gmail.com
In response to Re: Synchronous replication  (Aidan Van Dyk <aidan@highrise.ca>)
Responses Re: Synchronous replication
List pgsql-hackers
On Wed, Jul 21, 2010 at 9:52 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> * Fujii Masao <masao.fujii@gmail.com> [100721 03:49]:
>
>> >> The patch provides a quorum parameter in postgresql.conf, which
>> >> specifies how many standby servers a transaction commit waits for
>> >> WAL records to be replicated to before the command returns a
>> >> "success" indication to the client. The default value is zero, which
>> >> means transaction commit never waits for replication, regardless of
>> >> replication_mode. Likewise, transaction commit never waits for
>> >> replication to an asynchronous standby (i.e., one whose
>> >> replication_mode is set to async), regardless of this parameter. If
>> >> quorum is more than the number of synchronous standbys, transaction
>> >> commit returns "success" once the ACK has arrived from all of the
>> >> synchronous standbys.
>> >
>> > There should be a way to specify "wait for *all* connected standby servers
>> > to acknowledge"
>>
>> Agreed. I'll allow -1 as the valid value of the quorum parameter, which
>> means that transaction commit waits for all connected standbys.
>
> Hm... so if my 1 synchronous standby is operating normally, and quorum
> is set to 1, I'll get what I want (commit waits until it's safely on both
> servers).  But what happens if my standby goes bad?  Suddenly the quorum
> setting is ignored (because it's > number of connected standby
> servers?)  Is there a way for me to not allow any commits if the quorum
> setting number of standbys is *not* available?  Yes, I want my db to
> "halt" in that situation, and yes, alarm bells will be ringing...
>
> In reality, I'm likely to run 2 synchronous slaves, with quorum of 1.
> So 1 slave can fail and I can still have 2 going.  But if that 2nd slave
> ever failed while the other was down, I definitely don't want the master
> to forge on ahead!
>
> Of course, this won't be for everyone, just as the current "just
> connected standbys" isn't for everything either...

Yeah, we need to clear up the detailed design of the quorum commit
feature and reach consensus on it.

How should synchronous replication behave when the number of connected
standby servers is less than quorum?

1. Ignore quorum. The current patch adopts this. If the ACKs from all
   connected standbys have arrived, transaction commit is successful
   even if the number of standbys is less than quorum. If there is no
   connected standby, transaction commit is always successful without
   regard to quorum.

2. Observe quorum. Aidan wants this. Until the number of connected
   standbys has become more than or equal to quorum, transaction commit
   waits.

Which is the right behavior for quorum commit? Or should we add a new
parameter specifying the behavior of quorum commit?
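
To make the two options concrete, here is a rough sketch of the commit
decision in Python. It is only an illustration of the semantics discussed
in this thread, not code from the patch: the names QuorumPolicy and
commit_may_return are made up for this example, and treating quorum = -1
the same way under both policies is an assumption, since that case has not
been settled yet.

```python
from enum import Enum


class QuorumPolicy(Enum):
    """Hypothetical switch between the two behaviors discussed above."""
    IGNORE = "ignore"    # option 1: fall back to "all connected standbys"
    OBSERVE = "observe"  # option 2: block until quorum standbys are connected


def commit_may_return(quorum, connected, acked, policy):
    """Return True if commit may report success to the client.

    quorum    -- 0 means never wait, -1 means wait for all connected
                 standbys, N > 0 means wait for N ACKs
    connected -- number of connected synchronous standbys
    acked     -- number of those standbys that have ACKed the WAL
    policy    -- what to do when connected < quorum
    """
    if quorum == 0:
        # Default: commit never waits for replication.
        return True
    if quorum == -1:
        # Assumption: "all connected" is evaluated the same under both policies.
        return acked >= connected
    if connected < quorum:
        if policy is QuorumPolicy.OBSERVE:
            # Option 2: keep waiting until enough standbys are connected.
            return False
        # Option 1: quorum is ignored; ACKs from all connected standbys suffice.
        return acked >= connected
    # Enough standbys are connected: wait for quorum ACKs.
    return acked >= quorum
```

With quorum = 1 and no connected standby, this returns True under IGNORE
(the master carries on alone) and False under OBSERVE (the master halts,
which is what Aidan asked for).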

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

