Re: Synchronization levels in SR - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Synchronization levels in SR |
Date | |
Msg-id | 1274881067.6203.3024.camel@ebony |
In response to | Re: Synchronization levels in SR (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: Synchronization levels in SR |
List | pgsql-hackers |
On Wed, 2010-05-26 at 18:52 +0900, Fujii Masao wrote:

> > To summarise, I think we can get away with just 3 parameters:
> > synchronous_replication = N     # similar in name to synchronous_commit
> > synch_rep_timeout = T
> > synch_rep_timeout_action = commit | abort
>
> I agree to add the latter two parameters, which are also listed on
> my outline of SynchRep.
> http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability
>
> > Conceptually, this is "I want at least N replica copies made of my
> > database changes, I will wait for up to T milliseconds to get that
> > otherwise I will do X". Very easy and clear for an application to
> > understand what guarantees it is requesting. Also very easy for the
> > administrator to understand the guarantees requested and how to
> > provision for them: to deliver robustness they typically need N+1
> > servers, or for even higher levels of robustness and performance N+2
> > etc..
>
> I don't feel that "synchronous_replication" approach is intuitive for
> the administrator. Even on this thread, some people seem to prefer
> "per-standby" setting.

Maybe they do, but that is because nobody has yet explained how you
would handle failure modes with per-standby settings. When you do they
will likely change their minds. Put the whole story on the table before
trying to force a decision.

> Without "per-standby" setting, when there are two standbys, one is in
> the near rack and another is in remote site, "synchronous_replication=1"
> cannot guarantee that the near standby is always synch with the master.
> So when the master goes down, unfortunately we might have to failover to
> the remote standby.

If the remote server responded first, then that proves it is a better
candidate for failover than the one you think of as near. If the two
standbys vary over time then you have network problems that will
directly affect the performance on the master; synch_rep = N would
respond better to any such problems.

> OTOH, "synchronous_replication=2" degrades the
> performance on the master very much.

Yes, but only because you have only one near standby. It would clearly
be foolish to make this setting without 2+ near standbys. We would then
have 4 or more servers; how do we specify everything for that config??

> "synchronous_replication" approach
> doesn't seem to cover the typical use case.

You described the failure modes for the quorum proposal, but avoided
describing the failure modes for the "per-standby" proposal.

Please explain what will happen when the near server is unavailable,
with per-standby settings. Please also explain what will happen if we
choose to have 4 or 5 servers to maintain performance in case of the
near server going down. How will we specify the failure modes?

> Also, when "synchronous_replication=1" and one of synchronous standbys
> goes down, how should the surviving standby catch up with the master?
> Such standby might be too far behind the master. The transaction commit
> should wait for the ACK from the lagging standby immediately even if
> there might be large gap? If yes, "synch_rep_timeout" would screw up
> the replication easily.

That depends upon whether we send the ACK at point #2, #3 or #4. It
would only cause a problem if you waited until #4.

I've explained why I have made the proposals I've done so far: reduced
complexity in failure modes and better user control. To understand that
better, you or somebody needs to explain how we would handle the failure
modes with "per-standby" settings so we can compare.

--
 Simon Riggs           www.2ndQuadrant.com
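
Editorial sketch, not part of the original message: the quorum semantics
proposed in this thread ("at least N replica copies, wait up to T
milliseconds, otherwise do X") can be modelled as a small decision
function. The parameter names come from the message. Everything else is
an assumption made for illustration only: the function name
commit_outcome, the ack_delays_ms input, and the treatment of N = 0 as
"no standby ACK requested". It is not PostgreSQL code and does not
describe any shipped behaviour.

```python
from typing import Iterable, Optional


def commit_outcome(
    ack_delays_ms: Iterable[Optional[float]],
    synchronous_replication: int,
    synch_rep_timeout: float,
    synch_rep_timeout_action: str = "commit",
) -> str:
    """Model the fate of one commit under the three proposed parameters.

    ack_delays_ms is a hypothetical input: for each standby, the delay in
    milliseconds until its ACK arrives, or None if the standby is down
    and never answers.
    """
    if synch_rep_timeout_action not in ("commit", "abort"):
        raise ValueError("synch_rep_timeout_action must be 'commit' or 'abort'")

    # Assumption for this sketch: N = 0 means no standby ACK is requested.
    if synchronous_replication <= 0:
        return "commit"

    # Count the ACKs that arrive within T milliseconds, from any standby.
    acks_in_time = sum(
        1 for d in ack_delays_ms if d is not None and d <= synch_rep_timeout
    )

    if acks_in_time >= synchronous_replication:
        return "commit"  # at least N copies confirmed in time

    # Fewer than N ACKs within T: fall back to the configured action.
    return synch_rep_timeout_action


# Near standby down, remote standby answers in 80 ms, N = 1:
# the commit still succeeds because any one standby satisfies the quorum.
print(commit_outcome([None, 80.0], 1, 100.0, "abort"))
```

In this model the near-standby-down case needs no extra configuration
with N = 1, whereas N = 2 with only two standbys falls through to
synch_rep_timeout_action as soon as either one is unavailable.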