Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Support for N synchronous standby servers - take 2 |
Date | |
Msg-id | CAB7nPqQ5dOBFqUL8OfKzjJA7JGf_gqZsO+c8YwWYeZPcXgeH6A@mail.gmail.com |
In response to | Re: Support for N synchronous standby servers - take 2 (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Support for N synchronous standby servers - take 2
Re: Support for N synchronous standby servers - take 2 Re: Support for N synchronous standby servers - take 2 Re: Support for N synchronous standby servers - take 2 |
List | pgsql-hackers |
On Thu, Jun 25, 2015 at 8:32 PM, Simon Riggs wrote:
> Let's start with a complex, fully described use case then work out how to
> specify what we want.

Well, one of the simplest cases where quorum commit and this feature would be useful is the following, with 2 data centers:
- on center 1, master A and standby B
- on center 2, standby C and standby D

With the current synchronous_standby_names, what we can do now is ensure that one node has acknowledged the commit of master. For example synchronous_standby_names = 'B,C,D'. But you know that :)

What this feature would allow us to do is, for example, ensure that a node on data center 2 has acknowledged the commit of master, meaning that even if data center 1 is completely lost for one reason or another we have at least one node on center 2 that has lost no data at transaction commit.

Now, regarding the way to express that, we need to use a concept of node group for each element of synchronous_standby_names. A group contains a set of elements, each element being a group or a single node. And for each group we need to know three things when a commit needs to be acknowledged:
- Does my group need to acknowledge the commit?
- If yes, how many elements in my group need to acknowledge it?
- Does the order of my elements matter?

That's where the micro-language idea makes sense. For example, we can define a group using separators, like (elt1,...eltN) or [elt1,elt2,eltN]. Putting a number in front of a group is essential as well for quorum commits. Hence for example, assuming that '()' is used for a group whose element order does not matter:
- k(elt1,elt2,eltN) means that we need any k elements in the set to return true (aka commit confirmation).
- k[elt1,elt2,eltN] means that we need the first k elements in the set to return true.

When k is not defined for a group, k = 1.
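
To make the proposed semantics concrete, here is a minimal Python sketch of how such a micro-language could be parsed and evaluated. It is only an illustration of the rules stated in this message, not PostgreSQL code: the names Group, parse and satisfied are hypothetical, a bare comma-separated list is treated as 1(...) as the message states for backward compatibility, and "first k elements" is read literally as the first k entries of the list.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Group:
    k: int                      # how many elements must acknowledge
    ordered: bool               # True for k[...], False for k(...)
    elements: List["Element"]   # node names or nested groups


Element = Union[str, Group]


def parse(spec: str) -> Group:
    """Parse a spec such as '2(B,(C,D))' or 'B,C,D' into a Group tree."""
    text = spec.replace(" ", "")
    pos = 0

    def parse_element() -> Element:
        nonlocal pos
        start = pos
        # Optional leading count, e.g. the "2" in "2(B,C)".
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        digits = text[start:pos]
        if pos < len(text) and text[pos] in "([":
            ordered = text[pos] == "["
            closing = "]" if ordered else ")"
            pos += 1
            elements = parse_list()
            if pos >= len(text) or text[pos] != closing:
                raise ValueError(f"expected {closing!r} at position {pos}")
            pos += 1
            # When k is not given for a group, k = 1.
            return Group(int(digits) if digits else 1, ordered, elements)
        # Not a group: rewind and read a plain node name.
        pos = start
        while pos < len(text) and text[pos] not in ",()[]":
            pos += 1
        if pos == start:
            raise ValueError(f"expected a node name at position {pos}")
        return text[start:pos]

    def parse_list() -> List[Element]:
        nonlocal pos
        elements = [parse_element()]
        while pos < len(text) and text[pos] == ",":
            pos += 1
            elements.append(parse_element())
        return elements

    elements = parse_list()
    if pos != len(text):
        raise ValueError(f"unexpected character at position {pos}")
    if len(elements) == 1 and isinstance(elements[0], Group):
        return elements[0]
    # Bare top-level list: elt1,elt2,eltN is treated as 1(elt1,elt2,eltN).
    return Group(1, False, elements)


def satisfied(elt: Element, acked: Set[str]) -> bool:
    """Return True if the set of acknowledging nodes satisfies the element."""
    if isinstance(elt, str):
        return elt in acked
    results = [satisfied(e, acked) for e in elt.elements]
    if elt.ordered:
        # k[...]: the first k elements must all have acknowledged.
        return len(results) >= elt.k and all(results[:elt.k])
    # k(...): any k elements are enough.
    return sum(results) >= elt.k
```

With the two-data-center layout above, satisfied(parse("2(B,(C,D))"), {"B", "C"}) holds because B and one node of the (C,D) group have acknowledged, while an acknowledgement from B alone is not enough.
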
Using only elements separated by commas for the upper group means that we wait for the first element in the set (for backward compatibility), hence:
1(elt1,elt2,eltN) <=> elt1,elt2,eltN

We could as well mix each behavior, i.e. be able to define for a group to wait for the first k elements and a total of j elements in the whole set, but I don't think that we need to go that far. I suspect that in most cases users will be satisfied with only cases where there is a group of data centers, and they want to be sure that one or two nodes in each center have acknowledged a commit to master (performance is not the concern here if centers are not close). Hence in the case above, you could get the behavior wanted with this definition:
2(B,(C,D))
With more data centers, like 3 (wait for two nodes in the 3rd set):
3(B,(C,D),2(E,F,G))
Users could define more levels of group, like that:
2(A,(B,(C,D)))
But that's actually something few people would do in real cases.

> I'm nervous of "it would be good ifs" because we do a ton of work only to
> find a design flaw.

That makes sense. Let's continue arguing on it then.
--
Michael