Thread: terms for database replication: synchronous vs eager
Hi,

I'm asking for advice and hints regarding terms in database replication, especially WRT Postgres-R. (Sorry for crossposting, but I fear not reaching enough people on the Postgres-R ML alone.)

I'm struggling with how to classify the Postgres-R algorithm. Up until recently, most people thought of it as synchronous replication, but it's not synchronous in the strong (and very common) sense. I.e. after a node confirms that it has committed a transaction, other nodes have not necessarily committed already. (They only promise that they *will* commit without conflicts.)

This violates the common understanding of synchrony, because you can't commit on a node A and then query another node B and expect it to be coherent immediately.

Nonetheless, Postgres-R is eager (or pessimistic?) in the sense that it replicates *before* committing, so as to avoid divergence. In [1] I've tried to make that distinction clear, and I'm currently advocating for using synchronous only in the very strong (and commonly used) sense. I've chosen the word 'eager' to mean 'replicates before committing'.

According to these definitions, Postgres-R is async but eager.

Do these definitions violate any common meaning? Maybe in other areas like distributed storage or lock managers?

Regards

Markus

[1]: Terms and Definitions of Database Replication
http://www.postgres-r.org/documentation/terms
On 9/7/2007 11:01 AM, Markus Schiltknecht wrote:
> Hi,
>
> I'm asking for advice and hints regarding terms in database replication, especially WRT Postgres-R. (Sorry for crossposting, but I fear not reaching enough people on the Postgres-R ML alone)
>
> I'm struggling on how to classify the Postgres-R algorithm. Up until recently, most people thought of it as synchronous replication, but it's not synchronous in the strong (and very common) sense. I.e. after a node confirms to have committed a transaction, other nodes didn't necessarily commit already. (They only promise that they *will* commit without conflicts).
>
> This violates the common understanding of synchrony, because you can't commit on a node A and then query another node B and expect it to be coherent immediately.

That's right. And there is no guarantee about the lag at all. So you can find "old" data on node B long after you committed a change to node A.

> Nonetheless, Postgres-R is eager (or pessimistic?) in the sense that it replicates *before* committing, so as to avoid divergence. In [1] I've tried to make that distinction clear, and I'm currently advocating for using synchronous only in the very strong (and commonly used) sense. I've chosen the word 'eager' to mean 'replicates before committing'.
>
> According to these definitions, Postgres-R is async but eager.

Postgres-R is an asynchronous replication system by all means. It only makes sure that the workset data (that's what Postgres-R calls the replication log for one transaction) has been received by a group communication system supporting total order and that the group communication system decided it to be the transaction that (logically) happened before any possibly conflicting concurrent transaction.

This is the wonderful idea how Postgres-R will have a failsafe conflict resolution mechanism in an asynchronous system.

I don't know what you associate with the word "eager". All I see is that Postgres-R makes sure that some other process, which might still reside on the same hardware as the DB, is now in charge of delivery. Nobody said that the GC implementation cannot have made the decision about the total order of two workset messages and already reported that to the local client application before those messages ever got transmitted over the wire.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
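To make the ordering argument above concrete, here is a minimal Python sketch of how a total order can yield deterministic conflict resolution. It is not Postgres-R code. The `Writeset` fields, the `certify` function, and the "abort if a concurrent, earlier-ordered writeset touched the same key" rule are assumptions chosen for illustration; the real system's conflict handling may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Writeset:
    txn: str          # hypothetical transaction identifier
    snapshot: int     # position in the total order the txn had seen when it started
    keys: frozenset   # rows written by the transaction


def certify(ordered):
    """Decide commit/abort for writesets given in the agreed total order.

    A writeset aborts if a committed writeset ordered after its snapshot
    (that is, a concurrent and earlier-ordered one) wrote an overlapping key.
    """
    committed = []    # (position, keys) of committed writesets
    outcome = {}
    for position, ws in enumerate(ordered, start=1):
        conflict = any(pos > ws.snapshot and keys & ws.keys
                       for pos, keys in committed)
        outcome[ws.txn] = "abort" if conflict else "commit"
        if not conflict:
            committed.append((position, ws.keys))
    return outcome


stream = [
    Writeset("T1", snapshot=0, keys=frozenset({"a"})),
    Writeset("T2", snapshot=0, keys=frozenset({"a", "b"})),  # concurrent with T1
    Writeset("T3", snapshot=1, keys=frozenset({"a"})),       # started after T1
]
print(certify(stream))
```

Because the decision depends only on the ordered stream, every node that processes the same stream reaches the same outcome without any further coordination. That is the sense in which conflict resolution can be failsafe even though nodes apply the changes at different times.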
Hello Jan,

thank you for your feedback.

Jan Wieck wrote:
> On 9/7/2007 11:01 AM, Markus Schiltknecht wrote:
>> This violates the common understanding of synchrony, because you can't commit on a node A and then query another node B and expect it to be coherent immediately.
>
> That's right. And there is no guarantee about the lag at all. So you can find "old" data on node B long after you committed a change to node A.

I'm in doubt about the "long after". In practice you'll mostly have nodes which perform about equally fast. And as the origin node has to do more processing than a node which solely replays a transaction, it's trivial to balance the load. Additionally, a node which lags behind is unable to commit any (conflicting) local transactions before having caught up (due to the GCS total ordering). So this is even somewhat self-regulating.

> Postgres-R is an asynchronous replication system by all means. It only makes sure that the workset data (that's what Postgres-R calls the replication log for one transaction)

It's most often referred to as the "writeset".

> has been received by a group communication system supporting total order and that the group communication system decided it to be the transaction that (logically) happened before any possibly conflicting concurrent transaction.

Correct. That's as far as the Postgres-R algorithm goes. I should have been more precise about what I'm talking about, as I'm continuing to develop Postgres-R (the software). That might be another area where a new name should be introduced, to differentiate between Postgres-R the original algorithm and my continuing work on the software implementing the algorithm.

> This is the wonderful idea how Postgres-R will have a failsafe conflict resolution mechanism in an asynchronous system.
>
> I don't know what you associate with the word "eager".

I'm speaking of the property that a transaction is replicated before commit, so as to avoid later conflicts.
IMO, this is the only real requirement people have when requesting synchronous replication: most people don't need synchrony, but they need reliable commit guarantees.

I've noticed that you are simply speaking of a "failsafe conflict resolution mechanism". I dislike that description, because it does not say anything about *when* the conflict resolution happens WRT commit. And there may well be lazy failsafe conflict resolution mechanisms (e.g. for a counter) which reconcile after commit.

I'd like to have a simple term, so that we could say: you probably don't need fully synchronous replication, but eager replication may already serve you well.

> All I see is that Postgres-R makes sure that some other process, which might still reside on the same hardware as the DB, is now in charge of delivery.

...and Postgres-R waits until that other process confirms the delivery, whatever exactly that means. See below.

This delay before commit is important. It is what makes Postgres-R eager, according to my definition of it. I'm open to better terms.

> Nobody said that the GC implementation cannot have made the decision about the total order of two workset messages and already reported that to the local client application before those messages ever got transmitted over the wire.

While this is certainly true in theory, it does not make sense in practice. It would mean letting the GCS decide on a message ordering without having delivered the messages to be ordered. That would be troublesome for the GCS, because it could lose an already ordered message. Most GCSs start their ordering algorithm by sending out the message to be ordered.

Anyway, as I've described on -hackers before, I'm intending to decouple replication from log writing, thus not requiring the GCS to provide any delivery guarantees at all (GCSs are complicated enough already!). That would allow the user to decouple transaction processing nodes from log writing nodes.
Those tasks have different I/O requirements anyway. And what would more than two or three replicas of the transaction logs be good for anyway? Think of them as an efficient backup - you won't need it until your complete cluster goes down.

Regards

Markus
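The "eager but not synchronous" behaviour debated in this message can be shown with a small Python sketch. Everything here is hypothetical and heavily simplified: `Node`, `eager_commit`, and the in-memory queue merely stand in for a database node and for total-order delivery by a GCS; no real Postgres-R or GCS interface is implied.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = deque()   # ordered writesets received but not yet applied

    def deliver(self, writeset):
        self.pending.append(writeset)

    def apply_pending(self):
        while self.pending:
            self.data.update(self.pending.popleft())

    def read(self, key):
        return self.data.get(key)


def eager_commit(origin, others, writeset):
    """Confirm the commit once the writeset has been handed to every node.

    Remote nodes have only queued the writeset at this point; they apply
    it later, which is why the scheme is not synchronous.
    """
    for node in others:
        node.deliver(writeset)      # stand-in for GCS total-order delivery
    origin.data.update(writeset)    # origin commits locally
    return "committed"


a, b = Node("A"), Node("B")
status = eager_commit(a, [b], {"x": 1})
stale = b.read("x")      # B has not applied the writeset yet
b.apply_pending()
fresh = b.read("x")      # B has caught up
print(status, stale, fresh)
```

The read on B between the commit confirmation and `apply_pending()` is the window in which a client can still see old data, even though the outcome of the transaction is already fixed on every node.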
JanWieck@Yahoo.com (Jan Wieck) writes:
> On 9/7/2007 11:01 AM, Markus Schiltknecht wrote:
>> Nonetheless, Postgres-R is eager (or pessimistic?) in the sense that it replicates *before* committing, so as to avoid divergence. In [1] I've tried to make that distinction clear, and I'm currently advocating for using synchronous only in the very strong (and commonly used) sense. I've chosen the word 'eager' to mean 'replicates before committing'.
>>
>> According to these definitions, Postgres-R is async but eager.
>
> Postgres-R is an asynchronous replication system by all means. It only makes sure that the workset data (that's what Postgres-R calls the replication log for one transaction) has been received by a group communication system supporting total order and that the group communication system decided it to be the transaction that (logically) happened before any possibly conflicting concurrent transaction.
>
> This is the wonderful idea how Postgres-R will have a failsafe conflict resolution mechanism in an asynchronous system.
>
> I don't know what you associate with the word "eager". All I see is that Postgres-R makes sure that some other process, which might still reside on the same hardware as the DB, is now in charge of delivery. Nobody said that the GC implementation cannot have made the decision about the total order of two workset messages and already reported that to the local client application before those messages ever got transmitted over the wire.

The approach that was going to be taken in Slony-II, to apply locks as early as possible so as to find conflicts as soon as possible, rather than waiting, seems "eager" to me.

But I'm not sure to what extent that notion has been drawn into the Postgres-R work...

--
select 'cbbrowne' || '@' || 'acm.org';
http://www3.sympatico.ca/cbbrowne/slony.html
Rules of the Evil Overlord #37. "If my trusted lieutenant tells me my Legions of Terror are losing a battle, I will believe him. After all, he's my trusted lieutenant." <http://www.eviloverlord.com/>
Hi,

Chris Browne wrote:
> The approach that was going to be taken, in Slony-II, to apply locks as early as possible so as to find conflicts as soon as possible, rather than waiting, seems "eager" to me.

Agreed. WRT locking, one might also call it "pessimistic", but that sounds so... negative.

I find the "as soon as possible" bit rather weak; instead it's exactly "before the origin node confirms commit". Of course, only conflicts which could possibly lead to an abort of the transaction in question are taken into account.

A possible definition may be:

"Eager replication systems confirm the commit of a transaction only after they have checked for cross-node conflicts which could require the transaction to abort. (Lazy systems may confirm the commit before that check.)"

Note how much less restrictive that definition is than that of a fully synchronous system.

> But I'm not sure to what extent that notion has been drawn into the Postgres-R work...

My current variant of Postgres-R goes the very same path, using MVCC instead of locking wherever possible (with the very same effect, but allowing more concurrency :-) ).

Regards

Markus
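As a rough illustration of how the terms in this thread relate, the following Python sketch classifies a single transaction's timeline by what happened before the commit was confirmed. The event names and the `classify` function are invented for this example, and treating "applied on all nodes before confirmation" as the test for synchrony is one reading of the definitions discussed above, not an established standard.

```python
def classify(events):
    """Classify one transaction's event timeline using the thread's terms.

    events: list of event names in the order they happened in the cluster.
    Only what precedes "confirm_commit" matters for the classification.
    """
    confirm = events.index("confirm_commit")
    before = set(events[:confirm])
    if "applied_on_all_nodes" in before:
        return "synchronous"
    if "cross_node_conflict_check" in before:
        return "eager"
    return "lazy"


timelines = {
    "sync":  ["cross_node_conflict_check", "applied_on_all_nodes", "confirm_commit"],
    "eager": ["cross_node_conflict_check", "confirm_commit", "applied_on_all_nodes"],
    "lazy":  ["confirm_commit", "cross_node_conflict_check", "applied_on_all_nodes"],
}
for name, events in timelines.items():
    print(name, "->", classify(events))
```

Under this reading every synchronous system is also eager, while an eager system such as the one described here only guarantees that the conflict check precedes the confirmation.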