Re: terms for database replication: synchronous vs eager - Mailing list pgsql-hackers

From Markus Schiltknecht
Subject Re: terms for database replication: synchronous vs eager
Date
Msg-id 46EA568F.6080901@bluegap.ch
In response to Re: terms for database replication: synchronous vs eager  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
Hello Jan,

thank you for your feedback.

Jan Wieck wrote:
> On 9/7/2007 11:01 AM, Markus Schiltknecht wrote:
>> This violates the common understanding of synchrony, because you can't 
>> commit on a node A and then query another node B and expect it to be 
>> coherent immediately.
> 
> That's right. And there is no guarantee about the lag at all. So you can 
> find "old" data on node B long after you committed a change to node A.

I have my doubts about the "long after". In practice you'll mostly have 
nodes which perform about equally fast. And since the origin node has to 
do more processing than a node which solely replays a transaction, it's 
trivial to balance the load.

Additionally, a node which lags behind is unable to commit any 
(conflicting) local transactions before having caught up (due to the GCS 
total ordering). So this is even somewhat self-regulating.
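
A minimal sketch of that self-regulating effect, under stated assumptions: 
the `Node` class, its methods and the sequence numbers are hypothetical 
and are not taken from the Postgres-R code. It only illustrates that a 
node cannot decide on a local transaction until it has applied every 
writeset ordered before its own.

```python
class Node:
    """Toy replica; all names here are hypothetical, not Postgres-R code."""

    def __init__(self):
        self.applied = 0        # position of the last writeset applied, in total order
        self.last_writer = {}   # key -> position of the last applied writeset touching it

    def apply(self, seq, keys):
        """Apply the remote writeset ordered at position `seq`."""
        if seq != self.applied + 1:
            raise ValueError("writesets must be applied in total order")
        for key in keys:
            self.last_writer[key] = seq
        self.applied = seq

    def try_commit(self, snapshot_seq, keys, own_seq):
        """Decide a local transaction whose writeset was ordered at `own_seq`.

        `snapshot_seq` is the position the transaction's snapshot reflects.
        Returns None while the node still lags, False on conflict, True otherwise.
        """
        if self.applied < own_seq - 1:
            return None  # earlier writesets are still missing: no decision yet
        return all(self.last_writer.get(k, 0) <= snapshot_seq for k in keys)
```

A lagging node gets `None` from `try_commit` until it has caught up, so 
it cannot run ahead of the rest of the group with conflicting commits.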

> Postgres-R is an asynchronous replication system by all means. It only 
> makes sure that the workset data (that's what Postgres-R calls the 
> replication log for one transaction)

It's most often referred to as the "writeset".

> has been received by a group 
> communication system supporting total order and that the group 
> communication system decided it to be the transaction that (logically) 
> happened before any possibly conflicting concurrent transaction.

Correct. That's as far as the Postgres-R algorithm goes.

I should have been more precise on what I'm talking about, as I'm 
continuing to develop Postgres-R (the software). That might be another 
area where a new name should be introduced to differentiate between 
Postgres-R, the original algorithm, and my continuing work on the 
software that implements the algorithm.

> This is the wonderful idea how Postgres-R will have a failsafe conflict 
> resolution mechanism in an asynchronous system.
> 
> I don't know what you associate with the word "eager".

I'm speaking of the property that a transaction is replicated before 
commit, so as to avoid later conflicts. IMO, this is the only real 
requirement people have when requesting synchronous replication: most 
people don't need synchrony, but they need reliable commit guarantees.

I've noticed that you are simply speaking of a "failsafe conflict 
resolution mechanism". I dislike that description, because it does not 
say anything about *when* the conflict resolution happens WRT commit. 
And there may well be lazy failsafe conflict resolution mechanisms 
(e.g. for a counter), which reconcile after commit.
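
To illustrate what lazy reconciliation of a counter could look like, here 
is a minimal sketch using a grow-only counter with one slot per node. 
This is a well-known technique that I'm swapping in for illustration; it 
is not part of Postgres-R, and the function names are hypothetical.

```python
def increment(state, node, amount=1):
    """Commit a local increment without contacting any other node."""
    new_state = dict(state)
    new_state[node] = new_state.get(node, 0) + amount
    return new_state


def merge(a, b):
    """Reconcile two replicas after commit by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def value(state):
    """The counter value is the sum of all per-node slots."""
    return sum(state.values())
```

Each node commits immediately and the replicas converge whenever they 
exchange state, in whatever order. The conflict resolution cannot fail, 
yet it happens strictly after commit.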

I'd like to have a simple term, so that we could say: you probably don't 
need fully synchronous replication, but eager replication may already 
serve you well.

> All I see is that 
> Postgres-R makes sure that some other process, which might still reside 
> on the same hardware as the DB, is now in charge of delivery. 

...and Postgres-R waits until that other process confirms the delivery, 
whatever exactly that means. See below.

This delay before commit is important. It is what makes Postgres-R 
eager, according to my definition of it. I'm open to better terms.
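
The difference in ordering can be sketched as follows. This is a toy 
model under stated assumptions: `ToyGCS`, `eager_commit` and 
`lazy_commit` are hypothetical names, and the "confirmation" is just a 
returned sequence number, whatever a real GCS would mean by it.

```python
import itertools


class ToyGCS:
    """Hypothetical stand-in for a group communication system."""

    def __init__(self):
        self._positions = itertools.count(1)
        self.log = []  # (position, writeset) pairs in total order

    def broadcast(self, writeset):
        """Order the writeset and return its position as the confirmation."""
        position = next(self._positions)
        self.log.append((position, writeset))
        return position


def eager_commit(gcs, writeset, events):
    """Eager: hand the writeset to the GCS and wait before committing."""
    events.append("send")
    position = gcs.broadcast(writeset)  # returns only once confirmed
    events.append("confirmed")
    events.append("commit")
    return position


def lazy_commit(gcs, writeset, events):
    """Lazy: commit locally first, replicate afterwards."""
    events.append("commit")
    events.append("send")
    position = gcs.broadcast(writeset)
    events.append("confirmed")
    return position
```

In the eager variant the commit is the last event, so the client never 
sees a commit that has not been confirmed by the GCS.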

> Nobody 
> said that the GC implementation cannot have made the decision about the 
> total order of two workset messages and already reported that to the 
> local client application before those messages ever got transmitted over 
> the wire.

While this is certainly true in theory, it does not make sense in 
practice. It would mean letting the GCS decide on a message ordering 
without having delivered the messages to be ordered. That would be 
troublesome for the GCS, because it could lose an already ordered 
message. Most GCSs start their ordering algorithm by sending out the 
message to be ordered.

Anyway, as I've described on -hackers before, I intend to decouple 
replication from log writing, so that the GCS is not required to provide 
any delivery guarantees at all (GCSs are complicated enough already!). 
That would allow the user to decouple transaction processing nodes from 
log writing nodes. Those tasks have different I/O requirements anyway. 
And what would more than two or three replicas of the transaction logs 
be good for anyway? Think of them as an efficient backup - you won't 
need them until your complete cluster goes down.

Regards

Markus


