Re: Proposal: Commit timestamp - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Proposal: Commit timestamp
Msg-id 45C56976.7060508@Yahoo.com
In response to Re: Proposal: Commit timestamp  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Proposal: Commit timestamp  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On 2/3/2007 5:20 PM, Bruce Momjian wrote:
> Jan Wieck wrote:
>> I don't have any such paper and the proof of concept will be the 
>> implementation of the system. I do however see enough resistance against 
>> this proposal to withdraw the commit timestamp at this time. The new 
>> replication system will therefore require the installation of a patched, 
>> non-standard PostgreSQL version, compiled from sources cluster wide in 
>> order to be used. I am aware that this will dramatically reduce it's 
>> popularity but it is impossible to develop this essential feature as an 
>> external module.
>> 
>> I thank everyone for their attention.
> 
> Going and working on it on your own doesn't seem like the proper
> solution.  I don't see people objecting to adding it, but they want it
> work, which I am sure you want too.  You have to show how it will work
> and convince others of that, and then you have a higher chance it will
> work, and be in the PostgreSQL codebase.

Bruce,

I think I have explained in sufficient detail how this Lamport
timestamp will be unique and ever increasing, with the node's ID being
used as a tie breaker. The only thing important for "last update wins"
conflict resolution is that whatever timestamp you have associated with
a row, the update you do to it must be associated with a later timestamp
so that all other nodes will overwrite the data. If a third node gets
the two updates out of order, it will apply the second node's update,
and since the row it then has carries a later timestamp than the first
update arriving late, it will throw away that late update. All nodes
are in sync again.
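
To make the rule above concrete, here is a minimal sketch, not code from
any existing system. It assumes a timestamp is a (logical_time, node_id)
pair compared lexicographically, so the node ID only matters as a tie
breaker. The names Row, apply_update and store are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Row:
    value: str
    stamp: tuple  # (logical_time, node_id); hypothetical representation


def apply_update(store, key, value, stamp):
    """Apply a replicated update only if its stamp is later than the row's.

    Tuples compare element by element, so node_id breaks ties between
    equal logical times. Returns True if the update was applied.
    """
    current = store.get(key)
    if current is None or stamp > current.stamp:
        store[key] = Row(value, stamp)
        return True
    # The incoming update is older than what we already have: discard it.
    return False


# A third node receives the two updates out of order.
store = {}
apply_update(store, "k", "second", (101, 2))  # later update arrives first
apply_update(store, "k", "first", (100, 1))   # earlier update arrives late
```

After both calls the row holds the second node's value, the same state a
node that received the updates in order would end up with.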

This is all that is needed for last update wins resolution. And as said
before, the only reason the clock is involved in this is so that nodes
can continue autonomously when they lose connection without conflict
resolution going crazy later on, which it would do if the timestamps
were simple counters. It doesn't require microsecond synchronized clocks,
and the raw system clock value is not simply used as the Lamport
timestamp.
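
One common way to get a timestamp that follows the system clock but
still behaves like a Lamport timestamp is sketched below. This is an
assumption for illustration only, not necessarily the exact mechanism of
the proposal: the class CommitClock, its methods, and the choice of
microsecond units are all hypothetical.

```python
import time


class CommitClock:
    """Clock-driven Lamport timestamp generator (illustrative sketch)."""

    def __init__(self, node_id, now=time.time):
        self.node_id = node_id
        self.now = now   # injectable clock source, seconds as float
        self.last = 0    # last logical time issued or observed

    def next_stamp(self):
        # Follow the wall clock, but never repeat or go backwards,
        # even if the system clock stalls or is set back.
        wall = int(self.now() * 1_000_000)
        self.last = max(wall, self.last + 1)
        return (self.last, self.node_id)

    def observe(self, remote_stamp):
        # Lamport rule: after seeing a remote timestamp, every local
        # timestamp issued from now on must be later than it.
        self.last = max(self.last, remote_stamp[0])


clock = CommitClock(node_id=1)
first = clock.next_stamp()
clock.observe((first[0] + 5_000_000, 2))  # remote node's clock runs ahead
second = clock.next_stamp()               # still later than the remote stamp
```

Because the value tracks wall time, a node that was disconnected for a
while issues timestamps comparable to those of its peers, which a plain
per-node counter would not give you.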

The problem, it seems to me, is that people want a full scale proof of
concept for the whole multimaster replication system I'm planning
instead of thinking in isolation about this one aspect, the intended use
case and other possible uses for it (like table logging). And we all
know that that discussion will take us way past the 8.3 feature freeze
date, so the whole thing will never get done.

I don't want to work on this on my own and I sure would prefer it to be 
a default PostgreSQL feature. As said, I have learned some things from 
Slony-I. One of them is that I will not go through any more ugly 
workarounds in order to not require a patched backend. If the features I 
really need aren't going to be in the default codebase, people will have 
to install from patched sources.

Finally, again, Slony-I could well have used this feature. With a
logical commit timestamp, I would have never even thought about that
other wart called xxid. It would have all been sooo much easier.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

