Re: Proposal: Commit timestamp - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: Proposal: Commit timestamp |
Date | |
Msg-id | 45CCD6AA.2090409@Yahoo.com |
In response to | Re: Proposal: Commit timestamp ("Andrew Hammond" <andrew.george.hammond@gmail.com>) |
List | pgsql-hackers |
On 2/9/2007 2:19 PM, Andrew Hammond wrote:
> On Feb 7, 8:12 pm, b...@momjian.us (Bruce Momjian) wrote:
>> Jan Wieck wrote:
>> > On 2/7/2007 10:35 PM, Bruce Momjian wrote:
>> > > I find the term "logical proof of it's correctness" too restrictive. It
>> > > sounds like some formal academic process that really doesn't work well
>> > > for us.
>>
>> > Thank you.
>
> My intuition is that it might be possible to prove that _nothing_ can
> provide guaranteed ordering when there is disconnected operation.

As a matter of physics, for two events happening outside of the event
horizon of each other, the question of which happened first is
pointless.

> However, I think that the clock based ordering Jan has described could
> provide _probable_ ordering under disconnected operation. I can see
> three variables in the equation that would determine the probability
> of correctness for the ordering.

That is precisely the intended functionality. And I can describe exactly
when two conflicting actions will result in the "wrong" row persisting.
This happens when the second update to the logically same row is
performed on a server whose Lamport timestamp lags behind by more than
the time between the two conflicting commits. Example: a user fills out
a form, submits, hits the back button, corrects the input and submits
again within 3 seconds. Load balancing sends the two requests to
different servers, and the first server is 3.0001 seconds ahead ... the
user's typo will be the winner.

My Lamport timestamp conflict resolution will not be able to solve this
problem. However, when this happens, one thing is guaranteed: the update
from the second server, arriving on the first for replication, will be
ignored because a locally generated row is newer. This fact can be used
as an indicator that there is a possible conflict that was resolved
using the wrong data (business process wise). All nodes in the cluster
will end up using the same wrong row, so at least they are consistently
wrong.
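The failure case above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the proposed implementation: the `Node` and `Row` classes, the `clock_skew` parameter, the one-microsecond tick, the tie-break on node name and the key `"form:42"` are all hypothetical names and choices made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    value: str
    ts: float    # commit timestamp assigned by the originating node
    origin: str  # name of the node that committed this version


@dataclass
class Node:
    name: str
    clock_skew: float  # seconds this node's clock runs ahead (assumed)
    last_ts: float = 0.0  # highest timestamp issued or seen so far
    rows: dict = field(default_factory=dict)
    suspects: list = field(default_factory=list)

    def local_commit(self, key, value, true_time):
        # Lamport-style rule: never issue a timestamp at or below one
        # this node has already issued or seen (tick size is assumed).
        ts = max(true_time + self.clock_skew, self.last_ts + 1e-6)
        self.last_ts = ts
        self.rows[key] = Row(value, ts, self.name)
        return self.rows[key]

    def apply_remote(self, key, row):
        # Advance the local Lamport clock past the replicated timestamp.
        self.last_ts = max(self.last_ts, row.ts)
        current = self.rows.get(key)
        if current is None or (row.ts, row.origin) > (current.ts, current.origin):
            self.rows[key] = row  # the newer timestamp wins
            return True
        if current.origin == self.name:
            # Remote update ignored because a locally generated row is
            # newer: record it as a possible wrongly resolved conflict.
            self.suspects.append((key, row))
        return False


# Server A's clock is 3.0001 seconds ahead of server B's.
a = Node("A", clock_skew=3.0001)
b = Node("B", clock_skew=0.0)

typo = a.local_commit("form:42", "typo", true_time=0.0)      # first submit
fix = b.local_commit("form:42", "corrected", true_time=3.0)  # 3 s later

# Replication in both directions.
a.apply_remote("form:42", fix)   # ignored on A, flagged as suspect
b.apply_remote("form:42", typo)  # applied on B, overwrites the correction

print(a.rows["form:42"].value, b.rows["form:42"].value, len(a.suspects))
```

In this sketch both nodes converge on the typo, and only the node that ignored the incoming update holds an entry in `suspects`, which is the hook for further action.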
Nevertheless, being able to identify possible problem cases this way
makes it possible to initiate further action, including but not limited
to human intervention. If this is not an acceptable risk for the
application, other resolution methods will be needed. But I think in
many cases, this form of default resolution will be "good enough".


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #