Re: Proposal: Commit timestamp - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Proposal: Commit timestamp
Date
Msg-id 45C91D57.3020403@Yahoo.com
In response to Re: Proposal: Commit timestamp  (Markus Schiltknecht <markus@bluegap.ch>)
Responses Re: Proposal: Commit timestamp  (Markus Schiltknecht <markus@bluegap.ch>)
List pgsql-hackers
On 2/6/2007 11:44 AM, Markus Schiltknecht wrote:
> Hi,
> 
> Zeugswetter Andreas ADI SD wrote:
>> And "time based"
>> is surely one of the important conflict resolution methods for async MM
>> replication.
> 
> That's what I'm questioning. Wouldn't any other deterministic, but 
> seemingly random abort decision be as clever as time based conflict 
> resolution? It would then be clear to the user that it's random and not 
> some "in most cases time based, but not in others and only if..." thing.
> 
>> Sure there are others, like "rule based" "priority based" but I think
>> you don't need additional backend functionality for those.
> 
> Got the point, yes. I'm impatient, sorry.
> 
> Nevertheless, I'm questioning whether it is worth adding backend 
> functionality for that. And given this probably is the most wanted 
> resolution method, this question might be "heretical". You could also 
> see it as a sort of user-educating question: don't favor time based 
> resolution if that's the one resolution method with the most traps.

These are all very good suggestions for additional conflict 
resolution mechanisms, each of which solves one problem or another. As we 
have said for years now, one size will not fit all. What I am after for 
the moment is a system that by default supports last-update-wins at the 
row level, where "last update" is admittedly a little fuzzy, but not by 
minutes. Plus balance type columns. A balance column is not propagated 
as a new value, but as a delta between the old and the new value. All 
replicas will apply the delta to that column regardless of whether the 
replication info is newer or older than the existing row. That way, 
literal value type columns (like an address) will maintain, cluster-wide, 
the value of the last update to the row, while balance type columns will 
maintain, cluster-wide, the sum of all changes.
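
As a rough illustration of the two column types described above, here is 
a minimal Python sketch. It is not the planned implementation. The names 
Row, Change and apply_change are hypothetical, commit timestamps are 
plain floats, and timestamp ties are simply ignored (a real system would 
need a deterministic tie-breaker).

```python
from dataclasses import dataclass


@dataclass
class Row:
    values: dict        # literal value type columns, e.g. an address
    balances: dict      # balance type columns
    updated_at: float   # commit timestamp of the last update that won


@dataclass
class Change:
    commit_ts: float       # commit timestamp on the originating node
    new_values: dict       # new literal values carried by the change
    balance_deltas: dict   # new minus old, per balance column


def apply_change(row: Row, change: Change) -> None:
    """Apply one replicated change to a local row (illustrative only)."""
    # Balance columns: always apply the delta, whether the incoming
    # change is newer or older than the existing row.
    for col, delta in change.balance_deltas.items():
        row.balances[col] = row.balances.get(col, 0) + delta

    # Literal columns: last update wins, decided by commit timestamp.
    # Assumption: a strictly newer timestamp wins; ties are not handled.
    if change.commit_ts > row.updated_at:
        row.values.update(change.new_values)
        row.updated_at = change.commit_ts
```

Applying the same two changes in opposite orders on two replicas leaves 
both with the address from the later commit and the same balance, since 
the deltas commute.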

Whatever strategy one uses, in an async multi-master there are always 
cases that can be resolved by rules (last update being one of them), and 
some that I can't even imagine solving so far. I guess some of the cases 
will simply boil down to "the application has to make sure that ... 
never occurs". Think of a multi-item order, created on one node, while 
another node is deleting the long-unused item (which would have to be 
backordered). Now while those two nodes figure out what to do to make 
this consistent again, a third node does a partial shipment of that 
order. The solution is simple: reinsert the deleted item ... except that 
there were rather nasty ON DELETE CASCADEs on that item that removed 
all the consumer reviews, product descriptions, data sheets and what 
not. It's going to be an awful lot of undo.

I haven't really made up my mind about a user-defined, rule-based 
conflict resolution interface yet. I do plan to have a synchronous 
advisory locking system, based on unique and foreign key constraints, on 
top of my system in a later version (advisory key locks would stay in 
place until the transaction that placed them replicates).

I guess you see by now why I wanted to keep the discussion about the 
individual, rather generic support features in the backend separate from 
the particular features I plan to implement in the replication system. 
Everyone has different needs and consequently an async multi-master 
"must" do a whole range of mutually exclusive things altogether ... 
because Postgres can never accept a partial solution. We want the egg 
laying milk-wool-pig or nothing.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

