Re: Proposal: Commit timestamp - Mailing list pgsql-hackers

From Markus Schiltknecht
Subject Re: Proposal: Commit timestamp
Date
Msg-id 45C981D7.7090908@bluegap.ch
Whole thread Raw
In response to Re: Proposal: Commit timestamp  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: Proposal: Commit timestamp  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
Hi,

Jan Wieck wrote:
> Whatever strategy one will use, in an async multimaster there are always 
> cases that can be resolved by rules (last update being one of them), and 
> some that I can't even imagine solving so far. I guess some of the cases 
> will simply boil down to "the application has to make sure that ... 
> never occurs". Think of a multi-item order, created on one node, while 
> another node is deleting the long unused item (which would have to be 
> backordered). Now while those two nodes figure out what to do to make 
> this consistent again, a third node does a partial shipment of that 
> order.

It helps to categorize these conflict types. There basically are:

* data conflicts: simple row data, i.e. update - update conflicts.

* uniqueness conflicts: two rows conflict because they'd violate a 
uniquenes constraint, i.e. insert - insert, update - insert or update - 
update.

* visibility conflicts: basically the remaining update - delete and 
delete - delete cases. But also SELECT FOR UPDATE candidates, etc... 
Everything having to do with a rows not yet or no longer being visible 
to a transaction.

Your example certainly involves a visibility conflict (update - delete). 
Not even (sync) Postgres-R can guarantee consistency on the visibility 
level, i.e.  a first transaction's SELECT FOR UPDATE might not see some 
just recently committed transactions newly inserted rows (because that 
one isn't replayed yet on the node, thus the transaction is working on 
an 'old' snapshot of the database state). Another simpler example: 
Postgres-R doesn't raise a serialization error on delete-delete 
conflicts, it simply deletes the row once, even if two transactions 
confirmed to have committed a transaction which deleted a row.

Luckily, most applications don't need that anyway, though.

> The solution is simple, reinsert the deleted item ... 

..at which point timestamps certainly won't help :-)   Sorry, couldn't 
resist...

> only that 
> there were rather nasty ON DELETE CASCADE's on that item that removed 
> all the consumer reviews, product descriptions, data sheets and what 
> not. It's going to be an awful lot of undo.

Huh? Are you planning on aborting *parts* of a transaction? I didn't 
think about that, but my gut feeling is that you don't want to do that.

> I haven't really made up my mind about a user defined rule based 
> conflict resolution interface yet. I do plan to have a unique and 
> foreign key constraint based, synchronous advisory locking system on top 
> of my system in a later version (advisory key locks would stay in place 
> until the transaction, that placed them, replicates).

You'd have to elaborate on that...

> I guess you see by now why I wanted to keep the discussion about the 
> individual, rather generic support features in the backend separate from 
> the particular features I plan to implement in the replication system.

Sure. I know, discussions about replication can get endless, probably 
even are so by definition ;-)  But hey, they're fun!

> Everyone has different needs and consequently an async multi-master 
> "must" do a whole range of mutually exclusive things altogether ... 
> because Postgres can never accept a partial solution. We want the egg 
> laying milk-wool-pig or nothing.

Like the one which would result from a merge of such an async 
replication with a sync one? Imagine being able to choose between sync 
and async per transaction...

Regards

Markus



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Proposal: Commit timestamp
Next
From: Jeremy Drake
Date:
Subject: Re: [PATCHES] writing new regexp functions