Re: BDR and TX obeyance - Mailing list pgsql-general

From Craig Ringer
Subject Re: BDR and TX obeyance
Msg-id CAMsr+YE7Q4fQ5cpEAacYck+7bCGwY_LRcQG7w-Soeg+0shMrxg@mail.gmail.com
In response to BDR and TX obeyance  (Riley Berton <rberton@appnexus.com>)
Responses Re: BDR and TX obeyance  (Riley Berton <rberton@appnexus.com>)
List pgsql-general
On 5 January 2016 at 04:09, Riley Berton <rberton@appnexus.com> wrote:

> The conflict on the "thingy" table has resulted in node2 winning based
> on last_update wins default resolution.  However, both inserts have
> applied.  My expectation is that the entire TX applies or does not
> apply.  This expectation is clearly wrong.

Correct. Conflicts are resolved row-by-row. Their outcomes are determined (by default) by transaction commit timestamps, but the conflicts themselves are row-by-row.
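
To make the row-by-row behaviour concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not BDR's actual apply code: the names RowChange, Tx, apply_tx and replay are hypothetical, the "audit" table is invented for illustration, and commit timestamps are plain integers. Two transactions collide on one "thingy" row but also insert non-conflicting rows elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowChange:
    table: str
    key: int
    value: str


@dataclass(frozen=True)
class Tx:
    origin: str        # node that committed the transaction
    commit_ts: int     # commit timestamp (a plain integer in this toy model)
    changes: tuple     # RowChange items that make up the transaction


def apply_tx(db, tx):
    """Apply a transaction row by row; each conflict is resolved on its own."""
    for ch in tx.changes:
        slot = (ch.table, ch.key)
        existing = db.get(slot)
        if existing is not None and existing[1] > tx.commit_ts:
            # Conflict: the stored row has a later commit timestamp.
            # Only this row is skipped; the rest of the transaction still applies.
            continue
        db[slot] = (ch.value, tx.commit_ts, tx.origin)


def replay(order):
    """Build a node's state by applying transactions in the given order."""
    db = {}
    for tx in order:
        apply_tx(db, tx)
    return db


tx1 = Tx("node1", 100, (RowChange("thingy", 1, "from node1"),
                        RowChange("audit", 10, "node1 audit")))
tx2 = Tx("node2", 200, (RowChange("thingy", 1, "from node2"),
                        RowChange("audit", 20, "node2 audit")))

node1_state = replay([tx1, tx2])  # node1 sees its own xact first
node2_state = replay([tx2, tx1])  # node2 sees its own xact first
```

In this model both nodes converge on the same data: node2's "thingy" row wins on timestamp, yet node1's "audit" row survives as well, so part of the losing transaction is still visible. That is the outcome described in the original report.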

Because BDR:

* applies changes to other nodes only AFTER commit on the origin node; and
* does not take row and table locks across nodes

it has no way to sensibly apply all or none of a transaction on downstream peers because the client has already committed and moved on to other things. If the xact doesn't apply, what do we do? Log output on the failing node(s) and throw it away?

It's probably practical to have xacts abort on the first conflict, though some thought would be needed about making sure that doesn't break consistency requirements across nodes. It's not clear if doing so is useful though.

For that you IMO want synchronous replication, where the client doesn't get a local COMMIT until all nodes have confirmed they can commit the xact. That's something that could be added to BDR in future, but doing it well requires support for logical decoding of prepared transactions, which is currently missing from PostgreSQL's logical decoding support. If it's something you think is important/useful you might want to explore what's involved in implementing that.
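
As a rough illustration of what "all nodes confirm before the client sees COMMIT" means, here is a minimal two-phase sketch in Python. It is a toy model and an assumption on my part, not a BDR feature or a PostgreSQL API: the Node class and synchronous_commit function are hypothetical, and the method names merely echo PostgreSQL's PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK PREPARED commands.

```python
class Node:
    """Toy replica that can prepare, commit or roll back a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # False simulates a conflict on this node
        self.prepared = {}            # xid -> pending changes
        self.committed = {}           # visible data

    def prepare(self, xid, changes):
        if not self.can_commit:
            return False
        self.prepared[xid] = dict(changes)
        return True

    def commit_prepared(self, xid):
        self.committed.update(self.prepared.pop(xid))

    def rollback_prepared(self, xid):
        self.prepared.pop(xid, None)


def synchronous_commit(nodes, xid, changes):
    """Return True only if every node prepared and then committed the xact."""
    prepared = []
    for node in nodes:
        if node.prepare(xid, changes):
            prepared.append(node)
        else:
            # One node refused: undo the prepare everywhere, nothing becomes visible.
            for p in prepared:
                p.rollback_prepared(xid)
            return False
    for node in nodes:
        node.commit_prepared(xid)
    return True
```

The client would only be told COMMIT when synchronous_commit returns True, so a transaction is applied on every node or on none. The cost is that every commit waits for the slowest node, and the sketch ignores node failure between the two phases, which is where most of the real difficulty lies.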

> Question is: is there a way (via a custom conflict handler) to have the
> TX obeyed?

No.

Even if you ERROR in your handler, BDR will just retry the xact. It has no concept of "throw this transaction away forever".
 
> I can't see a way to even implement a simple bank account
> database that changes multiple tables in a single transaction without
> having the data end up in an inconsistent state.  Am I missing something
> obvious here?

You're trying to use asynchronous multimaster replication as if it were an application-transparent synchronous cluster with a global transaction manager and global lock manager.

BDR is not application-transparent. You need to understand replication conflicts and think about them. It does not preserve full READ COMMITTED semantics across nodes. This comes with big benefits in partition tolerance, performance and latency tolerance, but it means you can't point an existing app at more than one node and expect it to work properly.
 
The documentation tries over and over to emphasise this. Can you suggest where it can be made clearer or more prominent? 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
