Re: eXtensible Transaction Manager API - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: eXtensible Transaction Manager API
Msg-id CAMsr+YEDA2gb380_i6O5SQeRiwqus0CjCLUQDnGj5teQi1DYHg@mail.gmail.com
In response to Re: eXtensible Transaction Manager API  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: eXtensible Transaction Manager API  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
On 13 November 2015 at 21:35, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Tue, Nov 10, 2015 at 3:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sun, Nov 8, 2015 at 6:35 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> Sure. Now imagine that the pg_twophase entry is corrupted for this
>>> transaction on one node. This would trigger a PANIC on it, and
>>> transaction would not be committed everywhere.
>>
>> If the database is corrupted, there's no way to guarantee that
>> anything works as planned.  This is like saying that criticizing
>> somebody's disaster recovery plan on the basis that it will be
>> inadequate if the entire planet earth is destroyed.
>
> As well as there could be FS, OS, network problems... To come back to
> the point, my point is simply that I found surprising the sentence of
> Konstantin upthread saying that if commit fails on some of the nodes
> we should rollback the prepared transaction on all nodes. In the
> example given, in the phase after calling dtm_end_prepare, say we
> perform COMMIT PREPARED correctly on node 1, but then failed it on
> node 2 because a meteor has hit a server, it seems that we cannot
> rollback, instead we had better rolling in a backup and be sure that
> the transaction gets committed. How would you rollback the transaction
> already committed on node 1? But perhaps I missed something...

The usual way this works in an XA-like model is:

In phase 1 (PREPARE TRANSACTION, in Pg's spelling), failure on any node triggers a rollback on all nodes.

In phase 2 (COMMIT PREPARED), failure on any node causes the commit to be retried on that node until it succeeds, or until the admin intervenes - say, to remove that node from operation. The global xact as a whole isn't considered successful until it's committed on all nodes.
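
To make the two phases concrete, here is a minimal sketch of a coordinator loop, written as an in-memory simulation rather than against a real server. Everything in it is an assumption made for illustration: the Node class, its method names, the "in-doubt" return value and the max_attempts knob are hypothetical and are not part of PostgreSQL, XA or any proposed XTM API. It only shows the control flow: any prepare failure rolls back everywhere, while a commit failure after a successful prepare is retried and never turned into a rollback.

```python
import time


class Node:
    """Hypothetical in-memory stand-in for one database node."""

    def __init__(self, name, fail_prepare=False, commit_failures=0):
        self.name = name
        self.fail_prepare = fail_prepare
        # Number of transient COMMIT PREPARED failures before it succeeds.
        self.commit_failures = commit_failures
        self.state = "active"

    def prepare(self, gid):
        # Stands in for PREPARE TRANSACTION 'gid'.
        if self.fail_prepare:
            self.state = "aborted"
            raise RuntimeError(f"{self.name}: prepare failed")
        self.state = "prepared"

    def commit_prepared(self, gid):
        # Stands in for COMMIT PREPARED 'gid'.
        if self.commit_failures > 0:
            self.commit_failures -= 1
            raise RuntimeError(f"{self.name}: commit failed")
        self.state = "committed"

    def rollback(self, gid):
        # Stands in for ROLLBACK PREPARED 'gid' on prepared nodes and a
        # plain ROLLBACK on nodes that never got as far as preparing.
        self.state = "aborted"


def two_phase_commit(nodes, gid, retry_delay=0.0, max_attempts=None):
    """Run the global transaction `gid` across `nodes`.

    Returns "committed", "aborted", or "in-doubt" (retries exhausted,
    standing in for the point where an admin has to intervene).
    """
    # Phase 1: a failure on any node rolls the transaction back on all nodes.
    for node in nodes:
        try:
            node.prepare(gid)
        except RuntimeError:
            for other in nodes:
                other.rollback(gid)
            return "aborted"

    # Phase 2: every node has prepared, so the decision is "commit".
    # A failure here is retried; it never causes a rollback.
    pending = list(nodes)
    attempts = 0
    while pending:
        attempts += 1
        still_pending = []
        for node in pending:
            try:
                node.commit_prepared(gid)
            except RuntimeError:
                still_pending.append(node)
        pending = still_pending
        if pending:
            if max_attempts is not None and attempts >= max_attempts:
                return "in-doubt"
            time.sleep(retry_delay)
    return "committed"
```

For example, two_phase_commit([Node("n1"), Node("n2", commit_failures=2)], "gx1") keeps retrying n2 and ends with both nodes committed, which is the meteor case above: node 1 has already committed, so the only way forward is to get node 2 committed as well. A real coordinator would also need to log its commit decision durably before starting phase 2 so it can resume the retries after its own crash; the sketch leaves that out.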

2PC and distributed commit are well studied, including their problems. We don't have to think this up for ourselves or invent anything here. There's a lot of distributed systems theory to work with - especially when dealing with well-studied relational DBs trying to maintain ACID semantics.

That's not to say there aren't problems with the established ways. The XA API is horrific. Java's JTA follows it too closely, and as for whoever thought that HeuristicMixedException was a good idea... augh.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
