Re: Transactions involving multiple postgres foreign servers - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Transactions involving multiple postgres foreign servers
Date
Msg-id CA+TgmoaW1YhaCDtsMRX9V99=zCfB0kv5hpJOj66q0LEq0a=wVQ@mail.gmail.com
Whole thread Raw
In response to Re: Transactions involving multiple postgres foreign servers  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Transactions involving multiple postgres foreign servers
List pgsql-hackers
Overall, you seem to have made some significant progress on the design
since the last version of this patch.  There's probably a lot left to
do, but the design seems more mature now.  I haven't read the code,
but here are some comments based on the email.

On Thu, Jul 9, 2015 at 6:18 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The patch introduces a GUC atomic_foreign_transaction, which when ON ensures
> atomic commit for foreign transactions, otherwise not. The value of this GUC
> at the time of committing or preparing a local transaction is used. This
> gives applications the flexibility to choose the behaviour as late in the
> transaction as possible. This GUC has no effect if there are no foreign
> servers involved in the transaction.

Hmm.  I'm not crazy about that name, but I don't have a better idea either.

One thing about this design is that it makes atomicity a property of
the transaction rather than the server.  That is, any given
transaction is either atomic with respect to all servers or atomic
with respect to none.  You could also design this the other way: each
server is either configured to do atomic commit, or not.  When a
transaction is committed, it prepares on those servers which are
configured for it, and then commits the others.  So then you can have
a "partially atomic" transaction where, for example, you transfer
money from account A to account B (using one or more FDW connections
that support atomic commit) and also use twitter_fdw to tweet about it
(using an FDW connection that does NOT support atomic commit).  The
tweet will survive even if the local commit fails, but that's OK.  You
could even do this at table granularity: we'll prepare the transaction
if at least one foreign table involved in the transaction has
atomic_commit = true.

In some sense I think this might be a nicer design, because suppose
you connect to a foreign server and mostly just log stuff but
occasionally do important things there.  In your design, you can do
this, but you'll need to make sure atomic_foreign_transaction is set
for the correct set of transactions.  But in what I'm proposing here
we might be able to derive the correct value mostly automatically.

We should consider other possible designs as well; the choices we make
here may have a significant impact on usability.

> Another GUC max_fdw_transactions sets the maximum number of transactions
> that can be simultaneously prepared on all the foreign servers. This limits
> the memory required for remembering the prepared foreign transactions.

How about max_prepared_foreign_transactions?

> Two new FDW hooks are introduced for transaction management.
> 1. GetPrepareId: to get the prepared transaction identifier for a given
> foreign server connection. An FDW which doesn't want to support this feature
> can keep this hook undefined (NULL). When defined the hook should return a
> unique identifier for the transaction prepared on the foreign server. The
> identifier should be unique enough not to conflict with currently prepared
> or future transactions. This point will be clear when discussing phase 2 of
> 2PC.
>
> 2. HandleForeignTransaction: to end a transaction in specified way. The hook
> should be able to prepare/commit/rollback current running transaction on
> given connection or commit/rollback a previously prepared transaction. This
> is described in detail while describing phase two of two-phase commit. The
> function is required to return a boolean status of whether the requested
> operation was successful or not. The function or its minions should not
> raise any error on failure so as not to interfere with the distributed
> transaction processing. This point will be clarified more in the description
> below.

HandleForeignTransaction is not very descriptive, and I think you're
jamming together things that ought to be separated.  Let's have a
PrepareForeignTransaction and a ResolvePreparedForeignTransaction.

> A foreign server, user mapping corresponding to an unresolved foreign
> transaction is not allowed to be dropped or altered until the foreign
> transaction is resolved. This is required to retain the connection
> properties which need to resolve the prepared transaction on the foreign
> server.

I agree with not letting it be dropped, but I think not letting it be
altered is a serious mistake.  Suppose the foreign server dies in a
fire, its replica is promoted, and we need to re-point the master at
the replica's hostname or IP.

> Handling non-atomic foreign transactions
> ===============================
> When atomic_foreign_transaction is disabled, one-phase commit protocol is
> used to commit/rollback the foreign transactions. After the local
> transaction has committed/aborted, all the foreign transactions on the
> registered foreign connections are committed or aborted resp. using hook
> HandleForeignTransaction. Failing to commit a foreign transaction does not
> affect the other foreign transactions; they are still tried to be committed
> (if the local transaction commits).

Is this a change from the current behavior?  What if we call the first
commit handler and it throws an ERROR?  Presumably then nothing else
gets committed, and the transaction overall aborts.

> PITR
> ====
> PITR may rewind the database to a point before an xid associated with an
> unresolved foreign transaction. There are two approaches to deal with the
> situation.
> 1. Just forget about the unresolved foreign transaction and remove the file
> just like we do for a prepared local transaction. But then the prepared
> transaction on the foreign server might be left unresolved forever and will
> keep holding the resources.
> 2. Do not allow PITR to such point. We can not get rid of the transaction id
> without getting rid of prepared foreign transaction. If we do so, we might
> create conflicting files in future and might resolve the transaction with
> wrong outcome.

I don't think either of these is correct.  The database shouldn't
behave differently when PITR is used than when it isn't.  Otherwise
you are not doing what it says on the tin: recovering to the chosen
point in time.  I recommend adding a function that forgets about a
foreign prepared transaction and making it the DBA's job to figure out
whether to call it in a particular scenario.  After all, the remote
machine might have been subjected to PITR, too.  Or maybe not.  We
can't know, so we should give the DBA the tools to clean things up and
leave it at that.

> IIUC LRO, the patch uses the local transaction as last resource, which is
> always present. The fate of foreign transaction is decided by the fate of
> the local transaction, which is not required to be prepared per say. There
> is more relevant note later.

Personally, I think that's perfectly fine.  We could do more later if
we wanted to, but there's plenty to like here without that.

>> Just to be clear: you also need two-phase commit if the transaction
>> updated anything in the local server and in even one foreign server.
>
> Any local transaction involving a foreign sever transaction uses two-phase
> commit for the foreign transaction. The local transaction is not prepared
> per say. However, we should be able to optimize a case, when there are no
> local changes. I am not able to find a way to deduce that there was no local
> change, so I have left that case in this patch. Is there a way to know
> whether a local transaction changed something locally or not?

You might check whether it wrote any WAL.  There's a global variable
for that somewhere; RecordTransactionCommit() uses it.  But I don't
think this is an essential optimization for v1, either.

> I have used approach similar to pg_twophase, but implemented it as a
> separate code, as the requirements differ. But, I would like to minimize
> code by unifying both, if we finalise this design. Suggestions in this
> regard will be very helpful.

-1 for trying to unify those unless it's really clear that it's a good
idea.  I bet it's not.

>> Or you could insert/update the rows in the catalog with xmin=FrozenXid,
>> ignoring MVCC. Not sure how well that would work.
>
> I am not aware how to do that. Do we have any precedence in the code.

No.  I bet that's also a bad idea.  A non-transactional table is a
good idea that has been proposed before, but let's not try to invent
it in this patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [PATCH] Generalized JSON output functions
Next
From: Robert Haas
Date:
Subject: Re: Retrieve the snapshot's LSN