Thread: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Justin Bertram
Date:
I searched the mailing list, but I couldn't find anything addressing this issue.

I work with JBoss Transactions (JBossTS) and I noticed something recently in
org.postgresql.xa.PGXAConnection.commitPrepared(Xidxid) [1] when working with some transaction recovery scenarios after
adatabase failure.  commitPrepared calls: 

  throw new XAException(ex.toString());

The Java XA interface assumes that a thrown javax.transaction.xa.XAException [2] will contain one of the standard XA
errorcodes [3] to identify exactly the nature of the error. Unfortunately Java allows an XAException to be constructed
usinga default constructor and a String constructor.  Using these constructors results in an errorCode value of 0 for
theXAException which is not valid.  

In the latest PostgreSQL JDBC3 driver (8.4-701) this is what org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)
doesso the exception cannot be handled cleanly. JBossTS does not know what this error signifies so it has to assume the
worst(i.e. that the resource is in an indeterminate state and it should not attempt to recover the transaction). Any
resetof the database state will require manual intervention. 

Can this be changed to throw an XAException with the appropriate XAER error code?


Justin

[1]
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/xa/PGXAConnection.java?rev=1.13&content-type=text/x-cvsweb-markup&only_with_tag=REL8_4_701
[2] http://java.sun.com/javaee/5/docs/api/javax/transaction/xa/XAException.html
[3] http://java.sun.com/javaee/5/docs/api/constant-values.html#javax.transaction.xa.XAException.XA_HEURCOM

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Heikki Linnakangas
Date:
Justin Bertram wrote:
> I work with JBoss Transactions (JBossTS) and I noticed something recently in
org.postgresql.xa.PGXAConnection.commitPrepared(Xidxid) [1] when working with some transaction recovery scenarios after
adatabase failure.  commitPrepared calls: 
>
>   throw new XAException(ex.toString());
>
> The Java XA interface assumes that a thrown javax.transaction.xa.XAException [2] will contain one of the standard XA
errorcodes [3] to identify exactly the nature of the error. Unfortunately Java allows an XAException to be constructed
usinga default constructor and a String constructor.  Using these constructors results in an errorCode value of 0 for
theXAException which is not valid.  
>
> In the latest PostgreSQL JDBC3 driver (8.4-701) this is what org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)
doesso the exception cannot be handled cleanly. JBossTS does not know what this error signifies so it has to assume the
worst(i.e. that the resource is in an indeterminate state and it should not attempt to recover the transaction). Any
resetof the database state will require manual intervention. 
>
> Can this be changed to throw an XAException with the appropriate XAER error code?

Yeah, seems like an oversight. Patch attached. We use XAER_RMERR error
code in similar places in all the other functions. Does that work for JBoss?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: org/postgresql/xa/PGXAConnection.java
===================================================================
RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/xa/PGXAConnection.java,v
retrieving revision 1.13
diff -u -r1.13 PGXAConnection.java
--- org/postgresql/xa/PGXAConnection.java    14 Nov 2007 22:03:36 -0000    1.13
+++ org/postgresql/xa/PGXAConnection.java    23 Dec 2009 08:34:19 -0000
@@ -441,7 +441,7 @@
         }
         catch (SQLException ex)
         {
-            throw new XAException(ex.toString());
+            throw new PGXAException(GT.tr("Error committing prepared transaction"), ex, XAException.XAER_RMERR);
         }
     }



On Wed, 23 Dec 2009, Heikki Linnakangas wrote:

>> Can this be changed to throw an XAException with the appropriate XAER
>> error code?
>
> Yeah, seems like an oversight. Patch attached.
>

Applied to CVS back to 8.1.

Kris Jurka

I'm bringing this back up because, while the XAER_RMERR works in most cases it fails in at least one.

Consider the scenario where the database is shutdown administratively during
org.postgresql.xa.PGXAConnection.commitPrepared(Xidxid).  The driver will throw an XAException with an errorCode of
XAER_RMERRback to the transaction manager.  However, the pg_prepared_xacts table will still contain a row for the
transaction.  

The method org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid) is invoked by the transaction manager as part of
itscall to javax.transaction.xa.XAResource.commit(..) [1].  This is the JTA mapping of the xa_commit() function from
theXA specification [2].  According to this document, a return of XAER_RMERR means: 

  An error occurred in committing the work performed on behalf of the transaction
  branch and the branch’s work has been rolled back. Note that returning this error
  signals a catastrophic event to a transaction manager since other resource
  managers may successfully commit their work on behalf of this branch. This error
  should be returned only when a resource manager concludes that it can never
  commit the branch and that it cannot hold the branch’s resources in a prepared
  state. Otherwise, [XA_RETRY] should be returned.

However, since the pg_prepared_xacts table still contains a row for the transaction the XAER_RMERR is not accurate.  A
"catastrophic"failure has not occurred.  It would be possible for the transaction manager to recover this transaction
oncethe database is available again if XA_RETRY was returned instead. 

I think it would be better if commitPrepared could differentiate between errors and return either XAER_RMERR or
XA_RETRYas appropriate.  Otherwise just about any failure during commitPrepared will result in unrecoverable
transactionsand require manual intervention to clean up the pg_prepared_xacts table. 



[1] http://download.oracle.com/javaee/5/api/javax/transaction/xa/XAResource.html#commit(javax.transaction.xa.Xid,
boolean)
[2] http://pubs.opengroup.org/onlinepubs/009680699/toc.pdf

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Heikki Linnakangas
Date:
On 29.06.2011 01:52, Justin Bertram wrote:
> I'm bringing this back up because, while the XAER_RMERR works in most cases it fails in at least one.
>
> Consider the scenario where the database is shutdown administratively during
org.postgresql.xa.PGXAConnection.commitPrepared(Xidxid).  The driver will throw an XAException with an errorCode of
XAER_RMERRback to the transaction manager.  However, the pg_prepared_xacts table will still contain a row for the
transaction.
>
> The method org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid) is invoked by the transaction manager as part of
itscall to javax.transaction.xa.XAResource.commit(..) [1].  This is the JTA mapping of the xa_commit() function from
theXA specification [2].  According to this document, a return of XAER_RMERR means: 
>
>    An error occurred in committing the work performed on behalf of the transaction
>    branch and the branch’s work has been rolled back. Note that returning this error
>    signals a catastrophic event to a transaction manager since other resource
>    managers may successfully commit their work on behalf of this branch. This error
>    should be returned only when a resource manager concludes that it can never
>    commit the branch and that it cannot hold the branch’s resources in a prepared
>    state. Otherwise, [XA_RETRY] should be returned.
>
> However, since the pg_prepared_xacts table still contains a row for the transaction the XAER_RMERR is not accurate.
A"catastrophic" failure has not occurred.  It would be possible for the transaction manager to recover this transaction
oncethe database is available again if XA_RETRY was returned instead. 
>
> I think it would be better if commitPrepared could differentiate between errors and return either XAER_RMERR or
XA_RETRYas appropriate.  Otherwise just about any failure during commitPrepared will result in unrecoverable
transactionsand require manual intervention to clean up the pg_prepared_xacts table. 

Yes, good catch, we should be more careful to use the right error code.
For starters we could detect errors caused by failed connection, which
would include the case of server shutdown, and throw XA_RETRY. But I
guess we should be conservative here, and return XA_RETRY for anything
else than cases where we know for sure that the prepared transaction is
not there anymore. If COMMIT PREPARED is called for a non-existing
global transaction id, you get the error "prepared transaction with
identifier \"%s\" does not exist", with sqlstate 42704
(ERRCODE_UNDEFINED_OBJECT). We use ERRCODE_UNDEFINED_OBJECT for a lot of
things in the backend, but when you're issuing COMMIT PREPARED, I think
it's reasonable to assume that if you get that error code, the prepared
transaction is missing.

Attached patch implements that. I don't have a transaction manager
environment for this at hand right now, so this is completely untested.
Do you have a setup ready where you could test this?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Heikki Linnakangas
Date:
On 29.06.2011 16:04, Jesper Pedersen wrote:
> Hey Heikki,
>
> On Wednesday, June 29, 2011 05:26:51 AM Heikki Linnakangas wrote:
>> Attached patch implements that. I don't have a transaction manager
>> environment for this at hand right now, so this is completely untested.
>> Do you have a setup ready where you could test this?
>
> You forgot to update the actual exception with 'xaerrcode'.

Oops. Here's a patch with that fixed.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment
I had to adjust the patch so that the new variable xaerrcode was passed into the PGXAException constructor (on line 522
ofthe patched code) rather than XAException.XAER_RMERR.  Once that was done recovery worked just as expected.  Thanks
foryour work on this! 


Justin

----- Original Message -----
From: "Heikki Linnakangas" <heikki.linnakangas@enterprisedb.com>
To: "Justin Bertram" <jbertram@redhat.com>
Cc: pgsql-jdbc@postgresql.org
Sent: Wednesday, June 29, 2011 4:26:51 AM
Subject: Re: [JDBC] Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

On 29.06.2011 01:52, Justin Bertram wrote:
> I'm bringing this back up because, while the XAER_RMERR works in most cases it fails in at least one.
>
> Consider the scenario where the database is shutdown administratively during
org.postgresql.xa.PGXAConnection.commitPrepared(Xidxid).  The driver will throw an XAException with an errorCode of
XAER_RMERRback to the transaction manager.  However, the pg_prepared_xacts table will still contain a row for the
transaction.
>
> The method org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid) is invoked by the transaction manager as part of
itscall to javax.transaction.xa.XAResource.commit(..) [1].  This is the JTA mapping of the xa_commit() function from
theXA specification [2].  According to this document, a return of XAER_RMERR means: 
>
>    An error occurred in committing the work performed on behalf of the transaction
>    branch and the branch’s work has been rolled back. Note that returning this error
>    signals a catastrophic event to a transaction manager since other resource
>    managers may successfully commit their work on behalf of this branch. This error
>    should be returned only when a resource manager concludes that it can never
>    commit the branch and that it cannot hold the branch’s resources in a prepared
>    state. Otherwise, [XA_RETRY] should be returned.
>
> However, since the pg_prepared_xacts table still contains a row for the transaction the XAER_RMERR is not accurate.
A"catastrophic" failure has not occurred.  It would be possible for the transaction manager to recover this transaction
oncethe database is available again if XA_RETRY was returned instead. 
>
> I think it would be better if commitPrepared could differentiate between errors and return either XAER_RMERR or
XA_RETRYas appropriate.  Otherwise just about any failure during commitPrepared will result in unrecoverable
transactionsand require manual intervention to clean up the pg_prepared_xacts table. 

Yes, good catch, we should be more careful to use the right error code.
For starters we could detect errors caused by failed connection, which
would include the case of server shutdown, and throw XA_RETRY. But I
guess we should be conservative here, and return XA_RETRY for anything
else than cases where we know for sure that the prepared transaction is
not there anymore. If COMMIT PREPARED is called for a non-existing
global transaction id, you get the error "prepared transaction with
identifier \"%s\" does not exist", with sqlstate 42704
(ERRCODE_UNDEFINED_OBJECT). We use ERRCODE_UNDEFINED_OBJECT for a lot of
things in the backend, but when you're issuing COMMIT PREPARED, I think
it's reasonable to assume that if you get that error code, the prepared
transaction is missing.

Attached patch implements that. I don't have a transaction manager
environment for this at hand right now, so this is completely untested.
Do you have a setup ready where you could test this?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

On Wed, Jun 29, 2011 at 8:51 PM, Justin Bertram <jbertram@redhat.com> wrote:

> I had to adjust the patch so that the new variable xaerrcode was passed into the PGXAException constructor (on line
522of the patched code) rather than XAException.XAER_RMERR.  Once that was done recovery worked just as expected.
 Thanksfor your work on this! 

Interesting. We've been looking at exactly that error as well and were
just about to submit a similar, but different patch.

My understanding is that if the TM requests rollback of a transaction
and then the database crashes before it can reply, the TM may request
rollback a second time. If the first rollback did actually remove the
transaction this then replies that the transaction is unknown. So
XAER_RMERR is exactly the wrong error in some cases of 42704.

Thoughts?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Heikki Linnakangas
Date:
On 29.06.2011 23:07, Simon Riggs wrote:
> My understanding is that if the TM requests rollback of a transaction
> and then the database crashes before it can reply, the TM may request
> rollback a second time. If the first rollback did actually remove the
> transaction this then replies that the transaction is unknown. So
> XAER_RMERR is exactly the wrong error in some cases of 42704.

Well, if the transaction has been rolled back, it's clearly not in
prepared state anymore, so XAER_RMERR seems appropriate and the TM
shouldn't get upset about that. However, the situation is more ambiguous
if the TM issues COMMIT PREPARED and the connection is broken before
receiving a reply. It will retry, and if the the COMMIT succeeded the
first time, the TM will get XAER_RMERR on the second call. That's more
serious.

I believe some (all?) other databases have yet another transaction state
to handle that. After committing a transaction, the database still
remembers the transaction ID and the fact that it was successfully
committed. If the TM doesn't receive a reply to commit, it can reconnect
and check the status to see if the commit succeeded. The transaction is
forgotten about only after the TM sends another 'forget' command. Maybe
we need to implement that in PostgreSQL, too.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
"Johann 'Myrkraverk' Oskarsson"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

> Well, if the transaction has been rolled back, it's clearly not in
> prepared state anymore, so XAER_RMERR seems appropriate and the TM
> shouldn't get upset about that. However, the situation is more
> ambiguous if the TM issues COMMIT PREPARED and the connection is
> broken before receiving a reply. It will retry, and if the the
> COMMIT succeeded the first time, the TM will get XAER_RMERR on the
> second call. That's more serious.

There is no reason to return an error on rollback if the transaction
is unknown.  If the DBMS server does not remember the transaction the
transaction manager can assume it has already been rolled back.  That
is why it is appropriate to silently ignore 42704 or rollback.  The
following patch does this.

This patch fixes a failure in a testsuite from Atomikos.

>
> I believe some (all?) other databases have yet another transaction
> state to handle that. After committing a transaction, the database
> still remembers the transaction ID and the fact that it was
> successfully committed. If the TM doesn't receive a reply to commit,
> it can reconnect and check the status to see if the commit
> succeeded. The transaction is forgotten about only after the TM
> sends another 'forget' command. Maybe we need to implement that in
> PostgreSQL, too.

Yes, this is needed for heuristics.  That is, if a transaction manager
goes down and a system administrator commits the transaction before it
comes up again the server should not report `Undefined Object' but an
error on the lines of `Transaction Already Committed.'  This is a
feature for the server and there is nothing[*] in the JDBC driver we
can do to implement this.

[*] Nothing is a strong word, it is always possible to implement a way
to store the Xids in a table.  The driver should not do that without
some sort of confirmation from the application though.


--
   Johann Oskarsson                http://www.2ndquadrant.com/    |[]
   PostgreSQL Development, 24x7 Support, Training and Services  --+--
                                                                  |
   Blog: http://my.opera.com/myrkraverk/blog/


Attachment

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Radosław Smogura
Date:
"Johann 'Myrkraverk' Oskarsson" <johann@2ndquadrant.com> Tuesday 05 of July
2011 00:02:38
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> > Well, if the transaction has been rolled back, it's clearly not in
> > prepared state anymore, so XAER_RMERR seems appropriate and the TM
> > shouldn't get upset about that. However, the situation is more
> > ambiguous if the TM issues COMMIT PREPARED and the connection is
> > broken before receiving a reply. It will retry, and if the the
> > COMMIT succeeded the first time, the TM will get XAER_RMERR on the
> > second call. That's more serious.
>
> There is no reason to return an error on rollback if the transaction
> is unknown.  If the DBMS server does not remember the transaction the
> transaction manager can assume it has already been rolled back.  That
> is why it is appropriate to silently ignore 42704 or rollback.  The
> following patch does this.
>
> This patch fixes a failure in a testsuite from Atomikos.
I may be wrong, but may VACUUM remove transactions statuses? What will happen
in this situation if transaction will be prepared, commited or rolledback?

Regards,
Radek

Re: Possible oversight in org.postgresql.xa.PGXAConnection.commitPrepared(Xid xid)

From
Heikki Linnakangas
Date:
On 05.07.2011 09:12, Radosław Smogura wrote:
> I may be wrong, but may VACUUM remove transactions statuses? What will happen
> in this situation if transaction will be prepared, commited or rolledback?

No. VACUUM does remove old clog, if that's what you're thinking, but it
won't affect this.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com