Thread: Incorrect response code after XA recovery
Hi, I would like to ask about a problematic response from PostgreSQL after transaction recovery run by Narayana (JBossTS). I work on tests for Narayana and have hit an issue with PostgreSQL: the database returns an incorrect code, XAException.XA_HEURHAZ, when the TM performs recovery after a crash of the JBoss EAP app server. The exception is the following: Caused by: org.postgresql.util.PSQLException: ERROR: prepared transaction with identifier "131072_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAAKDE=_AAAAAAAAAAAAAP//fwAAAd7TXOBR8jj5AAAALQAAAAAAAAAA" does not exist This happens on PostgreSQL 9.2, but older versions seem to be affected as well. The problem occurs when the TM runs JTS transactions. The idea of the test: the test enlists two resources in a transaction. Prepare is called on the PostgreSQL resource. The app server crashes before prepare is called on the second transaction participant. After the app server restarts, the TM tries to recover the transaction. Because the failure occurs during the prepare phase, a rollback is expected. The OTS specification requires both bottom-up and top-down recovery to be triggered by the recovering resource, so two rollback calls are made against the DB. The DB receives the first rollback call and performs the rollback. On the second call it returns the error code: because the DB has already rolled back the transaction and forgotten about it, it reports that no such transaction exists. But this seems to be against the OTS specification. There are more details in the following bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=988724 Do you have any experience with such behaviour? Can I assume this is a problem in PostgreSQL? Or is there already a bug for this issue in the Postgres bug tracking system? Thank you Ondra
Ondrej Chaloupka <ochaloup@redhat.com> writes: > The OTS specification requires both bottom up and top down recovery to be triggered by the recovering resource. This causes that two rollback calls are done against the DB. DB receives rollback call and does the rollback. Then for the second time it returns the exceptional code. As the DB already rollbacked the transaction and forgot about it the DB returns error that no such transaction exists. But this seems to be against OTS specification. It's not likely that we would consider changing the behavior of ROLLBACK PREPARED. The alternatives we would have are (1) silently accept a ROLLBACK against a non-existent transaction ID, or (2) remember every rolled-back ID forever. Neither seems sane in the least. It seems to me that this is something client-side code, probably the XA manager, would need to deal with. The XA manager already has to track uncommitted 2-phase transactions, and would furthermore have the best idea of when it would be safe to forget about a rolled-back ID. Right offhand it appears to me that that Red Hat bug is filed against the correct component, and you need to push them harder to fix their bug/shortcoming rather than claim it's our problem. regards, tom lane
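To make the suggestion above concrete, here is a minimal sketch of what "the XA manager remembers rolled-back IDs" could look like on the client side. It is an illustration only: `RollbackTracker`, `rollbackOnce` and `forget` are hypothetical names, not Narayana or pgjdbc code, and the only real API used is the standard `javax.transaction.xa` package.

```java
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical transaction-manager-side helper; not Narayana or pgjdbc code. */
final class RollbackTracker {
    // Branches this manager has rolled back and not yet forgotten (in memory only).
    private final Set<String> rolledBack = new HashSet<>();

    /** Sends rollback at most once per branch; returns false when the call was skipped. */
    synchronized boolean rollbackOnce(XAResource resource, Xid xid) throws XAException {
        String key = keyOf(xid);
        if (rolledBack.contains(key)) {
            return false; // an earlier recovery pass already rolled this branch back
        }
        resource.rollback(xid);   // may still throw; the branch is then not recorded
        rolledBack.add(key);
        return true;
    }

    /** Called once the manager decides no further rollback attempt can arrive. */
    synchronized void forget(Xid xid) {
        rolledBack.remove(keyOf(xid));
    }

    // Builds a string key from the three Xid parts so byte arrays compare by value.
    private static String keyOf(Xid xid) {
        Base64.Encoder b64 = Base64.getEncoder();
        return xid.getFormatId() + "_"
                + b64.encodeToString(xid.getGlobalTransactionId()) + "_"
                + b64.encodeToString(xid.getBranchQualifier());
    }
}
```

With this in place, the bottom-up and top-down recovery passes could both call `rollbackOnce` and only the first would reach the database. The set lives in memory, so it only helps when both passes run in the same recovering process.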
Tom Jenkinson <tom.jenkinson@redhat.com> writes: > A little bit of information in the linked bugzilla report is that the > exception being returned has an XA error code of XAER_RMERR "An error > occurred in rolling back the transaction branch. The resource manager is > free to forget about the branch when returning this error so long as all > accessing threads of control have been notified of the branch’s state." > That does not sound right to me, wouldn't XAER_NOTA "The specified XID > is not known by the resource manager" be more accurate? No idea, but in any case that's outside Postgres' purview. It's barely possible that the Postgres JDBC driver has something to do with that, but it sounds more like the XA manager's turf. regards, tom lane
Tom Jenkinson <tom.jenkinson@redhat.com> writes: > On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >> No idea, but in any case that's outside Postgres' purview. It's barely >> possible that the Postgres JDBC driver has something to do with that, >> but it sounds more like the XA manager's turf. > I am not sure what you mean here as I don't know the structure of how > the PostGres project is packaged, all I know is that the PostGres JDBC > driver component appears to be returning an XAException with the > message "Error rolling back prepared transaction" and an errorCode of > XAException.XAER_RMERR rather than XAER_NOTA. > Is there a different component within your bug tracking system we > should be using to raise this against the JDBC driver instead? The folk who would fix anything in the JDBC driver tend to read pgsql-jdbc sooner than pgsql-bugs, so cc'ing there for comment. regards, tom lane
Hi Tom, On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: > Tom Jenkinson <tom.jenkinson@redhat.com> writes: >> A little bit of information in the linked bugzilla report is that the >> exception being returned has an XA error code of XAER_RMERR "An error >> occurred in rolling back the transaction branch. The resource manager is >> free to forget about the branch when returning this error so long as all >> accessing threads of control have been notified of the branch’s state." > >> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >> is not known by the resource manager" be more accurate? > > No idea, but in any case that's outside Postgres' purview. It's barely > possible that the Postgres JDBC driver has something to do with that, > but it sounds more like the XA manager's turf. I am not sure what you mean here as I don't know the structure of how the PostGres project is packaged, all I know is that the PostGres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA. Is there a different component within your bug tracking system we should be using to raise this against the JDBC driver instead? Thanks, Tom
Hi Tom, A little bit of information in the linked bugzilla report is that the exception being returned has an XA error code of XAER_RMERR "An error occurred in rolling back the transaction branch. The resource manager is free to forget about the branch when returning this error so long as all accessing threads of control have been notified of the branch’s state." That does not sound right to me, wouldn't XAER_NOTA "The specified XID is not known by the resource manager" be more accurate? Thanks, Tom On 29/07/13 14:50, Tom Lane wrote: > Ondrej Chaloupka <ochaloup@redhat.com> writes: >> The OTS specification requires both bottom up and top down recovery to be triggered by the recovering resource. This causes that two rollback calls are done against the DB. DB receives rollback call and does the rollback. Then for the second time it returns the exceptional code. As the DB already rollbacked the transaction and forgot about it the DB returns error that no such transaction exists. But this seems to be against OTS specification. > It's not likely that we would consider changing the behavior of ROLLBACK > PREPARED. The alternatives we would have are (1) silently accept a > ROLLBACK against a non-existent transaction ID, or (2) remember every > rolled-back ID forever. Neither seems sane in the least. > > It seems to me that this is something client-side code, probably the XA > manager, would need to deal with. The XA manager already has to track > uncommitted 2-phase transactions, and would furthermore have the best > idea of when it would be safe to forget about a rolled-back ID. > > Right offhand it appears to me that that Red Hat bug is filed against > the correct component, and you need to push them harder to fix their > bug/shortcoming rather than claim it's our problem. > > regards, tom lane
On Jul 29, 2013, at 16:57, Tom Jenkinson <tom.jenkinson@redhat.com> wrote: > Hi Tom, > > On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >> Tom Jenkinson <tom.jenkinson@redhat.com> writes: >>> A little bit of information in the linked bugzilla report is that the >>> exception being returned has an XA error code of XAER_RMERR "An error >>> occurred in rolling back the transaction branch. The resource manager is >>> free to forget about the branch when returning this error so long as all >>> accessing threads of control have been notified of the branch’s state." >> >>> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >>> is not known by the resource manager" be more accurate? >> >> No idea, but in any case that's outside Postgres' purview. It's barely >> possible that the Postgres JDBC driver has something to do with that, >> but it sounds more like the XA manager's turf. > > I am not sure what you mean here as I don't know the structure of how the PostGres project is packaged, all I know is that the PostGres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA. Looking at the error codes, it appears that it isn't even the Postgres JDBC driver returning that error, but the XA manager you're using, which is not a part of Postgres (nor is the JDBC driver, for that matter - that's a separate project). The errors you're quoting are from the XA manager and are about XA manager stuff. For all we know, the actual error appears to be occurring in the XA manager and not in Postgres. It's possible that the XA manager error is a result of an error that Postgres returned, but since the XA manager prints its own error message and not the original one, you'll need to uncover those error messages before we can help you with them. For all we know at this point, the error is with your XA manager, not with Postgres. 
If you want to be sure, grep the source of the JDBC driver for those error codes; I doubt you'll find them in there. Google was kind enough to point me here: http://jdbc.postgresql.org/development/git.html Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Hi Alban, I stripped down the code to a raw XA example using the latest postgres driver available in maven central. It demonstrates that regardless of what the codebase might suggest, it is certainly the case that postgres is returning XAER_RMERR in the scenario where the resource manager no longer knows about the Xid. The code is available here: https://github.com/tomjenkinson/xa-recovery/commit/944d45e86a91eacb9489843acfbf6a80f1b4b820 I hope that this helps, Tom On Mon 29 Jul 2013 18:52:31 BST, Alban Hertroys wrote: > On Jul 29, 2013, at 16:57, Tom Jenkinson <tom.jenkinson@redhat.com> wrote: > >> Hi Tom, >> >> On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >>> Tom Jenkinson <tom.jenkinson@redhat.com> writes: >>>> A little bit of information in the linked bugzilla report is that the >>>> exception being returned has an XA error code of XAER_RMERR "An error >>>> occurred in rolling back the transaction branch. The resource manager is >>>> free to forget about the branch when returning this error so long as all >>>> accessing threads of control have been notified of the branch’s state." >>> >>>> That does not sound right to me, wouldn't XAER_NOTA "The specified XID >>>> is not known by the resource manager" be more accurate? >>> >>> No idea, but in any case that's outside Postgres' purview. It's barely >>> possible that the Postgres JDBC driver has something to do with that, >>> but it sounds more like the XA manager's turf. >> >> I am not sure what you mean here as I don't know the structure of how the PostGres project is packaged, all I know is that the PostGres JDBC driver component appears to be returning an XAException with the message "Error rolling back prepared transaction" and an errorCode of XAException.XAER_RMERR rather than XAER_NOTA. 
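For readers who do not want to open the linked commit, the raw XA scenario described above can be sketched roughly as follows. This is not the code from that repository; `DoubleRollbackProbe`, `ProbeXid` and `secondRollbackCode` are hypothetical names, and the sketch relies only on the standard `javax.transaction.xa` interfaces so that any `XAResource` can be passed in.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical probe: prepare a branch, roll it back twice, report the second result. */
final class DoubleRollbackProbe {

    /** Minimal Xid; the byte values passed in are arbitrary illustration data. */
    static final class ProbeXid implements Xid {
        private final byte[] gtrid;
        private final byte[] bqual;

        ProbeXid(byte[] gtrid, byte[] bqual) {
            this.gtrid = gtrid.clone();
            this.bqual = bqual.clone();
        }

        // 131072 is the leading number of the identifier in the report; assumed to be the format id.
        @Override public int getFormatId() { return 131072; }
        @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        @Override public byte[] getBranchQualifier() { return bqual.clone(); }
    }

    /**
     * Returns the XA error code thrown by the second rollback,
     * or XAResource.XA_OK if the second rollback did not throw.
     */
    static int secondRollbackCode(XAResource resource, Xid xid) throws XAException {
        resource.start(xid, XAResource.TMNOFLAGS);
        // A real test would run some SQL on the associated XA connection here.
        resource.end(xid, XAResource.TMSUCCESS);
        resource.prepare(xid);
        resource.rollback(xid);       // first rollback: the branch still exists
        try {
            resource.rollback(xid);   // second rollback: the branch is already gone
            return XAResource.XA_OK;
        } catch (XAException e) {
            return e.errorCode;       // the thread reports XAER_RMERR here, XAER_NOTA was expected
        }
    }
}
```

Against PostgreSQL, the `XAResource` would presumably come from the driver's XA data source via `getXAConnection().getXAResource()`, with `max_prepared_transactions` set above zero on the server; that wiring is left out because it needs a live database.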
Tom Jenkinson escribió: > Hi Alban, > > I stripped down the code to a raw XA example using the latest > postgres driver available in maven central. It demonstrates that > regardless of what the codebase might suggest, it is certainly the > case that postgres is returning XAER_RMERR in the scenario where the > resource manager no longer knows about the Xid. > > The code is available here: > https://github.com/tomjenkinson/xa-recovery/commit/944d45e86a91eacb9489843acfbf6a80f1b4b820 Those error codes do certainly appear in the PGXAConnection.java source in the pgjdbc git. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi Tom, The driver currently doesn't report back to the calling client (tm) XAException.XAER_NOTA code as Ondrej and Tom Jenkinson have identified. Instead it returns XAException.XAER_RMERR. See line 416 https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/xa/PGXAConnection.java#416 which imo is used for general errors in the resource manager. I've written a test case that can be pulled into the pgjdbc testsuite that will make verifying this error easier. It is based on the example code Tom Jenkinson provided. A pull request has been created which can be found here... https://github.com/pgjdbc/pgjdbc/pull/73 I am currently coding up a change to the driver in anticipation there is agreement in the pgjdbc group to change the rollback method. Another pull request will be created for that. Let's see what discussion and decision is made by the more active members in pgjdbc. Regards, Jeremy On 29/07/13 16:11, Tom Lane wrote: > Tom Jenkinson <tom.jenkinson@redhat.com> writes: >> On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >>> No idea, but in any case that's outside Postgres' purview. It's barely >>> possible that the Postgres JDBC driver has something to do with that, >>> but it sounds more like the XA manager's turf. >> I am not sure what you mean here as I don't know the structure of how >> the PostGres project is packaged, all I know is that the PostGres JDBC >> driver component appears to be returning an XAException with the >> message "Error rolling back prepared transaction" and an errorCode of >> XAException.XAER_RMERR rather than XAER_NOTA. >> Is there a different component within your bug tracking system we >> should be using to raise this against the JDBC driver instead? > The folk who would fix anything in the JDBC driver tend to read > pgsql-jdbc sooner than pgsql-bugs, so cc'ing there for comment. > > regards, tom lane > >
Hello Tom, A quick update on progress. A second PR was created to provide a patch. https://github.com/pgjdbc/pgjdbc/pull/76 Regards, Jeremy On 31/07/13 12:36, Jeremy Whiting wrote: > Hi Tom, > The driver currently doesn't report back to the calling client (tm) > XAException.XAER_NOTA code as Ondrej and Tom Jenkinson have identified. > Instead it returns XAException.XAER_RMERR. See line 416 > > https://github.com/pgjdbc/pgjdbc/blob/master/org/postgresql/xa/PGXAConnection.java#416 > > which imo is used for general errors in the resource manager. > > I've written a test case that can be pulled into the pgjdbc testsuite > that will make verifying this error easier. It is based on the example > code Tom Jenkinson provided. A pull request has been created which can > be found here... > > https://github.com/pgjdbc/pgjdbc/pull/73 > > I am currently coding up a change to the driver in anticipation there > is agreement in the pgjdbc group to change the rollback method. Another > pull request will be created for that. Let's see what discussion and > decision is made by the more active members in pgjdbc. > > Regards, > Jeremy > > On 29/07/13 16:11, Tom Lane wrote: >> Tom Jenkinson <tom.jenkinson@redhat.com> writes: >>> On Mon 29 Jul 2013 15:46:12 BST, Tom Lane wrote: >>>> No idea, but in any case that's outside Postgres' purview. It's barely >>>> possible that the Postgres JDBC driver has something to do with that, >>>> but it sounds more like the XA manager's turf. >>> I am not sure what you mean here as I don't know the structure of how >>> the PostGres project is packaged, all I know is that the PostGres JDBC >>> driver component appears to be returning an XAException with the >>> message "Error rolling back prepared transaction" and an errorCode of >>> XAException.XAER_RMERR rather than XAER_NOTA. >>> Is there a different component within your bug tracking system we >>> should be using to raise this against the JDBC driver instead? 
-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat
On 05.08.2013 17:58, Jeremy Whiting wrote: > Hello Tom, > A quick update on progress. A second PR was created to provide a patch. > > https://github.com/pgjdbc/pgjdbc/pull/76 Thanks. Looks good to me. I wish the backend would throw a more specific error code for this, 42704 is used for many other errors as well. But at COMMIT/ROLLBACK PREPARED, it's probably safe to assume that it means that the transaction does not exist. - Heikki
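The mapping discussed in this part of the thread can be sketched as below. This is not the patch from the pull request; `RollbackErrorMapping` and its methods are hypothetical names, and the sketch simply assumes, as suggested above, that SQLSTATE 42704 during COMMIT/ROLLBACK PREPARED means the prepared transaction does not exist.

```java
import java.sql.SQLException;
import javax.transaction.xa.XAException;

/** Hypothetical mapping from a failed ROLLBACK PREPARED to an XA error code. */
final class RollbackErrorMapping {
    // SQLSTATE 42704 (undefined_object), the code mentioned in the thread.
    static final String UNDEFINED_OBJECT = "42704";

    /** XAER_NOTA when the transaction is unknown, XAER_RMERR for any other failure. */
    static int xaCodeForRollbackFailure(SQLException e) {
        return UNDEFINED_OBJECT.equals(e.getSQLState())
                ? XAException.XAER_NOTA
                : XAException.XAER_RMERR;
    }

    /** Wraps the SQL error in an XAException that keeps the original as its cause. */
    static XAException toXAException(SQLException e) {
        XAException xa = new XAException(xaCodeForRollbackFailure(e));
        xa.initCause(e);
        return xa;
    }
}
```

Because 42704 is shared with other "object does not exist" errors, this mapping is only reasonable at the point where the failing statement is known to be COMMIT PREPARED or ROLLBACK PREPARED.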