Re: [pgjdbc] XADataSource support for resource sharing & interleaving. (#47) - Mailing list pgsql-jdbc

From Dave Cramer
Subject Re: [pgjdbc] XADataSource support for resource sharing & interleaving. (#47)
Date
Msg-id CADK3HH+mSmgESAOXMcaOx2S60SBmegw9d7B_8qAw=Fkiin4ZaA@mail.gmail.com
Whole thread Raw
Responses Re: Re: [pgjdbc] XADataSource support for resource sharing & interleaving. (#47)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Re: Re: [pgjdbc] XADataSource support for resource sharing & interleaving. (#47)  (Kevin Grittner <kgrittn@ymail.com>)
Re: Re: [pgjdbc] XADataSource support for resource sharing & interleaving. (#47)  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-jdbc
I'm looking for opinions on this patch.

Heikki ? Kevin Grittner ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Fri, Mar 15, 2013 at 12:45 PM, Bryan Varner <notifications@github.com> wrote:

This reimplementation of the PGXADataSource (and supporting classes) provides more complete fulfillment of the JTA 1.0.1 specification.

It properly implements resource sharing (3.4.6) allowing all connections to the same underlying XAResource to control the global transactions created by other instances of that XAResource.

It addresses proper interleaving and isolation of local & global transaction scopes. (3.4.7)

The support for interleaving and sharing is implemented by (potentially, only when needed) returning 'logical' connections from the XADataSource, and opening new physical connections to the backend server (when necessary) to handle preforming work on behalf of a global TX or local TX.

The behavior of opening new physical backends (implementing interleaving) can be disabled by setting the XAAcquireTimeout = 0, in which case the behavior of the old XADataSource implementation is retained, with a minimal performance impact.

It tries hard to only open as many physical connections as necessary to service the logical connections, and prunes extraneous connections created as a result of supporting interleaving at each logical connection handle close().

In our production environment, where we have allocated our pool sizes to the actual observed in-use highwater mark + 30%, we are seeing zero connection churn, as there are enough free connections at any given time to properly service the checked-out, in-use logical connections and the interleaving & resource sharing being invoked by the container.

This implementation passes:

  • all the existing XADataSource test cases
  • the many we've added to test corner cases of XA handling in concurrent and pooled workloads.
  • the JBoss narayana Transaction Manager jdbc unit tests. (https://github.com/jbosstm/narayana)

In addition to the XADataSource changes

  • the build.xml has been update to respect (fully) the srcdir property
  • AbstractJdbc23PooledConnection exposes getBackingConnection() to sub-classes, allowing subclasses access to the underlying Connection object without needing to cache the instance in their constructors.

We have been successfully using this snapshot of code in our production environment (GlassFish 3.1.2.2) with PostgreSQL 9.1, and are pleased to report that it is working very well at this time.


You can merge this Pull Request by running

  git pull https://github.com/polarislabs/pgjdbc POLARIS_XA

Or view, comment on, or merge it at:

  https://github.com/pgjdbc/pgjdbc/pull/47

Commit Summary

  • This is an in-progress refactor attempt with the express goals of providing a complete, and robust JTA compliant XA Implementation.
  • Stubbing out / making progress on this.
  • Ok, whoops.
  • I'm not totally sold on logicalConnectionId at this point, but it might work.
  • Started stubbing out / reconciling original precondition / deficiences, this should help cement the types of things we'll need the XADataSource to handle.
  • "A broken build is a sign of progress!"
  • implementation should be somewhat possible to execute.
  • That shouldn't be there.
  • helps to remember to unignore things.
  • updated the gitignore
  • I realize this this may mean some of our work does not get accepted upstream, but I can't abide not having the source in a source directory anymore, where netbeans (i know, i know) can actually resolve the proper package names and what-not. It's amazing I lasted this long without doing this. Seriously, I'm tempted to make this a maven build.
  • Some cleanup, and getting the build.xml to behave a little better.
  • Forgot I moved the getPhysicalConnection(), wow I should back off the coffee.
  • This is actually starting to pass test cases.
  • FUNCTIONAL PARITY.
  • Structuring the build to renable the test suite.
  • Minor changes, not really sure what...
  • Adding the nbignore...
  • added idea files to gitignore
  • Updating a few things.
  • we just passed the old interleaving tests.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Added docs, reformatted some stuff, cleaned up some code
  • Merged from Bryan's chanages to pass all the tests.
  • Implements Referenceable, Serializable.
  • Tweaking these to be more aggressive by default.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Code paranoia
  • Added some documentation
  • Abstracted connection creation, yes I actually have a use-case for this.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Moved logical connection if checks to the top to throw a IllegalArgumentException since it is required for the method.
  • Added some debug logging
  • Don't commit the out folder
  • Added idea's out folder to the ignore
  • Moved the autocommit bits to the logical connection part. I think this may help.
  • Updated docs and wrapped debug statement to avoid performance hit when logging not desired
  • Logger might as well be local if it's only going to be used in the one place.
  • Changing up the physical pooling policies and connection cleanup cycle.
  • Disassociate so we have a shot at closing things.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Cleaning up some leftovers.
  • Improved the first interleaving test, to exercise isolation checks in the midst of interleaving.
  • Added a test case for join, which is exposing an actual bug.
  • Added a few tests cases which are exposing some problems centered around connection.close(), and the handling of things on the backends.
  • Added a lot of debug logging to PGXADataSource and PhysicalXAConnection.
  • Added lib folder to gitignore to allow ant builds directly from idea without committing the junit jar.
  • Added a TODO to indicate improvement needed on Logger creation.
  • Added some extra logging and documentation.
  • Fixed typo
  • Fixed local thread test. The thread wasn't being joined before checking the error value. So, the error value was being checked before the query had a chance to complete.
  • Cleaned up imports.
  • Getting the physical connection-based logger would require too much rework at this point, so just use the Driver-base logger instead.
  • Fixed null pointer exception since the fake xid class doesn't null check in its equals method
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • synchronized a section of code that might cause a race condition while trying to get a logical connection that doesn't have a transaction started yet in order to issue a commit prepared.
  • This test can expose a couple of race conditions. Though because of the nature of them, the test passes ... sometimes.
  • Blunt-Edged, Brute-Force approach. Lock all the things!
  • Might as well go all Reentrant
  • Now start to rein it in
  • Completely re-implemented.
  • Clean up / documentation of a failure point.
  • Removed all the rediculous local TX tracking in the logical connection side. Really we need to care about the autocommit of the physical TX, which we should only be mangling when we associate / disassociate to an XID. A logical connection should always defer it's autocommit status to that of the physical connection servicing it.
  • If a physical connection is in local TX mode, try to keep it to one logical connection per physical...
  • Added a simple test of a transaction commit on a pooled connection to simulate app servers connection pooling. This reproduces the "double-close" issue we have been seeing.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Close behavior now waits for the logical connection to close, rather than the handle.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Keep the username / password around, so that we can allocate new physical connections if all of them are closed and a logical connection tries to do something.
  • Commiting changes to the wait timeouts and strategy if you set the timeout to 0.
  • Added threaded pool test that exhausts the pool.
  • Merge branch 'POLARIS_XA' of github.com:polarislabs/pgjdbc into POLARIS_XA
  • Allow for disabling interleaving support. We'll probably want to do this by default, and let people enable it when they need it.
  • Hey, we weren't using that logger...
  • Hey, we weren't using that logger...
  • Better support for people who want to disable interleaving, and have a hard & fast 1 - 1 logical to physical mapping.
  • Added a test case that exercises (intermittently, however) the race condition between prepare & commit / close.
  • These test cases are duplicating the exact type of intermittent failure we're seeing in our deployed environment.
  • More updates, we can close anything IDLE. But not OPEN or marked for ROLLBACK (at least not without allowing the rollback first!).
  • I haven't run this against the JBOSS TM QA tests yet, but at this point, it's passing all the tests we have (including the race-condition tests for close() logic) that we have.
  • Whoops. build.xml changes crept in.
  • Rollback only disassociates the xid, not the entire physical connection. In the case of a Connection pegged to a thread, the next invocation could end up royally messing with the state of the pool.
  • Passes the test where the last logical connection closed closes any non-suspended TX.
  • Verify that we end up on the same LocalTX transaction (backend pg PID) as we did when we started.
  • Moving things back to the original location.
  • Cleaning up build.xml changes
  • More cleanup, finer-grained locking in a few spots.
  • Added 'debugtest' target which sets up a dt_socket server for debugging test cases.
  • Broke resource sharing, where the XAResource can outlive the logical connection (apparently).
  • Fixed a few issues with close before rollback.
  • Eager association of logicals to physicals was causing connection bloat.
  • Greatly reduced connection usage, by improving re-use, and only allocating physical backends when it's necessary to provide a reasonable level of performance.
  • Added a test case to verify that we weren't closing connections we shouldn't be.
  • Even more tests about connection association...
  • Fixed the wait strategy to wait up to xaAcquireTimeout milliseconds (in multiple passes if necessary) before allocating a new physical connection.
  • RecoveredXid is blowing up when GlassFish tries to ping the XAResource with null xids.
  • Merge remote-tracking branch 'upstream/master' into POLARIS_XA
  • Adding a test case that I think will duplicate the prepared statement / cursor issue we're seeing with connection closing.
  • Uh, we don't need 10k of these..
  • Updated test case to use cursors.
  • Cleaned up / simplified some of the close() code.

File Changes

Patch Links:


pgsql-jdbc by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: Improper type conversion from smallint to short
Next
From: "Gabriel E. Sánchez Martínez"
Date:
Subject: Client Certificate Authentication