Re: A new JDBC driver... - Mailing list pgsql-jdbc

From Craig Ringer
Subject Re: A new JDBC driver...
Msg-id 5142C964.8090402@2ndquadrant.com
In response to Re: A new JDBC driver...  (Kevin Wooten <kdubb@me.com>)
Responses Re: A new JDBC driver...  (Kevin Wooten <kdubb@me.com>)
List pgsql-jdbc
On 03/15/2013 01:15 PM, Kevin Wooten wrote:
Craig, thanks a lot for the information. I read your SO question and related info, and then did a bit of research on my own. I have come up with a few things…

Just to clarify, it seems you thought I was asking for help with that question. I wasn't; I just wanted to direct you to the informative answer given there and to make sure you were aware of the concerns around threading in Java EE application servers.

If you're unaware of classloader leaks and the related problems, start here:

http://frankkieviet.blogspot.com.au/2006/10/classloader-leaks-dreaded-permgen-space.html

Starting your own threads is not suggested because of the problems with not having a definitive startup/shutdown (until more recent JavaEE versions) and because any thread you start cannot use any of the services of the container. Everything else I have read just says "IF" you don't shut the threads down properly you could wreak havoc, eat up resources, etc, etc.
Yes... and you need to implement ServiceLoader hooks in order for threads created by the JDBC driver to be properly terminated before the driver is unloaded.

Remember that in Java EE the JDBC driver may be unloaded without the app server being terminated. This mainly occurs in apps that bundle JDBC drivers in their .war instead of using the server connection pool, but is also seen when JDBC drivers are uninstalled or unloaded from a running server as is common in upgrades.

If the driver has its own threads and it doesn't terminate every thread in the pool, the unload will fail and you'll get a classloader leak.


The first issue isn't an issue at all.  The threads handle the I/O only and all data is delivered back to an application level thread for processing. Basically the fact that threads are used is completely transparent to the application code; it treats the calls synchronously.
To the application code, yes; to the app server, not necessarily, because the threads hold references to classes that hold references to the .war classloader in the case of in-app-bundled JDBC drivers. Unlike other Java objects, a running thread is not GC'd when nothing else references it, because a live thread is itself a GC root. So if a leaked connection is GC'd, the associated threads may not be. You cannot rely on the finalize method to clean up the threads reliably or correctly, as there are far too many circumstances under which it may not run.

For that reason, it's important for the driver to keep track of its threads and ensure that it explicitly shuts down its thread pool when it is unloaded by the service loader.
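
As a rough illustration of "keep track of its threads and shut them down explicitly", here is a minimal sketch. The class name ThreadRegistry and its methods are hypothetical and are not taken from PgJDBC or any other driver; how the shutdown call gets wired to the container's unload event is left open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry of the threads a driver has started.
final class ThreadRegistry {
    private final List<Thread> threads = new ArrayList<>();

    // Start a worker and remember it so it can be stopped later.
    synchronized Thread start(Runnable work, String name) {
        Thread t = new Thread(work, name);
        t.setDaemon(true);
        threads.add(t);
        t.start();
        return t;
    }

    // Interrupt every tracked thread, then wait for each one to exit.
    // Returns true only if all of them stopped within the per-thread timeout.
    synchronized boolean shutdown(long perThreadTimeoutMillis) {
        for (Thread t : threads) {
            t.interrupt();
        }
        boolean allStopped = true;
        for (Thread t : threads) {
            try {
                t.join(perThreadTimeoutMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt
                return false;
            }
            if (t.isAlive()) {
                allStopped = false;
            }
        }
        threads.clear();
        return allStopped;
    }
}
```

Note that an interrupt does not wake a thread blocked in a classic blocking socket read, so a real driver would also have to close the underlying sockets before joining.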

Secondly, I have actually paid a bit of attention to the issue of threads shutting down because any abandoned connection causes the threads to remain active and the program to, at the very least, not be able to shutdown.
That's not the case if you're creating daemon threads (which you should be) - but it can cause other problems with object leaks. Even if you hold only weak references in your thread, the thread itself can still cause a classloader leak, since it holds a strong reference to its class object (or, in the case of an unsubclassed Thread, to the runnable and via it to the runnable's class).
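
For reference, creating daemon threads through a ThreadFactory could look roughly like this. The class name DaemonThreadFactory and the name prefix are illustrative assumptions, not an existing driver API.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory that marks every thread it creates as a daemon.
final class DaemonThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger();
    private final String prefix;

    DaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        t.setDaemon(true); // a daemon thread does not keep the JVM from exiting
        return t;
    }
}
```

The daemon flag only addresses JVM exit. While the thread is alive it still strongly references its Runnable, so the classloader concern described above remains.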

I used a reference counting system to share the thread pool between connections.  This guarantees that if the connections are properly closed, the threads will be killed.  I did a bit of experimentation last night with using weak references everywhere and trying to handle the case where a person forgets to close a connection.
Good idea. The driver *must* continue to function when connections aren't closed, as this is unfortunately extremely common. Relying on finalizers won't cut it in my opinion; they're just too unreliable to use for anything except logging warnings to say you forgot to close something.
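
A reference-counted shared pool of the kind described in the quoted paragraph might be sketched as follows. This is only a guess at the general shape, not Kevin's actual implementation; SharedPool, acquire, release and references are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical pool shared by all connections and counted per connection.
final class SharedPool {
    private static ExecutorService pool;
    private static int refs;

    // Called when a connection opens; the first caller creates the pool.
    static synchronized ExecutorService acquire() {
        if (refs++ == 0) {
            pool = Executors.newCachedThreadPool(r -> {
                Thread t = new Thread(r, "shared-io");
                t.setDaemon(true);
                return t;
            });
        }
        return pool;
    }

    // Called when a connection closes; the last release stops the threads.
    static synchronized void release() {
        if (refs == 0) {
            throw new IllegalStateException("release without acquire");
        }
        if (--refs == 0) {
            pool.shutdownNow();
            pool = null;
        }
    }

    static synchronized int references() {
        return refs;
    }
}
```

This only works when every connection is closed. A connection that is never closed never calls release(), which is exactly the case the driver still has to survive.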

The main issue with finalizers isn't performance; it is the number of circumstances under which they just don't run, run multiple times, or otherwise have undesirable characteristics.

PostgreSQL's cancel system is pretty interesting. You have to open a new socket and send a cancel packet.
Yeah, I find this to be particularly awful myself. I vaguely recall seeing some discussion suggesting that in-band cancel was possible, but I'll have to delve through my archives to find out where.
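
For anyone unfamiliar with the cancel packet: in the version 3 frontend/backend protocol, the CancelRequest message is 16 bytes, made up of four big-endian 32-bit integers: the length (16), the cancel request code 80877102, the backend process ID, and the secret key. The last two come from the BackendKeyData message received at connection startup. A minimal sketch of encoding it, where the class and method names are my own:

```java
import java.nio.ByteBuffer;

// Hypothetical encoder for the PostgreSQL CancelRequest message.
final class CancelRequest {
    // Cancel request code: 1234 in the high 16 bits, 5678 in the low 16 bits.
    static final int CANCEL_CODE = (1234 << 16) | 5678; // 80877102

    // backendPid and secretKey are the values from BackendKeyData.
    static byte[] encode(int backendPid, int secretKey) {
        return ByteBuffer.allocate(16) // ByteBuffer defaults to big-endian
                .putInt(16)            // total message length, including itself
                .putInt(CANCEL_CODE)
                .putInt(backendPid)
                .putInt(secretKey)
                .array();
    }
}
```

The bytes are written to a fresh socket connected to the same server, which is then closed. The server sends no reply to the cancel request itself.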
Finally, with regard to your SO question, since there seems to be no answer, you could try, as I touched on earlier, implementing the query timeout by using non-blocking sockets, selectors and the like. I think you'll quickly grow to appreciate why others are using threads; Java has made something that was easy in C very hard. Also, in my journeys last night I discovered the "statement_timeout" connection parameter. If you didn't know about it already, the server will cancel any statement that takes longer than this value. It may be an easy solution to your problem.
Heh, I'm well aware of statement_timeout. In that question I was asking for guidance about sane ways to use threading within the JDBC driver as I was hoping to have time to implement threaded I/O within PgJDBC to finally solve the problem with the driver deadlocking when it's blocked on a read and the server won't reply until it gets another write. As you've alluded to, async I/O wasn't really an option especially since it plays very poorly with SSL sockets. The answer referring to the service loader was very helpful and that's what I was pointing you towards.
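
For completeness, setting statement_timeout for a session over plain JDBC could look like the sketch below. The helper class StatementTimeout is hypothetical; the value is interpreted by the server in milliseconds when no unit is given, and 0 disables the timeout.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; works with any JDBC connection to a PostgreSQL server.
final class StatementTimeout {
    // Build the SET command; a value of 0 turns the timeout off.
    static String timeoutSql(int millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("timeout must be >= 0");
        }
        return "SET statement_timeout = " + millis;
    }

    // Apply the timeout to the current session.
    static void apply(Connection conn, int millis) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(timeoutSql(millis));
        }
    }
}
```

This is enforced on the server side, so it does nothing for a client that is blocked on a read the server will never answer, which is the deadlock described above.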


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
