Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set - Mailing list pgsql-performance

From Dave Crooke
Subject Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set
Date
Msg-id x2zca24673e1004201229n653f4e69vd274af4a4aa22d09@mail.gmail.com
Whole thread Raw
In response to Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set
Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set
List pgsql-performance
I don't want to get into a big debate about standards, but I will clarify a couple of things inline below.

My key point is that the PG JDBC driver resets people's expecations who have used JDBC with other databases, and that is going to reflect negatively on Postgres if Postgres is in the minority, standards nothwithstanding, and I feel badly about that, because PG rocks!

Cheers
Dave

On Tue, Apr 20, 2010 at 11:32 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Dave Crooke <dcrooke@gmail.com> wrote:

> AFAICT from the Java end, ResultSet.close() is supposed to be
> final.

For that ResultSet.  That doesn't mean a ResultSet defines a cursor.
Such methods as setCursorName, setFetchSize, and setFetchDirection
are associated with a Statement.  Think of the ResultSet as the
result of a cursor *scan* generated by opening the cursor defined by
the Statement.

http://java.sun.com/javase/6/docs/api/java/sql/ResultSet.html#close%28%29

Notice that the ResultSet is automatically closed if the Statement
that generated it is re-executed.  That is very much consistent with
Statement as the equivalent of a cursor, and not very consistent
with a ResultSet as the equivalent of a cursor.

True, but mechanically there is no other choice - the ResultSet is created by Statement.executeQuery() and by then it's already in motion .... in the case of Postgres with default settings, the JVM blows out before that call returns.

I am not explicitly creating any cursors, all I'm doing is running a query with a very large ResultSet.
 

Again, you're talking about the *results* from *opening* the cursor.

> At a pragmatic level, the PGSQL JDBC driver has a lot of odd
> behaviours which, while they may or may not be in strict
> compliance with the letter of the standard, are very different
> from any other mainstream database that I have accessed from Java
> .... what I'd consider as normative behaviour, using regular JDBC
> calls without the need to jump through all these weird hoops, is
> exhibited by all of the following: Oracle, SQL Server, DB2, MySQL,
> Apache Derby and JET (MS-Access file-based back end, the .mdb
> format)

Are you talking about treating the Statement object as representing
a cursor and the ResultSet representing the results from opening
the cursor, or are you thinking of something else here?

Specific examples:

a. the fact that Statement.executeQuery("select * from huge_table") works out of the box with every one of those databases, but results in java.langOutOfMemory with PG without special setup. Again, this is to the letter of the standard, it's just not very user friendly.

b. The fact that with enterprise grade commercital databases, you can mix reads and writes on the same Connection, whereas with PG Connection.commit() kills open cursors.

The fact that I've been using JDBC for 12 years with half a dozen database products, in blissful ignorance of these fine distinctions in the standard until I had to deal with them with PG, is kinda what my point is :-)

I understand the reasons for some of these limitations, but by no means all of them.
 
Huh.  I dropped PostgreSQL into an environment with hundreds of
databases, and the applications pretty much "just worked" for us.
Of course, we were careful to write to the SQL standard and the JDBC
API, not to some other product's implementation of them.

True, but not everyone can hire every developer to be a JDBC / SQL language lawyer. All of our SQL is either ANSI or created by the Hibernate PGSQL adapter, with the exception of a daily "VACUUM ANALYSE" which I added ;-)

I do believe that when there are two ways to implement a standard, the "it just works" way is far preferable to the "well, I know you probably think this is a bug, because 90% of the client code out there chokes on it, but actually we are standards compliant, it's everyone else who is doing it wrong" way.

I used to work at a storage startup that did exactly the latter, using an obscure HTTP/1.1 standard feature that absolutely none of the current browsers or HTTP libraries supports, and so it was a constant source of frustration for customers and tech support alike. I no longer work there ;-)

It's kinda like making stuff that has to work with Windows - you know Microsoft doesn't follow it's own standards, but you gotta make our code work with theirs, so you play ball with their rules.


(1)  Needing to setFetchSize to avoid materializing the entire
result set in RAM on the client.

I don't understand the rationale for why PG, unlike every other database, doesn't make this a sensible default, e.g, 10,000 rows ... maybe because the locks stay in place until you call Connection.close() or Connection.commit() ? ;-)
 

(2)  Fixing a race condition in our software which was benign in
other products, but clearly my own bug.

Been there and done that with code developed on single-threaded DB's (JET / Derby) ... not what I'm griping about here though, the base code with no extra JDBC setup calls works perfectly against Oracle.
 
Just out of curiosity, did you discuss that on the PostgreSQL lists?
Can you reference the thread(s)?

No, I was in a hurry, and the "just works" model was available with both MySQL and Berkeley DB, so I didn't see the point in engaging. I felt the in house paranoia about the MySQL licensing (our CFO) was not justified, and it was the devil I knew, I was taking a look at PG which was then foreign to me as a "genius of the and" alternative.
 

Sure, but I would consider it a step away from that to follow
MySQL's interpretation of cursors rather than the standard's.
YMMV, of course.

I wouldn't hold MySQL up to be a particularly good implmentation of anything, other than speed (MyISAM) and usability (the CLI) .... I find Oracle's JDBC implmentation to be both user friendly and (largely) standards compliant.

YMMV too :-)

I hope this can be taken in the amicable spirit of gentlemanly debate in which it is offered, and in the context that we all want to see PG grow and continue to succeed.

Cheers
Dave

pgsql-performance by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Very high effective_cache_size == worse performance?
Next
From: Scott Marlowe
Date:
Subject: Re: performance change from 8.3.1 to later releases