Re: Retrieving ResultSets with millions of rows - Mailing list pgsql-jdbc
From: Dave Cramer
Subject: Re: Retrieving ResultSets with millions of rows
Date:
Msg-id: 1031965086.1955.177.camel@inspiron.cramers
In response to: Re: Retrieving ResultSets with millions of rows (Doug Fields <dfields-postgres@pexicom.com>)
List: pgsql-jdbc
On Fri, 2002-09-13 at 20:45, Doug Fields wrote:
> > Ya, there is, it's a bit of a hack but the only way around it is to use
> > a cursor. Eventually the driver will do this automatically, but for now
> > you will have to do it manually
>
> That's what I was afraid you were going to say. Thanks for the response.
>
> Question: Do I have to do anything unusual with the JDBC Connection and/or
> Statement to put it in a certain mode? I'm not much on the interactions
> between the JDBC API and the database transactions (begin/commit, etc.). I
> also use the Protomatter connection pool; what do I have to do with the
> connection when I release it back to the pool?

Yes: turn autocommit off, and then to be safe turn it back on before
returning the connection to the pool.

> Also, a slightly tangential question: If I'm only doing reads in the
> transaction (the cursor stuff), is there any performance difference if I
> COMMIT or ROLLBACK?

With autocommit off you don't have to do either; just do an "end;".

> Thanks,
>
> Doug
>
> > so
> > begin;
> > declare cursor1 CURSOR FOR SELECT ....
> > fetch n from cursor1 ...
> > end;
> >
> > Dave
> >
> > On Fri, 2002-09-13 at 17:43, Doug Fields wrote:
> > > Hello,
> > >
> > > I've just come to the startling realization that the 7.2 JDBC code loads
> > > every single row in a ResultSet into a Vector. (core/QueryExecutor.java)
> > >
> > > Unfortunately, I run some queries which return millions of rows, each row
> > > of which could well be 100-1000 bytes and more.
> > >
> > > Hence, I get an OutOfMemoryError.
> > >
> > > For some queries, there's an obvious workaround: issue the query with an
> > > appropriate suffix "ORDER BY <something> LIMIT <something> OFFSET
> > > <something>" several times in succession. This will, of course, work mostly
> > > fine assuming it's a simple single-table query with an appropriate index
> > > (such as a primary key).
> > > However, some of my queries are complex one-to-many joins with no
> > > particular ordering (or no index for quick ordering). These would be much
> > > harder to do that trick with, and/or incur extreme amounts of database
> > > overhead in running the query hundreds of times (for example, if I were to
> > > handle 10,000 rows at a time).
> > >
> > > Is there any way to get a ResultSet which is actually live streamed, which
> > > keeps no knowledge of previous rows or the row count? It seems utterly
> > > wasteful to me that I should need to load millions of rows into a Vector
> > > (try an ArrayList next time, for minor speedups) when all I want to do is
> > > stream the results, not ever needing to know the total number of rows ahead
> > > of time nor any previous (or subsequent) rows to the "current one" from
> > > ResultSet.next().
> > >
> > > I can't imagine I'm the first person to be using JDBC to access tables with
> > > tens of millions of rows to attempt to access significant portions of them
> > > at a time.
> > >
> > > Thanks,
> > >
> > > Doug
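Putting Dave's advice together in JDBC terms, the manual-cursor pattern he describes (autocommit off, DECLARE, repeated FETCH, then end the transaction) looks roughly like the sketch below. This is a minimal illustration under stated assumptions, not the driver's eventual automatic behavior: the class name `CursorStream`, the cursor name `cursor1`, the batch size, and the `fetchSql` helper are all placeholders, and the per-row processing is left as a comment.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorStream {

    /** Builds the FETCH statement for one batch (split out so it is easy to test). */
    static String fetchSql(String cursorName, int batchSize) {
        return "FETCH " + batchSize + " FROM " + cursorName;
    }

    /**
     * Streams a large query through a server-side cursor, batchSize rows at a
     * time, so the client never holds the whole result set in memory at once.
     */
    public static void streamRows(Connection conn, String query, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // cursors only live inside a transaction
        try (Statement st = conn.createStatement()) {
            st.execute("DECLARE cursor1 CURSOR FOR " + query);
            while (true) {
                int rows = 0;
                try (ResultSet rs = st.executeQuery(fetchSql("cursor1", batchSize))) {
                    while (rs.next()) {
                        rows++;
                        // process the current row here, e.g. rs.getString(1)
                    }
                }
                if (rows < batchSize) {
                    break; // short (or empty) batch: cursor is exhausted
                }
            }
            st.execute("CLOSE cursor1");
        } finally {
            conn.commit(); // ends the read-only transaction (Dave's "end;")
            conn.setAutoCommit(oldAutoCommit); // restore before returning to the pool
        }
    }
}
```

Checking the batch size after each FETCH, rather than fetching until an empty result, saves one round trip when the last batch happens to be short; either loop condition is correct.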