Thread: java.lang.OutOfMemoryError
I'm querying a table that has nearly 3 million records in it, and when I call executeQuery() I get a java.lang.OutOfMemoryError. My immediate thought was that it was strange that a safe fetch size wasn't used automatically, so I then decided to set the fetch size explicitly by calling the Connection class's setFetchSize() method. Apparently this is not implemented in the pg73jdbc3.jar driver. I am surprised that the default fetch size is so large as to cause an OutOfMemoryError. What is the proper method for using JDBC to query result sets which are potentially large?

Thanks,
Paul
Paul Cullum wrote:
> What is the proper method for using JDBC to query result sets which are potentially large?

I had a similar problem, but I never got an exception; I will bring that up in another thread. I didn't see anyone more knowledgeable respond to this, so I will give it a shot.

From what I have read of the JDBC driver, it pulls the entire result set across the socket, caching it locally in a vector of rows [or something to that effect]. I, too, used a driver that fetched at need [Informix], rather than fetching everything at execution time. It was very nice. :) From what I have seen on this mailing list, some developers may be working on enhancing the driver to allow fetch-at-need [however, I'm not sure whether the backend supports it].

Anyway, I believe the current method for handling this is to use the LIMIT/OFFSET clause in your PG SQL command. See: http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/sql-select.html#SQL-LIMIT With that clause, you can control exactly which records you want for the query.

Problems with this? Possibly. Since every "chunk" you ask for is actually a different query, your results might not be properly contiguous between queries. For example, someone could insert or delete a record that falls in the middle of the data you are looking for. This could cause your next "chunk" to contain a duplicate record from the first chunk (in the case of an insert by someone else), or to skip a record (in the case of a deletion). It may be possible to control this situation by doing all chunk queries in a single transaction, but I'm not sure; perhaps someone more knowledgeable can speak to that.

For those of us converting from other JDBC drivers that DID fetch at need, this is truly troublesome. We may have coded more lax systems that allow users to execute queries that could crash our entire VM under PG-JDBC! [I have a screen that allows the user to input several variables for a search query. It is at times impossible for me to know how large the result set might be.] If you give your users ways to query your system, then with PG you have to be very careful how much power they are given. Sometimes it is very hard to prevent large queries from squeaking by. This makes a fully-read-at-execution JDBC driver a deadly weapon.

Anybody have any ideas about what we can do to speed the development of a fetch-at-need JDBC driver? Can we patch/extend the driver to allow us to set an absolute maximum result set length on a query, and alert us when it would have been exceeded, WITHOUT causing our system to go haywire / crash / get an OOME?

-- Matt Mello
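Matt's LIMIT/OFFSET approach might be sketched as below. This is a hypothetical sketch in modern Java; the table name, column names, and chunk size are invented for illustration. Note that an ORDER BY is needed to make OFFSET chunks deterministic at all:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ChunkedQuery {
    // Append a LIMIT/OFFSET clause to a base query (PostgreSQL syntax).
    static String chunk(String baseQuery, int limit, int offset) {
        return baseQuery + " LIMIT " + limit + " OFFSET " + offset;
    }

    // Hypothetical usage: walk a large table in 1000-row chunks.
    // Matt's caveat applies: each chunk is a separate query, so concurrent
    // inserts/deletes can duplicate or skip rows between chunks.
    static void walkTable(Connection conn) throws SQLException {
        int limit = 1000;
        try (Statement st = conn.createStatement()) {
            for (int offset = 0; ; offset += limit) {
                String sql = chunk("SELECT id, payload FROM big_table ORDER BY id",
                                   limit, offset);
                try (ResultSet rs = st.executeQuery(sql)) {
                    int rows = 0;
                    while (rs.next()) {
                        rows++;
                        // process rs.getInt("id"), rs.getString("payload") ...
                    }
                    if (rows < limit) break; // short chunk means we hit the end
                }
            }
        }
    }
}
```

On Matt's single-transaction speculation: in PostgreSQL, a SERIALIZABLE transaction sees one snapshot for all its queries, which should remove the duplicate/skip problem between chunks, at the cost of holding the transaction open for the whole walk.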
Matt Mello wrote:
> Can we patch/extend the driver to allow us to set an absolute maximum result set length on a query and alert us when it would have been exceeded WITHOUT causing our system to go haywire / crash / get OOME?

Oh, and apparently Statement.setMaxRows(int max) works, in answer to my own question about how to absolutely limit a result set.

-- Matt Mello
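Statement.setMaxRows() caps what the driver retains, but it truncates silently, and Matt also wants an alert. One idiom (a sketch; class and table names are hypothetical) is to set the cap one higher than you are willing to handle and treat an over-full result as "too many":

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class MaxRowsGuard {
    // True when the driver returned more rows than we wanted, which can
    // only happen because we asked setMaxRows for desiredMax + 1.
    static boolean tooMany(int rowsReturned, int desiredMax) {
        return rowsReturned > desiredMax;
    }

    // Run an arbitrary query but never hold more than desiredMax + 1 rows;
    // throws instead of silently truncating an unbounded result set.
    static int countCapped(Connection conn, String sql, int desiredMax)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.setMaxRows(desiredMax + 1); // 0 would mean "no limit" per JDBC
            try (ResultSet rs = st.executeQuery(sql)) {
                int n = 0;
                while (rs.next()) n++;
                if (tooMany(n, desiredMax)) {
                    throw new SQLException(
                        "query would return more than " + desiredMax + " rows");
                }
                return n;
            }
        }
    }
}
```

One caveat: whether this also bounds the driver's transient memory use depends on whether the driver applies the limit while reading rows off the wire or only after buffering them, so it is worth testing against a genuinely large table.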
Also see: http://archives.postgresql.org/pgsql-jdbc/2003-02/msg00042.php

I have not used the above patches yet, but plan to for those queries whose approximate size I do not know in advance.

-- Matt Mello
Paul,

A patch was just applied to cvs head to better deal with this. The new behavior is that by default you get the old behavior, but if you call setFetchSize() it will turn on using cursors to incrementally fetch the result.

thanks,
--Barry

PS. I haven't yet got a new build up on the web site, but should in a few days, so in the meantime, if you want to try this out, you will need to build from the cvs sources.

Paul Cullum wrote:
> I'm querying a table that has nearly 3 million records in it and when I call executeQuery() I get a java.lang.OutOfMemoryError. [...] What is the proper method for using JDBC to query result sets which are potentially large?
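Barry's new behavior might be exercised like the sketch below. The table name is hypothetical, and the small helper encodes the rule stated in this message: cursor-based incremental fetch is off by default and turns on only when a non-zero fetch size is requested. (Later messages in this thread note that the cursor lives inside a transaction, hence the autocommit handling.)

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorFetch {
    // Per this message: default = old (fully buffered) behavior;
    // a non-zero fetch size switches the driver to cursor-based fetching.
    static boolean usesCursor(int fetchSize) {
        return fetchSize > 0;
    }

    // Hypothetical usage against a large table: with a fetch size of 100,
    // only about 100 rows should be held in memory at any moment.
    static void streamRows(Connection conn) throws SQLException {
        conn.setAutoCommit(false); // keep the cursor inside one transaction
        try (Statement st = conn.createStatement()) {
            st.setFetchSize(100);
            try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
                while (rs.next()) {
                    // process the current row
                }
            }
            conn.commit();
        }
    }
}
```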
Hello Barry,

What happens if I call two or more statements with setFetchSize? E.g.:

1. setFetchSize(10) for queries on table a and table b
2. "select * from a" -- fetch 10 rows from a
3. "select * from b" -- fetch 10 rows from b
4. Now I want rows 11-20 from a
5. Then the next 10 rows (11-20) from b

...and so on. In step 2 the 'execute' for the query on table a issues 'BEGIN; DECLARE CURSOR ...'; in step 3 the 'execute' for the query on table b issues 'BEGIN; ...' again. Can I do this without nested transactions (savepoints)? I think that close() on the ResultSet or Statement must issue 'COMMIT' or 'ROLLBACK', too.

Can you set a parameter in the driver URL (true or false) so that cursors are used automatically when it is true?

regards
Haris Peco

On Tuesday 04 February 2003 16:42, Barry Lind wrote:
> A patch was just applied to cvs head to better deal with this. The new behavior is that by default you get the old behavior, but if you call setFetchSize() it will turn on using cursors to incrementally fetch the result.
On Tue, 4 Feb 2003 19:22:06 +0000, snpe wrote:
> What happens if I call two or more statements with setFetchSize? [...]

Hello snpe, hello Barry,

As I saw just now, Nick's patch was committed... thanks a lot. So I have to do the merge work I did in early January a second time... :-( I corrected numerous bugs. I know it was a mega-patch, but as I see now, the users WANT these URL parameters, they WANT the explicit typing, and maybe they also WANT the fix for timestamps < 1000 AD (that was fixed, too). Jeez. Yesterday I checked the patch and found 2 conflicts; as I checked out today, everything's a mess. Sigh.

Patric
Haris,

The current code does a "BEGIN; DECLARE ...". I intend to change this so that it no longer includes the BEGIN, and instead only allows the use of cursors when the driver is running in non-autocommit mode. Since postgres doesn't allow cursors to be used across transactions, a commit should also invalidate the result set. The current behavior of including the BEGINs will cause warning messages, hence the need for the change. And since postgres doesn't yet support nested transactions, there isn't a need for the commit/rollback you suggest.

thanks,
--Barry

snpe wrote:
> What happens if I call two or more statements with setFetchSize? [...] Can I do this without nested transactions (savepoints)?
Hello Barry,

Does it mean I can use two queries in parallel, or not?

regards
Haris Peco

On Wednesday 05 February 2003 01:28, Barry Lind wrote:
> The current code does a "BEGIN; DECLARE ...". I intend to change this to no longer include the BEGIN and instead only allow the use of cursors if the driver is running in non-autocommit mode. [...]
snpe wrote:
> Hello Barry,
> Does it mean I can use two queries in parallel, or not?

Yes, you can use two queries in parallel (just as you have always been able to); however, they must be in the same transaction.

--Barry
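Haris's interleaving scenario, under the behavior Barry describes (both cursors sharing one transaction), might look like this sketch; the table names come from Haris's example and everything else is hypothetical:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ParallelCursors {
    // Pull up to n rows from a result set; returns how many were read.
    static int readBatch(ResultSet rs, int n) throws SQLException {
        int got = 0;
        while (got < n && rs.next()) got++;
        return got;
    }

    static void interleave(Connection conn) throws SQLException {
        conn.setAutoCommit(false);           // both cursors share this transaction
        try (Statement sa = conn.createStatement();
             Statement sb = conn.createStatement()) {
            sa.setFetchSize(10);
            sb.setFetchSize(10);
            ResultSet ra = sa.executeQuery("SELECT * FROM a");
            ResultSet rb = sb.executeQuery("SELECT * FROM b");
            // Steps 2-5 of Haris's example: alternate 10-row batches.
            readBatch(ra, 10);               // rows 1-10 of a
            readBatch(rb, 10);               // rows 1-10 of b
            readBatch(ra, 10);               // rows 11-20 of a
            readBatch(rb, 10);               // rows 11-20 of b
        }
        conn.commit();                        // ends the transaction; per Barry,
                                              // this also invalidates both result sets
    }
}
```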
On Wednesday 05 February 2003 17:22, Barry Lind wrote:
> Yes, you can use two queries in parallel (just as you have always been able to); however, they must be in the same transaction.

But I can't commit or rollback then, can I? If I have to take care of the query and the commit or rollback myself, I can simply use DECLARE CURSOR directly and be sure of what I am doing. I think the only real solution is a cursor that lives outside a transaction. That shouldn't be a problem with MVCC. Oracle has cursors outside of transactions, and OLTP systems with thousands of users. No multi-database project is possible without this feature.

Thanks Barry
Haris Peco
snpe wrote:
> But I can't commit or rollback then, can I? [...] I think the only real solution is a cursor that lives outside a transaction.

The fact that cursors are constrained to a single transaction is not a jdbc issue; that is how postgresql works on the server. If you want that behavior changed, then this is the wrong email list for the request. Better yet, since this is open source code, the best way to get anything changed is to code the changes yourself and submit them back to the community.

That being said, it is certainly within the jdbc spec for cursors not to extend across transactions. That is why the metadata classes have methods the client can use to ask whether the server supports cursors across transactions.

thanks,
--Barry
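The metadata methods Barry refers to can be queried like this minimal sketch; for a driver of this era one would expect both to report false:

```java
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

public class CursorCapabilities {
    // Ask the driver whether open cursors (and thus their ResultSets)
    // survive a commit or a rollback, per the JDBC DatabaseMetaData API.
    static String describe(DatabaseMetaData md) throws SQLException {
        return "cursors survive commit: " + md.supportsOpenCursorsAcrossCommit()
             + ", survive rollback: " + md.supportsOpenCursorsAcrossRollback();
    }
}
```

A portable client would call `connection.getMetaData()` and branch on these answers instead of hard-coding one server's transaction semantics.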
On Wednesday 05 February 2003 19:21, Barry Lind wrote:
> The fact that cursors are constrained to a single transaction is not a jdbc issue; that is how postgresql works on the server. [...]

Hello Barry,

I know it isn't a jdbc issue, and I don't know the postgresql internals well enough to send a patch for this; even if I could, it would mean too many changes in postgresql. In postgresql a transaction begins with a select command (even one that changes nothing in the database), and that is the problem with cursors - I don't know why.

Thanks
Haris Peco