Thread: OutOfMemory
Hi!

I'm not sure whether this is a Java or a DB issue. I select a lot of data from the database, more than would fit into memory - and get an OutOfMemoryException.

Well, why is this? This is actually what the idea of an Iterator is about: that the data is progressively fetched, not all put into memory at once, isn't it?

Now I have to manually run the query multiple times using LIMIT/OFFSET (manually adapted to the amount of RAM of the host machine...).

Timo
I'm seeing the following message in my postgres log:

  pq_recvbuf: unexpected EOF on client connection

I haven't been able to identify what is causing it, or how to recreate it. Anyone know where to start looking? I'm running 7.3.3. The application opens and closes its own connections (no connection pooling) and runs under Tomcat 5.0.18. Thanks...

harry
Earlier versions of the PostgreSQL JDBC driver do not stream data from the back end: they fetch everything in one go.

The PostgreSQL 7.4 JDBC driver supports JDBC's setFetchSize() operation, and will use PostgreSQL cursors internally. If you set the fetch size to something >0, it will correctly [*] and transparently stream data on demand. The driver works perfectly with earlier versions of PostgreSQL.

With earlier versions of the driver, you can emulate the behaviour by first doing this:

  stmt.executeUpdate("declare foo cursor for select * from bar");

and then for each batch, as an executeQuery():

  rs = stmt.executeQuery("fetch forward 200 from foo");

and when you're done with the cursor,

  stmt.executeUpdate("close foo");

[*] Beware of transactions with many queries. The JDBC driver never explicitly closes its cursors, and instead relies on the back end to close them when the transaction is committed or aborted. In my testing, the back end consistently runs out of memory in such cases. We are, however, talking about thousands of queries. For all I know this may have been fixed after the 7.4.1 release.

Alexander.

on 2004-03-29 17:42 postgres@nitwit.de wrote:
> Hi!
>
> I'm not sure whether this is a Java or a DB issue. I select a lot of data
> from the database, more than would fit into memory - and get an
> OutOfMemoryException.
> [...]
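Alexander's declare/fetch/close emulation can be rolled into a small loop. This is only a sketch of the pattern he describes - the table name `bar`, the cursor name `foo`, and the batch size are placeholders taken from his example:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorBatch {

    // Pure helper: builds the FETCH statement for one batch.
    static String fetchSql(String cursor, int batchSize) {
        return "fetch forward " + batchSize + " from " + cursor;
    }

    // Stream a query in batches via an explicit cursor (placeholder names as above).
    static void streamAll(Connection conn, int batchSize) throws SQLException {
        conn.setAutoCommit(false); // cursors only live inside a transaction
        Statement stmt = conn.createStatement();
        stmt.executeUpdate("declare foo cursor for select * from bar");
        boolean more = true;
        while (more) {
            more = false;
            ResultSet rs = stmt.executeQuery(fetchSql("foo", batchSize));
            while (rs.next()) {
                more = true;
                // process the current row here
            }
            rs.close();
        }
        stmt.executeUpdate("close foo"); // done with the cursor
        stmt.close();
        conn.commit();
    }
}
```

An empty batch signals the end of the cursor, so at most one extra FETCH round-trip is made.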
On Monday, 29 March 2004 18:30, Alexander Staubo wrote:
> Earlier versions of the PostgreSQL JDBC driver do not stream data from
> the back end: they fetch everything in one go.
>
> The PostgreSQL 7.4 JDBC driver supports JDBC's setFetchSize()
> operation, and will use PostgreSQL cursors internally.
> [...]
> and when you're done with the cursor,
>
> stmt.executeUpdate("close foo");

By chance I'm currently at the same point; unfortunately I don't get it working as expected.

- After calling ResultSet.last(), getRow() still reflects the fetchsize (how do I get the number of records for, e.g., a progress bar?)

- Calling ResultSet.next() at the "last" fetched record does not fetch more results automatically, and Statement.fetchMoreResults() gives me a null pointer - how do I actually get the next fetch?

- According to the documentation, only "FETCH_FORWARD" is supported, which is not always suitable.

Does it have some meaning that it only works if Connection.setAutoCommit(false) is used?

I had a quick look through the sources and found the term "server-prepared-statement" used under certain conditions - what's this for?

I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but this might lead to unexpected side effects if multiple users are changing data - the user gets only a momentary snapshot then; if the order has changed in between, some records will never be seen, others twice, and so on.

Any nice idea to solve this?

(I'm using postgres 7.3 + a recent snapshot of the CVS JDBC driver)

Guido
Guido,

One thing of note is that for the driver to use a cursor-based fetch, it must be inside a transaction, in other words setAutoCommit(false);

Dave

On Tue, 2004-03-30 at 07:16, Guido Fiala wrote:
> By chance I'm currently at the same point; unfortunately I don't get it
> working as expected.
> [...]
> Any nice idea to solve this?
>
> (I'm using postgres 7.3 + a recent snapshot of the CVS JDBC driver)
>
> Guido

--
Dave Cramer
519 939 0336
ICQ # 14675561
Guido Fiala wrote:
> By chance I'm currently at the same point; unfortunately I don't get it
> working as expected.
>
> - After calling ResultSet.last(), getRow() still reflects the fetchsize (how
> do I get the number of records for, e.g., a progress bar?)

That sounds like a bug; last() should take you to the very last record of the resultset independent of the fetchsize. Can you submit a testcase showing this?

(... later ...) Hold on -- doesn't last() require a scrollable resultset, which means you're not using a cursor behind the scenes? I think the driver does not throw an exception if you try to use last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead just gives you the wrong results... not great, but this is not only the fault of the driver :)

> - Calling ResultSet.next() at the "last" fetched record does not fetch more
> results automatically, and Statement.fetchMoreResults() gives me a null
> pointer - how do I actually get the next fetch?

It should be transparent, i.e. last() should take you to the *very end* of the resultset, fetching forward until there are no more results. You shouldn't ever see the boundaries between fetches (other than as a slight delay on next()).

> - According to the documentation, only "FETCH_FORWARD" is supported, which is
> not always suitable.

This is simply because no-one has implemented the logic to support scrollable resultsets backed by a cursor yet. Patches are welcome!

> Does it have some meaning that it only works if
> Connection.setAutoCommit(false) is used?

Yes -- otherwise we'd need to use a cursor WITH HOLD and manage it more carefully. Currently we rely on end-of-transaction closing the cursors, and that scheme doesn't work with autocommit on, so we don't use cursors in that case. This is actually a tradeoff between storing the resultset on the backend and storing it in the Java process -- WITH HOLD is not free; someone has to store the data once the transaction is gone. I think the backend does a better job of paging out results, though.

> I had a quick look through the sources and found the term
> "server-prepared-statement" used under certain conditions - what's this
> for?

That's unrelated to cursor use; this is to do with transforming repeatedly executed queries into a PREPARE/EXECUTE form. It's not on by default.

> I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but
> this might lead to unexpected side effects if multiple users are changing
> data - the user gets only a momentary snapshot then; if the order has changed
> in between, some records will never be seen, others twice, and so on.
>
> Any nice idea to solve this?

Wrap your queries in a transaction if you want transactional isolation. Also, you really want an ORDER BY if you're using LIMIT/OFFSET.

-O
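Oliver's two suggestions - wrapping the pages in one transaction, plus a stable ORDER BY so pages don't shift under you - can be combined like this. A minimal sketch with hypothetical table and sort-column names (`bar`, `id`); note that a truly fixed snapshot across the pages additionally requires the SERIALIZABLE isolation level, since PostgreSQL's default READ COMMITTED takes a fresh snapshot per statement:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PagedQuery {

    // Pure helper: append a stable ORDER BY plus LIMIT/OFFSET to a base query.
    static String pageSql(String baseQuery, String orderCol, int limit, int offset) {
        return baseQuery + " order by " + orderCol + " limit " + limit + " offset " + offset;
    }

    // Read all pages inside one transaction.
    static void readAllPages(Connection conn, int pageSize) throws SQLException {
        conn.setAutoCommit(false); // keep all pages inside a single transaction
        Statement st = conn.createStatement();
        int offset = 0;
        while (true) {
            ResultSet rs = st.executeQuery(pageSql("select * from bar", "id", pageSize, offset));
            int rows = 0;
            while (rs.next()) {
                rows++;
                // process the current row here
            }
            rs.close();
            if (rows < pageSize) break; // short page => no more data
            offset += pageSize;
        }
        st.close();
        conn.commit();
    }
}
```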
Oliver Jowett wrote:
> That sounds like a bug; last() should take you to the very last record
> of the resultset independent of the fetchsize. Can you submit a testcase
> showing this?
>
> (... later ...) Hold on -- doesn't last() require a scrollable
> resultset, which means you're not using a cursor behind the scenes? I
> think the driver does not throw an exception if you try to use
> last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead
> just gives you the wrong results... not great, but this is not only the
> fault of the driver :)

Yeah - at first I simply called setFetchSize(20) after creating a Statement with conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ...). It turned out to make no difference: the driver still fetched all results into memory, which isn't easy to notice unless one uses the debugger, or inserts so much data that the OutOfMemory shows up whenever fetchsize is not working.

So I walked through the source tree and saw that it would only use the fetchsize (issuing "FETCH FORWARD ...") if the ResultSetType is set to TYPE_FORWARD_ONLY. This does indeed fetch only fetchsize rows, but then I was stuck...
so here is my TestCase:

---
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TestFetchSize {
    // to be run with standard java memory settings: (-Xmx64m)
    public static void main(String[] args) {
        try {
            Class.forName("org.postgresql.Driver");
        } catch (ClassNotFoundException e1) {
            e1.printStackTrace();
        }
        try {
            // set-up:
            Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://<server>:5432/<db>", <user>, <password>);
            Statement ct = conn.createStatement();
            try { ct.execute("DROP TABLE tst_fetchsize"); } catch (SQLException x) {}
            ct.execute("CREATE TABLE tst_fetchsize ( id bigserial primary key, a varchar );");
            Statement cf = conn.createStatement(
                ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            ResultSet ins = cf.executeQuery("SELECT * from tst_fetchsize LIMIT 0");
            String data = new String(new byte[3000000]);
            data = data.replace('\0', ' '); // very long and empty String
            System.out.println("string-length=" + data.length());
            final int count = 10;
            for (int i = 0; i < count; i++) {
                ins.moveToInsertRow();
                ins.updateString(2, data); // create very large records!
                ins.insertRow();
            }

            // test forward-only
            System.out.println("test: TYPE_FORWARD_ONLY");
            conn.setAutoCommit(false); // to allow cursors below
            Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            st.setFetchSize(1);
            ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
            rs.last(); // now we should be at "count"
            System.out.println("getRow=" + rs.getRow() + " should be=" + count);
            rs.beforeFirst();
            while (rs.next())
                System.out.println("this is row " + rs.getString(1));
            System.out.println("now backward:");
            while (rs.previous())
                System.out.println("this is row " + rs.getString(1));

            // test scroll*
            System.out.println("test: TYPE_SCROLL_SENSITIVE, it should not give an"
                + " OutOfMemory-Exception if we fetch indeed just one record as above!");
            conn.setAutoCommit(false); // to allow cursors below
            st = conn.createStatement(
                ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
            st.setFetchSize(1);
            rs = st.executeQuery("Select * from tst_fetchsize");
            rs.last(); // now we should be at "count"
            System.out.println("getRow=" + rs.getRow() + " should be=" + count
                + " and is, however, the driver already fetched all rows here!!!");
            rs.beforeFirst();
            while (rs.next())
                System.out.println("this is row " + rs.getString(1));
            System.out.println("now backward:");
            while (rs.previous())
                System.out.println("this is row " + rs.getString(1));

            // cleanup:
            conn.setAutoCommit(true);
            ct.execute("DROP TABLE tst_fetchsize");
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
---

> Yes -- otherwise we'd need to use a cursor WITH HOLD and manage it more
> carefully. Currently we rely on end-of-transaction closing the cursors,
> and that scheme doesn't work with autocommit on, so we don't use cursors
> in that case.

Ok, understood - the database already holds all the result data of a query in memory.
> This is actually a tradeoff between storing the resultset on the backend
> and storing it in the Java process -- WITH HOLD is not free; someone has
> to store the data once the transaction is gone. I think the backend does
> a better job of paging out results, though.

So it is - especially with Blobs, as JDBC needs 5 to 8 times the size for each blob, because:

1. the whole result is stored in a UNICODE StreamBuf (including the bytea part, which is only ASCII), and
2. quoting half the characters of the blob with one (') to three (\) characters is required.

Without using fetchsize, the Java memory is soon exhausted... Wouldn't it be possible to stream the bytea parts directly into byte[] arrays as they come in over the socket, somehow? (The problem here is that we don't know the size of the bytea, so we'd need to use a "ByteBuffer" which can grow...)

>> I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but
>> this might lead to unexpected side effects if multiple users are changing
>> data - the user gets only a momentary snapshot then; if the order has changed
>> in between, some records will never be seen, others twice, and so on.
>>
>> Any nice idea to solve this?
>
> Wrap your queries in a transaction if you want transactional isolation.

Mmm, need to test this - but if that works, why does the driver use CURSORs (which also work only within a transaction) and not LIMIT/OFFSET instead? In that case, neither the frontend nor the backend would have to copy data (into memory) which will eventually never be fully read.

> Also, you really want an ORDER BY if you're using LIMIT/OFFSET.

That was just an example to point out that, between multiple SELECTs, something could change the OFFSET of the next data to be fetched... but snapshots are just snapshots, and more than doing a "READ_COMMITTED" can't be done here.
Guido Fiala wrote:
> Oliver Jowett wrote:
>
>> (... later ...) Hold on -- doesn't last() require a scrollable
>> resultset, which means you're not using a cursor behind the scenes? I
>> think the driver does not throw an exception if you try to use
>> last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead
>> just gives you the wrong results... not great, but this is not only the
>> fault of the driver :)

[...]

> So I walked through the source tree and saw that it would only use the
> fetchsize (issuing "FETCH FORWARD ...") if the ResultSetType is set to
> TYPE_FORWARD_ONLY. This does indeed fetch only fetchsize rows, but then
> I was stuck... so here is my TestCase:

[...]

>     // test forward-only
>     System.out.println("test: TYPE_FORWARD_ONLY");
>     conn.setAutoCommit(false); // to allow cursors below
>     Statement st = conn.createStatement(
>         ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
>     st.setFetchSize(1);
>     ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
>     rs.last(); // now we should be at "count"

Indeed, here's your problem -- you're not allowed to call last() on a TYPE_FORWARD_ONLY resultset. The driver should throw an exception at this point (but currently doesn't). See the ResultSet.last() javadoc for details.

-O
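Since last()/getRow() is off-limits on a TYPE_FORWARD_ONLY resultset, one common workaround for Guido's progress-bar question (not suggested in the thread itself, just a standard pattern) is to ask the backend for the total separately before streaming. A minimal sketch; the subquery alias is required by PostgreSQL:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RowCount {

    // Pure helper: wrap the original query in a count(*) (alias "sub" required by PostgreSQL).
    static String countSql(String baseQuery) {
        return "select count(*) from (" + baseQuery + ") as sub";
    }

    // Run the count in the same transaction as the streaming query, e.g. to size a progress bar.
    static long countRows(Connection conn, String baseQuery) throws SQLException {
        Statement st = conn.createStatement();
        ResultSet rs = st.executeQuery(countSql(baseQuery));
        rs.next();
        long total = rs.getLong(1);
        rs.close();
        st.close();
        return total;
    }
}
```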
On Thursday, 1 April 2004 09:14, Oliver Jowett wrote:
[...]
>>     Statement st = conn.createStatement(
>>         ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
>>     st.setFetchSize(1);
>>     ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
>>     rs.last(); // now we should be at "count"
>
> Indeed, here's your problem -- you're not allowed to call last() on a
> TYPE_FORWARD_ONLY resultset. The driver should throw an exception at
> this point (but currently doesn't). See the ResultSet.last() javadoc for
> details.

Ok - that's one part, but why does the second part of my testcase not work as expected, i.e. why does fetchsize have no effect?
Guido Fiala wrote:
> On Thursday, 1 April 2004 09:14, Oliver Jowett wrote:
> [...]
>> Indeed, here's your problem -- you're not allowed to call last() on a
>> TYPE_FORWARD_ONLY resultset. The driver should throw an exception at
>> this point (but currently doesn't). See the ResultSet.last() javadoc for
>> details.
>
> Ok - that's one part, but why does the second part of my testcase not work as
> expected, i.e. why does fetchsize have no effect?

This is the part where you use TYPE_SCROLL_SENSITIVE + setFetchSize() and see the driver pulling the whole resultset into memory at once?

setFetchSize() is a hint. As no-one has implemented cursor-backed support for scrollable resultsets yet, the driver ignores the hint and fetches the whole result set anyway. Patches to improve this behaviour are, of course, welcome :)

-O
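To summarize the thread's findings in one place: the 7.4-era driver streams via a backend cursor only when all three conditions line up. The predicate below is the editor's own summary of the thread, not actual driver code:

```java
import java.sql.ResultSet;

public class CursorRules {

    // Per this thread, the driver uses a backend cursor only when:
    // autocommit is off, the resultset is forward-only, and the fetch size is > 0.
    static boolean usesCursor(boolean autoCommit, int resultSetType, int fetchSize) {
        return !autoCommit
            && resultSetType == ResultSet.TYPE_FORWARD_ONLY
            && fetchSize > 0;
    }
}
```

If any condition fails (e.g. a TYPE_SCROLL_SENSITIVE resultset, as in Guido's second test), the driver silently falls back to fetching the whole resultset into memory.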