Thread: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

From
Dave Crooke
Date:
I have followed the instructions below to no avail .... any thoughts?

http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor

This is what happens when I reduce the fetch_size to 50 ... stops after about 950msec and 120 fetches (6k rows) ....

13:59:56,054 [PerfDataMigrator] ERROR com.hyper9.storage.sample.persistence.PersistenceManager:3216 - Unexpected error while migrating sample data: 6000
org.postgresql.util.PSQLException: ERROR: portal "C_14" does not exist
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
    at org.postgresql.core.v3.QueryExecutorImpl.fetch(QueryExecutorImpl.java:1527)
    at org.postgresql.jdbc2.AbstractJdbc2ResultSet.next(AbstractJdbc2ResultSet.java:1843)
    at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:169)
    at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:169)
    at com.hyper9.storage.sample.persistence.PersistenceManager$Migrator.run(PersistenceManager.java:3156)
    at java.lang.Thread.run(Thread.java:619)


Cheers
Dave


On Thu, Apr 15, 2010 at 2:42 PM, Dave Crooke <dcrooke@gmail.com> wrote:
Hey folks

I am trying to do a full table scan on a large table from Java, using a straightforward "select * from foo". I've run into these problems:

1. By default, the PG JDBC driver attempts to suck the entire result set into RAM, resulting in java.lang.OutOfMemoryError ... this is not cool, in fact I consider it a serious bug (even MySQL gets this right ;-) I am only testing with a 9GB result set, but production needs to scale to 200GB or more, so throwing hardware at is is not feasible.

2. I tried using the official taming method, namely java.sql.Statement.setFetchSize(1000) and this makes it blow up entirely with an error I have no context for, as follows (the number C_10 varies, e.g. C_12 last time) ...

org.postgresql.util.PSQLException: ERROR: portal "C_10" does not exist
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
    at org.postgresql.core.v3.QueryExecutorImpl.fetch(QueryExecutorImpl.java:1527)
    at org.postgresql.jdbc2.AbstractJdbc2ResultSet.next(AbstractJdbc2ResultSet.java:1843)

This is definitely a bug :-)


Is there a known workaround for this ... will updating to a newer version of the driver fix this?

Is there a magic incation of JDBC calls that will tame it?

Can I cast the objects to PG specific types and access a hidden API to turn off this behaviour?

If the only workaround is to explicitly create a cursor in PG, is there a good example of how to do this from Java?

Cheers
Dave






Re: Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

From
Scott Carey
Date:
On Apr 15, 2010, at 1:01 PM, Dave Crooke wrote:
> On Thu, Apr 15, 2010 at 2:42 PM, Dave Crooke <dcrooke@gmail.com> wrote:
> Hey folks
>
> I am trying to do a full table scan on a large table from Java, using a straightforward "select * from foo". I've run
intothese problems: 
>
> 1. By default, the PG JDBC driver attempts to suck the entire result set into RAM, resulting in
java.lang.OutOfMemoryError... this is not cool, in fact I consider it a serious bug (even MySQL gets this right ;-) I
amonly testing with a 9GB result set, but production needs to scale to 200GB or more, so throwing hardware at is is not
feasible.
>

For scrolling large result sets you have to do the following to prevent it from loading the whole thing into memory:


Use forward-only, read-only result scrolling and set the fetch size.  Some of these may be the default depending on
whatthe connection pool is doing, but if set otherwise it may cause the whole result set to load into memory.  I
regularlyread several GB result sets with ~10K fetch size batches. 

Something like:
Statement st =  conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY)
st.setFetchSize(FETCH_SIZE);



> 2. I tried using the official taming method, namely java.sql.Statement.setFetchSize(1000) and this makes it blow up
entirelywith an error I have no context for, as follows (the number C_10 varies, e.g. C_12 last time) ...  
>
> org.postgresql.util.PSQLException: ERROR: portal "C_10" does not exist
>     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
>     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
>     at org.postgresql.core.v3.QueryExecutorImpl.fetch(QueryExecutorImpl.java:1527)
>     at org.postgresql.jdbc2.AbstractJdbc2ResultSet.next(AbstractJdbc2ResultSet.java:1843)
>
> This is definitely a bug :-)
>
>

I have no idea what that is.

> Is there a known workaround for this ... will updating to a newer version of the driver fix this?
>
> Is there a magic incation of JDBC calls that will tame it?
>
> Can I cast the objects to PG specific types and access a hidden API to turn off this behaviour?
>
> If the only workaround is to explicitly create a cursor in PG, is there a good example of how to do this from Java?
>
> Cheers
> Dave
>
>
>
>
>
>