Thread: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From
Ankit Kumar
Date:
Hi

I am running a Criteria.scroll() on postgresql on a DB containing 2M
records. The memory keeps increasing and finally it generates an
OutOfMemoryException. Please can you advice how to fix this.

*Postgresql DB version:* 8.4
*Postgresql Driver Used:* postgresql-8.4-701.jdbc4.jar

Some forums do mention that "/scroll/" is slow on Postgresql.

Is this a known issue or we need to do something specific to make scroll
work on Postgresql?


*Exception StackTrace*:
java.lang.OutOfMemoryError: Java heap space
 at java.lang.Class.getDeclaredFields0(Native Method)
 at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
 at java.lang.Class.getDeclaredField(Class.java:1880)
 at

java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.<init>(AtomicReferenceFieldUpdater.java:181)
 at
java.util.concurrent.atomic.AtomicReferenceFieldUpdater.newUpdater(AtomicReferenceFieldUpdater.java:65)
 at java.sql.SQLException.<clinit>(SQLException.java:353)
 at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1777)
 at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
 at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:479)
 at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:367)
 at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:271)
 at
org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:93)
 at
org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:186)
 at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
 at org.hibernate.loader.Loader.scroll(Loader.java:2286)
 at
org.hibernate.loader.criteria.CriteriaLoader.scroll(CriteriaLoader.java:89)
 at org.hibernate.impl.SessionImpl.scroll(SessionImpl.java:1533)
 at org.hibernate.impl.CriteriaImpl.scroll(CriteriaImpl.java:297)
 at org.hibernate.impl.CriteriaImpl.scroll(CriteriaImpl.java:291)
 at
com.hi.openname.dao.impl.HibernateKnowledgeItemDao.loadAllCommitted(HibernateKnowledgeItemDao.java:104)
 at
com.hi.openname.dao.impl.HibernateKnowledgeItemDao.loadAllCommitted(HibernateKnowledgeItemDao.java:1)

Regards
Ankit

Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From
Greg Stark
Date:
On Mon, Dec 21, 2009 at 12:54 PM, Ankit Kumar <ankitk@xebia.com> wrote:
> I am running a Criteria.scroll() on postgresql on a DB containing 2M
> records. The memory keeps increasing and finally it generates an
> OutOfMemoryException. Please can you advice how to fix this.
>
> Postgresql DB version: 8.4
> Postgresql Driver Used: postgresql-8.4-701.jdbc4.jar
>
> Some forums do mention that "scroll" is slow on Postgresql.
>
> Is this a known issue or we need to do something specific to make scroll
> work on Postgresql?
>
>
> Exception StackTrace:
> java.lang.OutOfMemoryError: Java heap space
>

This looks like a Java error, not a Postgres one. It looks like
Hibernate is asking for the entire result set back from Postgres
instead of using a scrollable cursor and is trying to hold it all in
RAM. You should ask the Hibernate people if they plan to either spill
to disk or use database-side cursors which certainly do if necessary.

--
greg

Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From
Ankit Kumar
Date:
Hi Greg

Thanks for your response. Hibernate works well when I change the DB to
SQL server but somehow the moment I point to Postgresql it start
generating OutOfMemory. Is there some configuration at DB end to ensure
it starts using the cursors.

See link: http://forum.springsource.org/archive/index.php/t-23170.html

Regards
Ankit

Greg Stark wrote:
> On Mon, Dec 21, 2009 at 12:54 PM, Ankit Kumar <ankitk@xebia.com> wrote:
>
>> I am running a Criteria.scroll() on postgresql on a DB containing 2M
>> records. The memory keeps increasing and finally it generates an
>> OutOfMemoryException. Please can you advice how to fix this.
>>
>> Postgresql DB version: 8.4
>> Postgresql Driver Used: postgresql-8.4-701.jdbc4.jar
>>
>> Some forums do mention that "scroll" is slow on Postgresql.
>>
>> Is this a known issue or we need to do something specific to make scroll
>> work on Postgresql?
>>
>>
>> Exception StackTrace:
>> java.lang.OutOfMemoryError: Java heap space
>>
>>
>
> This looks like a Java error, not a Postgres one. It looks like
> Hibernate is asking for the entire result set back from Postgres
> instead of using a scrollable cursor and is trying to hold it all in
> RAM. You should ask the Hibernate people if they plan to either spill
> to disk or use database-side cursors which certainly do if necessary.
>
>

Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From
Greg Stark
Date:
On Mon, Dec 21, 2009 at 5:04 PM, Ankit Kumar <ankitk@xebia.com> wrote:
> Thanks for your response. Hibernate works well when I change the DB to SQL
> server but somehow the moment I point to Postgresql it start generating
> OutOfMemory. Is there some configuration at DB end to ensure it starts using
> the cursors.

No Postgresql is not generating these errors -- they're Java errors
and Postgresql is not written in Java. You'll get better luck starting
with the top and working down. The Hibernate people -- I don't know
what their mailing list is -- can answer questions about what
Hibernate does differently for MSSQL versus Postgres. If they say it's
doing the same thing then once you're clear on what that is that it's
doing you can speak to the JDBC Postgresql driver people
(pgsql-jdbc@postgresql.org) and ask whether they handle that case
poorly. But if you start with the jdbc people and you can't answer
questions about what Hibernate is doing then they're not going to know
how to help (unless they've seen it before I guess).


--
greg

Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From
Craig Ringer
Date:
On 22/12/2009 1:25 AM, Greg Stark wrote:
> On Mon, Dec 21, 2009 at 5:04 PM, Ankit Kumar<ankitk@xebia.com>  wrote:
>> Thanks for your response. Hibernate works well when I change the DB to SQL
>> server but somehow the moment I point to Postgresql it start generating
>> OutOfMemory. Is there some configuration at DB end to ensure it starts using
>> the cursors.
>
> No Postgresql is not generating these errors -- they're Java errors
> and Postgresql is not written in Java.

However, PostgreSQL's JDBC driver *is*.

OP: I suggest following this up on the pgsql-jdbc list, where you're
likely to get people who work with these tools regularly and may be
better equipped to advise you.

> The Hibernate people -- I don't know
> what their mailing list is

They don't have one. There are forums at http://forums.hibernate.com/
but don't expect much. It's not a particularly helpful or friendly
community - lots of people asking questions, few people helping out or
answering them. I blame painful to use, crappy web forums.

You could always try http://stackoverflow.com/ .

> -- can answer questions about what
> Hibernate does differently for MSSQL versus Postgres.

In general it tries to behave almost exactly the same between different
DBs. Java folks (as a broad group) seem to be big believers in "the
database is a commodity product, they should all look and behave exactly
the same"*. If they can avoid admitting to a difference between two
databases, they will.

There are a few differences in SQL dialect handling, but overall
behaviour is VERY similar. It expects the JDBC driver to do the work of
making the database look like the standard JDBC interface except for SQL
dialect quirks. This usually even works.

Hibernate, like Pg, expects the session (transaction) to be aborted and
re-tried from the start if things go wrong, and is generally a good
conceptual fit for how Pg does things. So it generally gets away with
this attitude.

Pg's JDBC driver isn't exactly fully to spec, though, so there are
issues where Hibernate gets different behaviour than the spec requires
and chokes on it. That's not a *bug* in the Pg JDBC driver per se as it
doesn't claim to be fully JDBC3/4 compliant, but it's problematic
nonetheless.



* Different features and trade-off choices for different needs, you say?
Never! Let's code to the lowest common denominator and do all that
complicated "join" stuff in Java by fetching the raw tables then
filtering and processing them client-side. Hmm, actually, I'm sure I
have my nice, standard file open/seek/close APIs close to hand, I should
use throw out all this unnecessarily complicated "relational database"
stuff that I have to fight so hard to get to store my data in the
dumbest way possible - I'll just build an adapter in a java midlayer to
do what I need.

(sigh). Thankfully Hibernate isn't this dumb, but lots of ORMs try very
hard to be. Even with Hiberante, attempts to actually (gasp) use a
database's features are often met with "that's not portable between
databases, so you shouldn't want to do it" or dark mutterings about
"legacy", which is the Java word for "not completely universal (yet),"
"innovative and interesting" or "a solution to a problem that wasn't
written in Java".

Bitter? Me? No, really?

> If they say it's
> doing the same thing then once you're clear on what that is that it's
> doing you can speak to the JDBC Postgresql driver people
> (pgsql-jdbc@postgresql.org) and ask whether they handle that case
> poorly. But if you start with the jdbc people and you can't answer
> questions about what Hibernate is doing then they're not going to know
> how to help (unless they've seen it before I guess).

In this case, before posting to the pgsql-jdbc list you should turn on
'log_statement = all' in the Pg server config, and enable debug-level
logging of the whole 'org.hibernate' tree in your slf4j (via log4j or
whatever) config.

Then re-test. Record what SQL is actually being sent to the server (as
seen in the server logs) and what Hibernate is doing as per its logs.
Upload those logs to a suitable pastebin site or gzip them and post them
somewhere people can get to them - do *NOT* attach them to a message to
the mailing list.

Finally, make up a COMPLETELY SELF CONTAINED TEST CASE - with ".sql"
database dump or creation script, plus complete and compilable
stand-alone Java code to demonstrate the problem. If you can't hand out
your data, write something that generates dummy data that'll reproduce
the problem.

Read the logs, take note of any bits that look informative, and show
them as excerpts in your message.

*then* post on the pgsql-jdbc list and see if anyone can offer advice.

--
Craig Ringer