Thread: getBinaryStream and OutOfMemoryException
Hi I have a problem when retreiving relative large files from a bytea field in the database using getBinaryStream. ResultSet res.... InputStream is = res.getBinaryStream("largefile_bytea"); causes OutOfMemoryException... setBinaryStream also used to cause this problem but seems to be fixed with the 8.0-release of the jdbc driver ( thanks!!! :-) ). Whenever i use getBinaryStream on a file of size 9mb or larger it runs out of memory. I know i can just raise jvm heap using -Xmx, but it seems to me like the streaming feature doesn't work properly. Looking into the cvs checkout from 22-05-2005, it really seems to me that getBinaryStream is nothing more than a getBytes ...?? Searching the mailing list archive didn't give me the answer, so i hope someone out there is able to provide it. My setup is; - postgresql-8.0-311.jdbc3 driver - postgresql 8.0 - jboss/tomcat using hibernate - running on windows xp with jdk 1.5.0_02-b09 Regards Jesper Thorhauge
Jesper Thorhauge wrote: > Looking into the cvs > checkout from > 22-05-2005, it really seems to me that getBinaryStream is nothing more > than a > getBytes ...?? Correct. The current frontend/backend protocol doesn't really let us do this differently, at least for bytea columns. It could be made more memory-efficient (e.g. by using the binary result format rather than the current text format) but the driver still has to pull the entire column value across the wire at the time the row is returned. Another approach would be to write large column values to disk, but noone's written the code to do that yet; it also has some other problems such as: what if you are running in an environment where you don't have access to the disk? -O
Okay, thanks alot for the quick answer! I was wondering, do you know how much more efficient it could be done using the binary format you are mentioning? I'm especially interested in decreasing the amount of memory used to get large bytea values... /Jesper Oliver Jowett wrote: >Jesper Thorhauge wrote: > > >>Looking into the cvs >>checkout from >>22-05-2005, it really seems to me that getBinaryStream is nothing more >>than a >>getBytes ...?? >> >> > >Correct. The current frontend/backend protocol doesn't really let us do >this differently, at least for bytea columns. It could be made more >memory-efficient (e.g. by using the binary result format rather than the >current text format) but the driver still has to pull the entire column >value across the wire at the time the row is returned. > >Another approach would be to write large column values to disk, but >noone's written the code to do that yet; it also has some other problems >such as: what if you are running in an environment where you don't have >access to the disk? > >-O > >
Jesper Thorhauge wrote: > Okay, thanks alot for the quick answer! I was wondering, do you know how > much more efficient it could be done using the binary format you are > mentioning? I'm especially interested in decreasing the amount of memory > used to get large bytea values... Um, rough estimate would be about a factor of 5 (assume text representation of bytea averages about ~2.5 characters per raw byte, and characters are 16 bits on the Java side) -O
Okay, thats seems like a lot!! What are the priorities about getting this modification into the driver? I have seen some other posts regarding this binaryStream / memory problem, so it seems that i'm not the only one :-) Oliver Jowett wrote: >Jesper Thorhauge wrote: > > >>Okay, thanks alot for the quick answer! I was wondering, do you know how >>much more efficient it could be done using the binary format you are >>mentioning? I'm especially interested in decreasing the amount of memory >>used to get large bytea values... >> >> > >Um, rough estimate would be about a factor of 5 (assume text >representation of bytea averages about ~2.5 characters per raw byte, and >characters are 16 bits on the Java side) > >-O > >---------------------------(end of broadcast)--------------------------- >TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > >
Jesper Thorhauge wrote: > Okay, thats seems like a lot!! What are the priorities about getting > this modification into the driver? I have seen some other posts > regarding this binaryStream / memory problem, so it seems that i'm not > the only one :-) There's no real priority; it'll happen when someone with time and motivation implements it.. It's not trivial to do as the driver doesn't know ahead of time which columns are bytea -- so either it must do an extra network roundtrip to enquire about column types, or be ready to handle binary-format results for *all* types. -O