Inserting large BLOBs via JDBC - OutOfMemoryError - Mailing list pgsql-jdbc

From holger.haag@gmx.de
Subject Inserting large BLOBs via JDBC - OutOfMemoryError
Date
Msg-id 17295.1029415705@www55.gmx.net
Whole thread Raw
List pgsql-jdbc
Hi all,

I am facing a problem when writing blobs via jdbc.

Small BLOBs work, but with BLOBs of a certain size my code throws a
java.lang.OutOfMemoryError.
The file I tried has about 2-3 MB.


Environment:
- PostgreSQL 7.2 on i686-pc-linux-gnu, compiled by GCC 2.96
- Mandrake 8.2
- 192 MB RAM
- blob column created as datatype "bytea" (maybe that is the problem)
- code posted below (BlobPostgresTest.java)
- console output posted below
- java command to run program:
    java -verbose:gc -Xmx128m -classpath .:pgjdbc2.jar BlobPostgresTest
- java version as follows from "java -version" command:
    java version "1.4.0_01"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0_01-b03)
    Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)


To me it seems that the implementation of setBinaryStream() could be
improved.
Details follow below.

let me know what you think.
holger



Console output (including GC activities):
=========================================

java -verbose:gc -Xmx128m -classpath .:pgjdbc2.jar BlobPostgresTest

BLOB/PostgreSQL Demo started
[GC 511K->230K(1984K), 0.0163708 secs]
preparing statement
setting binary stream
[GC 402K->273K(1984K), 0.0092325 secs]
[Full GC 273K->273K(1984K), 0.0771294 secs]
[GC 4351K->4058K(4908K), 0.0046329 secs]
[Full GC 4058K->3770K(4908K), 0.0831070 secs]
[GC 4922K->4922K(6860K), 0.0009556 secs]
[Full GC 4922K->4346K(6860K), 0.0890570 secs]
[GC 6650K->6650K(7820K), 0.0011110 secs]
[Full GC 6650K->5496K(7820K), 0.1420860 secs]
[GC 10104K->10104K(14480K), 0.0012827 secs]
[Full GC 10104K->7800K(14480K), 0.1236744 secs]
[GC 17016K->17016K(24084K), 0.0015421 secs]
[Full GC 17016K->12408K(24084K), 0.1688843 secs]
[GC 30840K->30840K(43224K), 0.0020800 secs]
[Full GC 30840K->21624K(43224K), 0.2547274 secs]
[GC 76920K->74616K(81500K), 0.0041685 secs]
[Full GC 74616K->49272K(81500K), 0.5688448 secs]
[GC 67704K->67704K(88332K), 0.0033407 secs]
[Full GC 67704K->58488K(88332K), 0.2558231 secs]
executing update
[GC 95352K->95352K(104844K), 0.0932741 secs]
[Full GC 95352K->40056K(104844K), 0.9644251 secs]
[GC 69245K->69245K(104844K), 0.0036631 secs]
[Full GC 69245K->69245K(104844K), 0.0814962 secs]
[Full GC 69245K->66324K(129728K), 1.1439123 secs]
Exception in thread "main" java.lang.OutOfMemoryError


as you can see the program stops when setBinaryStream() is executed.
the memory allocated is exceeding 128 MB (the whole thing fails with a max.
heap of 192MB as well).

-> my assumption is that setBinaryStream() is allocating a huge amount of
memory

looking into the PostreSQL source code, I find

org.postgresql.jdbc1.AbstractJdbc1Statement#setBinaryStream
org.postgresql.util.PGbytea#toPGString

mainly toPGString seems to be programmed inefficiently in terms of memory
consumption, e.g.

the string buffer allocation

    public static String toPGString(byte[] p_buf) throws SQLException
    {
        if (p_buf == null)
            return null;
        StringBuffer l_strbuf = new StringBuffer();


should contain a senseful inital size


    public static String toPGString(byte[] p_buf) throws SQLException
    {
        if (p_buf == null)
            return null;
        StringBuffer l_strbuf = new StringBuffer(p_buf.length);

or even

        StringBuffer l_strbuf = new StringBuffer(p_buf.length*2);

because of special characters


to avoid a repeated re-allocation of the internal char[] array in the
stringbuffer.



BlobPostgresTest.java
=====================

import java.sql.*;
import java.io.*;

/*

drop table blobdemo;

drop sequence blobdemo_id_seq;

create table blobdemo(
  id serial not null primary key,
  name varchar(50),
  content bytea);

*/

public final class BlobPostgresTest {
    private final static String NAME = "TEST";
    //private final static String FILE_NAME = "/tmp/blob/2mb.xxx";
    private final static String FILE_NAME = "BlobPostgresTest.java";

    public final static void main(String[] args) throws Throwable {
    Connection con = null;
    Statement statement = null;
    PreparedStatement insertStatement = null;
    ResultSet rs = null;
    File file = null;
    FileInputStream fis = null;
    BufferedInputStream bis = null;

    try {
        System.out.println("BLOB/PostgreSQL Demo started");

        Class.forName("org.postgresql.Driver");

        con = DriverManager.getConnection
        ("jdbc:postgresql://localhost/template1",
         "postgres", "");
        con.setAutoCommit(true);

        statement = con.createStatement();

        // cleanup
        statement.executeUpdate("delete from blobdemo");

        // file
        file = new File(FILE_NAME);
        fis = new FileInputStream(file);
        bis = new BufferedInputStream(fis);

        // insert one record
        System.out.println("preparing statement");
        insertStatement = con.prepareStatement
        ("insert into blobdemo ( name, content ) values ( ?, ? )");

        insertStatement.setString(1, NAME);

        System.out.println("setting binary stream");
        insertStatement.setBinaryStream(2, bis, (int)file.length());

        System.out.println("executing update");
        insertStatement.executeUpdate();

        // retrieve
        rs = statement.executeQuery
        ("select id, name, content from blobdemo");

        while(rs.next()) {
        System.out.println("id=" + rs.getObject(1));
        System.out.println("name=" + rs.getObject(2));

        byte[] bytes = rs.getBytes(3);
        String content = new String(bytes);

        //System.out.println(content);
        System.out.println("retrieved " + bytes.length +  " bytes");
        }
    } finally {

        if(rs != null) rs.close();
        if(statement != null) statement.close();
        if(insertStatement != null) insertStatement.close();
        if(con != null) con.close();

        if(fis != null) fis.close();
        if(bis != null) bis.close();
    }

    System.out.println("BLOB/PostgreSQL Demo complete");
    }
}

--
GMX - Die Kommunikationsplattform im Internet.
http://www.gmx.net


pgsql-jdbc by date:

Previous
From: Alejandro Vergara
Date:
Subject: Question about JDBC?
Next
From: Chad
Date:
Subject: Inquiry From Form [pgsql]