Re: ps.setCharacterStream() and memory usage - Mailing list pgsql-jdbc

From Vadim Nasardinov
Subject Re: ps.setCharacterStream() and memory usage
Date
Msg-id 200411020946.09434@vadim.nasardinov
Whole thread Raw
In response to ps.setCharacterStream() and memory usage  (Sebastiaan van Erk <sebster@sebster.com>)
Responses Re: ps.setCharacterStream() and memory usage
List pgsql-jdbc
On Friday 29 October 2004 07:56, Sebastiaan van Erk wrote:
> Coming back to another problem I have with the following insert
> statement: (Postgres 7.4, driver build 215, jdbc 3)
>
>    ps.setCharacterStream(1, reader, (int) messageFile.length());
>    ps.executeUpdate();
>
> The reason I do this (using a character stream to load a text file
> to a TEXT field) is that I wish to avoid loading the file into
> memory completely.
[...]
> [...] the file actually gets copied THREE times into memory, causing
> my Java file to already get a java.lang.OutOfMemory error with an
> insert of a 10M text file (Java allocates 20M for this, since it has
> 2 byte chars, and the driver makes (at least) 3 copies (referenced
> at the same time, not allowing one to be GC'ed), making 60M).

I assume you're talking about
org/postgresql/jdbc2/AbstractJdbc2Statement.java which has this piece
of code (slightly reformatted to fit on a 80-column page):


    public void setCharacterStream(int i, Reader x, int length)
        throws SQLException {

        checkClosed();
        if (length < 0)
            throw new PSQLException(GT.tr("Invalid stream length {0}.",
                                          new Integer(length)));

        if (connection.haveMinimumCompatibleVersion("7.2")) {
            char[] l_chars = new char[length];
            int l_charsRead = 0;
            try {
                while(true) {
                    int n = x.read(l_chars,
                                   l_charsRead,
                                   length - l_charsRead);
                    if (n == -1)
                        break;
                    l_charsRead += n;
                    if (l_charsRead == length)
                        break;
                }
            } catch (IOException l_ioe) {
                throw new PSQLException(GT.tr("Provided Reader failed."),
                                        PSQLState.UNEXPECTED_ERROR, l_ioe);
            }
            setString(i, new String(l_chars, 0, l_charsRead));
        } else {

        }
    }


This does appear to be optimized for reading smallish chunks of text.
If the character source behind the Reader is sufficiently large, I
believe we'd be better off building up a StringBuffer rather than a
character array.  Something like this:

    if (connection.haveMinimumCompatibleVersion("7.2")) {
        StringBuffer sb = new StringBuffer(length);
        BufferedReader br = new BufferedReader(x);
        try {
            while(true) {
                String chunk = br.readLine();
                if (chunk == null) { break; }
                else {sb.append(chunk); }
            }
        } catch (IOException l_ioe) {
            throw new PSQLException(GT.tr("Provided Reader failed."),
                                    PSQLState.UNEXPECTED_ERROR, l_ioe);
        }
        setString(i, sb.toString());
    }

The reason this is better is because

  (a) the String(char[],int,int) constructor always defensively copies
      the passed in character array.  So, as you point out, if you
      pass in a 10-million-character array, a new 10-million character
      array will be allocated.

  (b) in contrast, if you construct a String from a StringBuffer, they
      can share the underlying character array:
      http://www.google.com/search?q=Heinz+Kabutz+StringBuffer+Issue+068&btnI



pgsql-jdbc by date:

Previous
From: Kris Jurka
Date:
Subject: Re: JDeveloper
Next
From: Alan Stange
Date:
Subject: executeBatch() issue with new driver?