Re: bytea memory improvement - Mailing list pgsql-jdbc

From Luis Vilar Flores
Subject Re: bytea memory improvement
Date
Msg-id 44EB8C82.1000203@evolute.pt
Whole thread Raw
In response to bytea memory improvement  (Luis Vilar Flores <lflores@evolute.pt>)
Responses Re: bytea memory improvement  (till toenges <tt@kyon.de>)
Re: bytea memory improvement  (Kris Jurka <books@ejurka.com>)
List pgsql-jdbc
    Hello,

    This time I believe to have all tests and source needed to have the
patch accepted.
    To all that already forgot the first emails, I  developed  an
modified version of the  method toBytes from the
org.postgresql.util.PGbytea class.
    The old method uses 3 buffers to translate the data from the nework
to the client, this uses too much memory.
    My method only uses 2 buffers, but does one more pass through the
original buffer (to calculate it's final size).

    Bellow is a table with times and memory usage of the 2 methods,
using the supplied ByteaTest class:
OLD method:
size: 0.5MB execute+next: 49ms getBytes: 18ms used mem: 74505KB
size: 1.5MB execute+next: 94ms getBytes: 53ms used mem: 48004KB
size: 2.5MB execute+next: 147ms getBytes: 110ms used mem: 23537KB
size: 3.5MB execute+next: 244ms getBytes: 190ms used mem: 24504KB
size: 4.5MB execute+next: 306ms getBytes: 224ms used mem: 31448KB
size: 5.5MB execute+next: 364ms getBytes: 267ms used mem: 38392KB
size: 6.5MB execute+next: 413ms getBytes: 308ms used mem: 45336KB
size: 7.5MB execute+next: 464ms getBytes: 306ms used mem: 52281KB
size: 8.5MB execute+next: 511ms getBytes: 349ms used mem: 59225KB
size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB
size: 13.5MB execute+next: 820ms getBytes: 514ms used mem: 93945KB
size: 14.5MB execute+next: 865ms getBytes: 734ms used mem: 100888KB
size: 15.5MB execute+next: 921ms getBytes: 586ms used mem: 107833KB
size: 16.5MB execute+next: 1003ms getBytes: 619ms used mem: 114777KB
size: 17.5MB execute+next: 1030ms getBytes: 652ms used mem: 121721KB
size: 18.5MB execute+next: 1102ms getBytes: 927ms used mem: 128664KB
size: 19.5MB execute+next: 1166ms getBytes: 723ms used mem: 135609KB
size: 20.5MB execute+next: 1217ms getBytes: 735ms used mem: 142583KB
size: 21.5MB execute+next: 1284ms getBytes: 766ms used mem: 149527KB
size: 22.5MB execute+next: 1437ms getBytes: 801ms used mem: 156471KB
size: 23.5MB execute+next: 1425ms getBytes: 833ms used mem: 163415KB
size: 24.5MB execute+next: 1453ms getBytes: 866ms used mem: 170359KB
size: 25.5MB execute+next: 1766ms getBytes: 902ms used mem: 177303KB
size: 26.5MB execute+next: 2004ms getBytes: 939ms used mem: 184247KB
size: 27.5MB execute+next: 1650ms getBytes: 968ms used mem: 191191KB
size: 28.5MB execute+next: 1757ms getBytes: 796ms used mem: 198105KB
size: 29.5MB execute+next: 1770ms getBytes: 1040ms used mem: 205086KB
size: 30.5MB execute+next: 1820ms getBytes: 1074ms used mem: 212030KB
size: 31.5MB execute+next: 1869ms getBytes: 1109ms used mem: 218974KB
size: 32.5MB execute+next: 1930ms getBytes: 1146ms used mem: 225918KB
size: 33.5MB execute+next: 2183ms getBytes: 1177ms used mem: 232862KB
size: 34.5MB execute+next: 2241ms getBytes: 1221ms used mem: 239806KB

NEW method:
size: 0.5MB execute+next: 50ms getBytes: 19ms used mem: 73137KB
size: 1.5MB execute+next: 90ms getBytes: 50ms used mem: 43760KB
size: 2.5MB execute+next: 149ms getBytes: 97ms used mem: 16136KB
size: 3.5MB execute+next: 237ms getBytes: 113ms used mem: 14170KB
size: 4.5MB execute+next: 302ms getBytes: 174ms used mem: 18127KB
size: 5.5MB execute+next: 357ms getBytes: 234ms used mem: 22110KB
size: 6.5MB execute+next: 602ms getBytes: 232ms used mem: 26095KB
size: 7.5MB execute+next: 477ms getBytes: 265ms used mem: 30079KB
size: 8.5MB execute+next: 532ms getBytes: 296ms used mem: 34063KB
size: 9.5MB execute+next: 590ms getBytes: 385ms used mem: 38046KB
size: 10.5MB execute+next: 648ms getBytes: 357ms used mem: 42031KB
size: 11.5MB execute+next: 695ms getBytes: 391ms used mem: 46015KB
size: 12.5MB execute+next: 765ms getBytes: 423ms used mem: 49999KB
size: 13.5MB execute+next: 825ms getBytes: 542ms used mem: 53982KB
size: 14.5MB execute+next: 874ms getBytes: 491ms used mem: 57967KB
size: 15.5MB execute+next: 931ms getBytes: 521ms used mem: 61951KB
size: 16.5MB execute+next: 992ms getBytes: 551ms used mem: 65935KB
size: 17.5MB execute+next: 1063ms getBytes: 694ms used mem: 69918KB
size: 18.5MB execute+next: 1111ms getBytes: 618ms used mem: 73903KB
size: 19.5MB execute+next: 1168ms getBytes: 649ms used mem: 77887KB
size: 20.5MB execute+next: 1230ms getBytes: 654ms used mem: 81903KB
size: 21.5MB execute+next: 1289ms getBytes: 687ms used mem: 85890KB
size: 22.5MB execute+next: 1345ms getBytes: 737ms used mem: 89875KB
size: 23.5MB execute+next: 1415ms getBytes: 751ms used mem: 93861KB
size: 24.5MB execute+next: 1461ms getBytes: 782ms used mem: 97846KB
size: 25.5MB execute+next: 1521ms getBytes: 817ms used mem: 101833KB
size: 26.5MB execute+next: 1587ms getBytes: 848ms used mem: 105817KB
size: 27.5MB execute+next: 1634ms getBytes: 877ms used mem: 109804KB
size: 28.5MB execute+next: 1692ms getBytes: 931ms used mem: 113789KB
size: 29.5MB execute+next: 1748ms getBytes: 944ms used mem: 117775KB
size: 30.5MB execute+next: 1820ms getBytes: 972ms used mem: 121760KB
size: 31.5MB execute+next: 1869ms getBytes: 1005ms used mem: 125747KB
size: 32.5MB execute+next: 1915ms getBytes: 1038ms used mem: 129731KB
size: 33.5MB execute+next: 1983ms getBytes: 1088ms used mem: 133718KB
size: 34.5MB execute+next: 2055ms getBytes: 1103ms used mem: 137703KB

    As you can see the execution time remained almost the same (small
gain on the new version), but memory usage is drastically improved.

    These times were obtained in a Celeron M 1.6GHz laptop with 1GB RAM,
running Fedora Core 5, Java 1.5.0_08 and Postgresql 8.1.4.

    In attach I supply the modified PGbytea.java, the patch versus
8.1-407 source (the 8.2dev-503 is the same), and the test program
ByteaTest.java.

    The test program also validates the correctness of the result
through CRC32.

    Hope to hear some feedback soon, hope I didn't forget anything ...

       Luis Flores

/*
 * ByteaTest.java
 *
 * Created on August 22, 2006, 8:09 PM
 *
 * To change this template, choose Tools | Template Manager
 * and open the template in the editor.
 */

package com.evolute.postgresql.test;

import java.sql.*;
import java.util.zip.*;

/**
 *
 * @author lvflores
 */
public class ByteaTest
{
    private static String url = "jdbc:postgresql://localhost/test_bytea";
    private static String user = "postgres";
    private static String password = "";

    private static final int RUNS = 35;

    private final CRC32 crcCalc = new CRC32();

    private Connection connection = null;

    private long crc32[] = new long[ RUNS ];

    /** Creates a new instance of ByteaTest */
    public ByteaTest( String pgUrl, String pgUser, String pgPass )
    throws Exception
    {
    Class.forName( "org.postgresql.Driver" );
    connection = DriverManager.getConnection( pgUrl, pgUser, pgPass );
    createTable();
    test();
    deleteTable();
    connection.close();
    }


    private void createTable()
    throws Exception
    {
    Statement stm = connection.createStatement();
    try
    {
        stm.executeUpdate( "DROP TABLE test_bytea" );
    }
    catch( Exception ex )
    {
    }
    stm.executeUpdate( "CREATE TABLE test_bytea ("
        + "id INT NOT NULL, "
        + "data BYTEA )" );
    stm.close();
    }

    private void test()
    throws Exception
    {
    // insert
    for( int i = 0; i < RUNS; ++i )
    {
        PreparedStatement pstm = connection.prepareStatement( "INSERT INTO test_bytea ( id, data ) VALUES ( ?, ? )" );
        pstm.setInt( 1, i );
        byte data[] = generateBytes( i );
//        long l = System.currentTimeMillis();
        pstm.setBytes( 2, data );
//        long l1 = System.currentTimeMillis();
        pstm.execute();
//        long l2 = System.currentTimeMillis();
        long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
//        System.out.println( "size: " + i + ".5MB setBytes: " + ( l1 -l )
//            + "ms execute: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" );
        pstm.close();
        System.gc();
    }
    // select and check
    for( int i = 0; i < RUNS; ++i )
    {
        PreparedStatement pstm = connection.prepareStatement( "SELECT data FROM test_bytea WHERE ID = ?" );
        pstm.setInt( 1, i );
        long l = System.currentTimeMillis();
        ResultSet rs = pstm.executeQuery();
        rs.next();
        long l1 = System.currentTimeMillis();
        byte data[] = rs.getBytes( 1 );
        long l2 = System.currentTimeMillis();
        crcCalc.reset();
        crcCalc.update( data );
        long crc = crcCalc.getValue();
        if( crc != crc32[ i ] )
        {
        System.out.println( "WRONG DATA on idx " + i );
        }
        long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println( "size: " + i + ".5MB execute+next: " + ( l1 -l )
            + "ms getBytes: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" );
        pstm.close();
        System.gc();
    }
    }

    private byte[] generateBytes( int idx )
    {

    byte data[] = new byte[ 1024 * 1024 * idx + 512 * 1024 ];

    for( int i = 0; i < data.length; ++i )
    {
        data[ i ] = ( byte )i;
    }

    crcCalc.reset();
    crcCalc.update( data );
    crc32[ idx ] = crcCalc.getValue();
    return data;
    }

    private void deleteTable()
    throws Exception
    {
    Statement stm = connection.createStatement();
    stm.executeUpdate( "DROP TABLE test_bytea" );
    stm.close();
    }

    public static void main( String arg[] )
    throws Exception
    {
    if( arg.length != 0 && arg.length != 3 )
    {
        System.err.println( "Usage: java com.evolute.postgresql.test.ByteaTest <URL> <USER> <PASSWORD>" );
    }
    else if( arg.length == 3 )
    {
        url = arg[ 0 ];
        user = arg[ 1 ];
        password = arg[ 2 ];
    }
    new ByteaTest( url, user, password );
    }
}
/*-------------------------------------------------------------------------
*
* Copyright (c) 2003-2005, PostgreSQL Global Development Group
*
* IDENTIFICATION
*   $PostgreSQL: pgjdbc/org/postgresql/util/PGbytea.java,v 1.12 2005/01/11 08:25:49 jurka Exp $
*
*-------------------------------------------------------------------------
*/
package org.postgresql.util;

import java.sql.*;

/**
 * Converts to and from the postgresql bytea datatype used by the backend.
 */
public class PGbytea
{
    private static final int MAX_3_BUFF_SIZE = 0;

    /*
     * Converts a PG bytea raw value (i.e. the raw binary representation
     * of the bytea data type) into a java byte[]
     */
    public static byte[] toBytes(byte[] s) throws SQLException
    {
        if (s == null)
            return null;
        final int slength = s.length;
        byte[] buf = null;
        int correctSize = slength;
        if (slength > MAX_3_BUFF_SIZE)
        {
//            long l = System.currentTimeMillis();
            // count / * //
            for (int i = 0; i < slength; ++i)
            {
                byte current = s[i];
                if (current == '\\')
                {
                    byte next = s[ ++i ];
                    if (next == '\\')
                    {
                        --correctSize;
                    }
                    else
                    {
                        correctSize -= 3;
                    }
                }
            }
//System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON
SECONDBUFFER" ); 
            buf = new byte[correctSize];
        }
        else
        {
            buf = new byte[slength];
        }
        int bufpos = 0;
        int thebyte;
        byte nextbyte;
        byte secondbyte;
        for (int i = 0; i < slength; i++)
        {
            nextbyte = s[i];
            if (nextbyte == (byte)'\\')
            {
                secondbyte = s[++i];
                if (secondbyte == (byte)'\\')
                {
                    //escaped \
                    buf[bufpos++] = (byte)'\\';
                }
                else
                {
                    thebyte = (secondbyte - 48) * 64 + (s[++i] - 48) * 8 + (s[++i] - 48);
                    if (thebyte > 127)
                        thebyte -= 256;
                    buf[bufpos++] = (byte)thebyte;
                }
            }
            else
            {
                buf[bufpos++] = nextbyte;
            }
        }
        if (bufpos == correctSize)
        {
//System.out.println( "SKIPPED LAST BUFFER" );
            return buf;
        }
        byte[] l_return = new byte[bufpos];
        System.arraycopy(buf, 0, l_return, 0, bufpos);
        return l_return;
    }

    /*
     * Converts a java byte[] into a PG bytea string (i.e. the text
     * representation of the bytea data type)
     */
    public static String toPGString(byte[] p_buf) throws SQLException
    {
        if (p_buf == null)
            return null;
        StringBuffer l_strbuf = new StringBuffer(2 * p_buf.length);
        for (int i = 0; i < p_buf.length; i++)
        {
            int l_int = (int)p_buf[i];
            if (l_int < 0)
            {
                l_int = 256 + l_int;
            }
            //we escape the same non-printable characters as the backend
            //we must escape all 8bit characters otherwise when convering
            //from java unicode to the db character set we may end up with
            //question marks if the character set is SQL_ASCII
            if (l_int < 040 || l_int > 0176)
            {
                //escape charcter with the form \000, but need two \\ because of
                //the parser
                l_strbuf.append("\\");
                l_strbuf.append((char)(((l_int >> 6) & 0x3) + 48));
                l_strbuf.append((char)(((l_int >> 3) & 0x7) + 48));
                l_strbuf.append((char)((l_int & 0x07) + 48));
            }
            else if (p_buf[i] == (byte)'\\')
            {
                //escape the backslash character as \\, but need four \\\\ because
                //of the parser
                l_strbuf.append("\\\\");
            }
            else
            {
                //other characters are left alone
                l_strbuf.append((char)p_buf[i]);
            }
        }
        return l_strbuf.toString();
    }
}
--- /home/lvflores/Desktop/postgresql-jdbc-8.1-407.src/org/postgresql/util/PGbytea.java    2005-01-11
08:25:49.000000000+0000 
+++ PGbytea.java    2006-08-22 23:37:53.000000000 +0100
@@ -16,8 +16,9 @@
  */
 public class PGbytea
 {
-
-    /*
+    private static final int MAX_3_BUFF_SIZE = 0;
+
+    /*
      * Converts a PG bytea raw value (i.e. the raw binary representation
      * of the bytea data type) into a java byte[]
      */
@@ -25,8 +26,36 @@
     {
         if (s == null)
             return null;
-        int slength = s.length;
-        byte[] buf = new byte[slength];
+        final int slength = s.length;
+        byte[] buf = null;
+        int correctSize = slength;
+        if (slength > MAX_3_BUFF_SIZE)
+        {
+//            long l = System.currentTimeMillis();
+            // count / * //
+            for (int i = 0; i < slength; ++i)
+            {
+                byte current = s[i];
+                if (current == '\\')
+                {
+                    byte next = s[ ++i ];
+                    if (next == '\\')
+                    {
+                        --correctSize;
+                    }
+                    else
+                    {
+                        correctSize -= 3;
+                    }
+                }
+            }
+//System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON
SECONDBUFFER" ); 
+            buf = new byte[correctSize];
+        }
+        else
+        {
+            buf = new byte[slength];
+        }
         int bufpos = 0;
         int thebyte;
         byte nextbyte;
@@ -55,6 +84,11 @@
                 buf[bufpos++] = nextbyte;
             }
         }
+        if (bufpos == correctSize)
+        {
+//System.out.println( "SKIPPED LAST BUFFER" );
+            return buf;
+        }
         byte[] l_return = new byte[bufpos];
         System.arraycopy(buf, 0, l_return, 0, bufpos);
         return l_return;
@@ -103,6 +137,4 @@
         }
         return l_strbuf.toString();
     }
-
-
 }

pgsql-jdbc by date:

Previous
From: David Gagnon
Date:
Subject: question regarding upgrade from pg74.215.jdbc3.jar to postgresql-8.1-407.jdbc3.jar. function does not exist appeared ??? Please help :-/
Next
From: Oliver Jowett
Date:
Subject: Re: question regarding upgrade from pg74.215.jdbc3.jar to