Re: bytea memory improvement - Mailing list pgsql-jdbc
| From | Luis Vilar Flores |
|---|---|
| Subject | Re: bytea memory improvement |
| Date | |
| Msg-id | 44EB8C82.1000203@evolute.pt Whole thread Raw |
| In response to | bytea memory improvement (Luis Vilar Flores <lflores@evolute.pt>) |
| Responses |
Re: bytea memory improvement
Re: bytea memory improvement |
| List | pgsql-jdbc |
Hello,
This time I believe to have all tests and source needed to have the
patch accepted.
To all that already forgot the first emails, I developed an
modified version of the method toBytes from the
org.postgresql.util.PGbytea class.
The old method uses 3 buffers to translate the data from the nework
to the client, this uses too much memory.
My method only uses 2 buffers, but does one more pass through the
original buffer (to calculate it's final size).
Bellow is a table with times and memory usage of the 2 methods,
using the supplied ByteaTest class:
OLD method:
size: 0.5MB execute+next: 49ms getBytes: 18ms used mem: 74505KB
size: 1.5MB execute+next: 94ms getBytes: 53ms used mem: 48004KB
size: 2.5MB execute+next: 147ms getBytes: 110ms used mem: 23537KB
size: 3.5MB execute+next: 244ms getBytes: 190ms used mem: 24504KB
size: 4.5MB execute+next: 306ms getBytes: 224ms used mem: 31448KB
size: 5.5MB execute+next: 364ms getBytes: 267ms used mem: 38392KB
size: 6.5MB execute+next: 413ms getBytes: 308ms used mem: 45336KB
size: 7.5MB execute+next: 464ms getBytes: 306ms used mem: 52281KB
size: 8.5MB execute+next: 511ms getBytes: 349ms used mem: 59225KB
size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB
size: 13.5MB execute+next: 820ms getBytes: 514ms used mem: 93945KB
size: 14.5MB execute+next: 865ms getBytes: 734ms used mem: 100888KB
size: 15.5MB execute+next: 921ms getBytes: 586ms used mem: 107833KB
size: 16.5MB execute+next: 1003ms getBytes: 619ms used mem: 114777KB
size: 17.5MB execute+next: 1030ms getBytes: 652ms used mem: 121721KB
size: 18.5MB execute+next: 1102ms getBytes: 927ms used mem: 128664KB
size: 19.5MB execute+next: 1166ms getBytes: 723ms used mem: 135609KB
size: 20.5MB execute+next: 1217ms getBytes: 735ms used mem: 142583KB
size: 21.5MB execute+next: 1284ms getBytes: 766ms used mem: 149527KB
size: 22.5MB execute+next: 1437ms getBytes: 801ms used mem: 156471KB
size: 23.5MB execute+next: 1425ms getBytes: 833ms used mem: 163415KB
size: 24.5MB execute+next: 1453ms getBytes: 866ms used mem: 170359KB
size: 25.5MB execute+next: 1766ms getBytes: 902ms used mem: 177303KB
size: 26.5MB execute+next: 2004ms getBytes: 939ms used mem: 184247KB
size: 27.5MB execute+next: 1650ms getBytes: 968ms used mem: 191191KB
size: 28.5MB execute+next: 1757ms getBytes: 796ms used mem: 198105KB
size: 29.5MB execute+next: 1770ms getBytes: 1040ms used mem: 205086KB
size: 30.5MB execute+next: 1820ms getBytes: 1074ms used mem: 212030KB
size: 31.5MB execute+next: 1869ms getBytes: 1109ms used mem: 218974KB
size: 32.5MB execute+next: 1930ms getBytes: 1146ms used mem: 225918KB
size: 33.5MB execute+next: 2183ms getBytes: 1177ms used mem: 232862KB
size: 34.5MB execute+next: 2241ms getBytes: 1221ms used mem: 239806KB
NEW method:
size: 0.5MB execute+next: 50ms getBytes: 19ms used mem: 73137KB
size: 1.5MB execute+next: 90ms getBytes: 50ms used mem: 43760KB
size: 2.5MB execute+next: 149ms getBytes: 97ms used mem: 16136KB
size: 3.5MB execute+next: 237ms getBytes: 113ms used mem: 14170KB
size: 4.5MB execute+next: 302ms getBytes: 174ms used mem: 18127KB
size: 5.5MB execute+next: 357ms getBytes: 234ms used mem: 22110KB
size: 6.5MB execute+next: 602ms getBytes: 232ms used mem: 26095KB
size: 7.5MB execute+next: 477ms getBytes: 265ms used mem: 30079KB
size: 8.5MB execute+next: 532ms getBytes: 296ms used mem: 34063KB
size: 9.5MB execute+next: 590ms getBytes: 385ms used mem: 38046KB
size: 10.5MB execute+next: 648ms getBytes: 357ms used mem: 42031KB
size: 11.5MB execute+next: 695ms getBytes: 391ms used mem: 46015KB
size: 12.5MB execute+next: 765ms getBytes: 423ms used mem: 49999KB
size: 13.5MB execute+next: 825ms getBytes: 542ms used mem: 53982KB
size: 14.5MB execute+next: 874ms getBytes: 491ms used mem: 57967KB
size: 15.5MB execute+next: 931ms getBytes: 521ms used mem: 61951KB
size: 16.5MB execute+next: 992ms getBytes: 551ms used mem: 65935KB
size: 17.5MB execute+next: 1063ms getBytes: 694ms used mem: 69918KB
size: 18.5MB execute+next: 1111ms getBytes: 618ms used mem: 73903KB
size: 19.5MB execute+next: 1168ms getBytes: 649ms used mem: 77887KB
size: 20.5MB execute+next: 1230ms getBytes: 654ms used mem: 81903KB
size: 21.5MB execute+next: 1289ms getBytes: 687ms used mem: 85890KB
size: 22.5MB execute+next: 1345ms getBytes: 737ms used mem: 89875KB
size: 23.5MB execute+next: 1415ms getBytes: 751ms used mem: 93861KB
size: 24.5MB execute+next: 1461ms getBytes: 782ms used mem: 97846KB
size: 25.5MB execute+next: 1521ms getBytes: 817ms used mem: 101833KB
size: 26.5MB execute+next: 1587ms getBytes: 848ms used mem: 105817KB
size: 27.5MB execute+next: 1634ms getBytes: 877ms used mem: 109804KB
size: 28.5MB execute+next: 1692ms getBytes: 931ms used mem: 113789KB
size: 29.5MB execute+next: 1748ms getBytes: 944ms used mem: 117775KB
size: 30.5MB execute+next: 1820ms getBytes: 972ms used mem: 121760KB
size: 31.5MB execute+next: 1869ms getBytes: 1005ms used mem: 125747KB
size: 32.5MB execute+next: 1915ms getBytes: 1038ms used mem: 129731KB
size: 33.5MB execute+next: 1983ms getBytes: 1088ms used mem: 133718KB
size: 34.5MB execute+next: 2055ms getBytes: 1103ms used mem: 137703KB
As you can see the execution time remained almost the same (small
gain on the new version), but memory usage is drastically improved.
These times were obtained in a Celeron M 1.6GHz laptop with 1GB RAM,
running Fedora Core 5, Java 1.5.0_08 and Postgresql 8.1.4.
In attach I supply the modified PGbytea.java, the patch versus
8.1-407 source (the 8.2dev-503 is the same), and the test program
ByteaTest.java.
The test program also validates the correctness of the result
through CRC32.
Hope to hear some feedback soon, hope I didn't forget anything ...
Luis Flores
/*
* ByteaTest.java
*
* Created on August 22, 2006, 8:09 PM
*
* To change this template, choose Tools | Template Manager
* and open the template in the editor.
*/
package com.evolute.postgresql.test;
import java.sql.*;
import java.util.zip.*;
/**
*
* @author lvflores
*/
public class ByteaTest
{
private static String url = "jdbc:postgresql://localhost/test_bytea";
private static String user = "postgres";
private static String password = "";
private static final int RUNS = 35;
private final CRC32 crcCalc = new CRC32();
private Connection connection = null;
private long crc32[] = new long[ RUNS ];
/** Creates a new instance of ByteaTest */
public ByteaTest( String pgUrl, String pgUser, String pgPass )
throws Exception
{
Class.forName( "org.postgresql.Driver" );
connection = DriverManager.getConnection( pgUrl, pgUser, pgPass );
createTable();
test();
deleteTable();
connection.close();
}
private void createTable()
throws Exception
{
Statement stm = connection.createStatement();
try
{
stm.executeUpdate( "DROP TABLE test_bytea" );
}
catch( Exception ex )
{
}
stm.executeUpdate( "CREATE TABLE test_bytea ("
+ "id INT NOT NULL, "
+ "data BYTEA )" );
stm.close();
}
private void test()
throws Exception
{
// insert
for( int i = 0; i < RUNS; ++i )
{
PreparedStatement pstm = connection.prepareStatement( "INSERT INTO test_bytea ( id, data ) VALUES ( ?, ? )" );
pstm.setInt( 1, i );
byte data[] = generateBytes( i );
// long l = System.currentTimeMillis();
pstm.setBytes( 2, data );
// long l1 = System.currentTimeMillis();
pstm.execute();
// long l2 = System.currentTimeMillis();
long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
// System.out.println( "size: " + i + ".5MB setBytes: " + ( l1 -l )
// + "ms execute: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" );
pstm.close();
System.gc();
}
// select and check
for( int i = 0; i < RUNS; ++i )
{
PreparedStatement pstm = connection.prepareStatement( "SELECT data FROM test_bytea WHERE ID = ?" );
pstm.setInt( 1, i );
long l = System.currentTimeMillis();
ResultSet rs = pstm.executeQuery();
rs.next();
long l1 = System.currentTimeMillis();
byte data[] = rs.getBytes( 1 );
long l2 = System.currentTimeMillis();
crcCalc.reset();
crcCalc.update( data );
long crc = crcCalc.getValue();
if( crc != crc32[ i ] )
{
System.out.println( "WRONG DATA on idx " + i );
}
long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
System.out.println( "size: " + i + ".5MB execute+next: " + ( l1 -l )
+ "ms getBytes: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" );
pstm.close();
System.gc();
}
}
private byte[] generateBytes( int idx )
{
byte data[] = new byte[ 1024 * 1024 * idx + 512 * 1024 ];
for( int i = 0; i < data.length; ++i )
{
data[ i ] = ( byte )i;
}
crcCalc.reset();
crcCalc.update( data );
crc32[ idx ] = crcCalc.getValue();
return data;
}
private void deleteTable()
throws Exception
{
Statement stm = connection.createStatement();
stm.executeUpdate( "DROP TABLE test_bytea" );
stm.close();
}
public static void main( String arg[] )
throws Exception
{
if( arg.length != 0 && arg.length != 3 )
{
System.err.println( "Usage: java com.evolute.postgresql.test.ByteaTest <URL> <USER> <PASSWORD>" );
}
else if( arg.length == 3 )
{
url = arg[ 0 ];
user = arg[ 1 ];
password = arg[ 2 ];
}
new ByteaTest( url, user, password );
}
}
/*-------------------------------------------------------------------------
*
* Copyright (c) 2003-2005, PostgreSQL Global Development Group
*
* IDENTIFICATION
* $PostgreSQL: pgjdbc/org/postgresql/util/PGbytea.java,v 1.12 2005/01/11 08:25:49 jurka Exp $
*
*-------------------------------------------------------------------------
*/
package org.postgresql.util;
import java.sql.*;
/**
* Converts to and from the postgresql bytea datatype used by the backend.
*/
public class PGbytea
{
private static final int MAX_3_BUFF_SIZE = 0;
/*
* Converts a PG bytea raw value (i.e. the raw binary representation
* of the bytea data type) into a java byte[]
*/
public static byte[] toBytes(byte[] s) throws SQLException
{
if (s == null)
return null;
final int slength = s.length;
byte[] buf = null;
int correctSize = slength;
if (slength > MAX_3_BUFF_SIZE)
{
// long l = System.currentTimeMillis();
// count / * //
for (int i = 0; i < slength; ++i)
{
byte current = s[i];
if (current == '\\')
{
byte next = s[ ++i ];
if (next == '\\')
{
--correctSize;
}
else
{
correctSize -= 3;
}
}
}
//System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON
SECONDBUFFER" );
buf = new byte[correctSize];
}
else
{
buf = new byte[slength];
}
int bufpos = 0;
int thebyte;
byte nextbyte;
byte secondbyte;
for (int i = 0; i < slength; i++)
{
nextbyte = s[i];
if (nextbyte == (byte)'\\')
{
secondbyte = s[++i];
if (secondbyte == (byte)'\\')
{
//escaped \
buf[bufpos++] = (byte)'\\';
}
else
{
thebyte = (secondbyte - 48) * 64 + (s[++i] - 48) * 8 + (s[++i] - 48);
if (thebyte > 127)
thebyte -= 256;
buf[bufpos++] = (byte)thebyte;
}
}
else
{
buf[bufpos++] = nextbyte;
}
}
if (bufpos == correctSize)
{
//System.out.println( "SKIPPED LAST BUFFER" );
return buf;
}
byte[] l_return = new byte[bufpos];
System.arraycopy(buf, 0, l_return, 0, bufpos);
return l_return;
}
/*
* Converts a java byte[] into a PG bytea string (i.e. the text
* representation of the bytea data type)
*/
public static String toPGString(byte[] p_buf) throws SQLException
{
if (p_buf == null)
return null;
StringBuffer l_strbuf = new StringBuffer(2 * p_buf.length);
for (int i = 0; i < p_buf.length; i++)
{
int l_int = (int)p_buf[i];
if (l_int < 0)
{
l_int = 256 + l_int;
}
//we escape the same non-printable characters as the backend
//we must escape all 8bit characters otherwise when convering
//from java unicode to the db character set we may end up with
//question marks if the character set is SQL_ASCII
if (l_int < 040 || l_int > 0176)
{
//escape charcter with the form \000, but need two \\ because of
//the parser
l_strbuf.append("\\");
l_strbuf.append((char)(((l_int >> 6) & 0x3) + 48));
l_strbuf.append((char)(((l_int >> 3) & 0x7) + 48));
l_strbuf.append((char)((l_int & 0x07) + 48));
}
else if (p_buf[i] == (byte)'\\')
{
//escape the backslash character as \\, but need four \\\\ because
//of the parser
l_strbuf.append("\\\\");
}
else
{
//other characters are left alone
l_strbuf.append((char)p_buf[i]);
}
}
return l_strbuf.toString();
}
}
--- /home/lvflores/Desktop/postgresql-jdbc-8.1-407.src/org/postgresql/util/PGbytea.java 2005-01-11
08:25:49.000000000+0000
+++ PGbytea.java 2006-08-22 23:37:53.000000000 +0100
@@ -16,8 +16,9 @@
*/
public class PGbytea
{
-
- /*
+ private static final int MAX_3_BUFF_SIZE = 0;
+
+ /*
* Converts a PG bytea raw value (i.e. the raw binary representation
* of the bytea data type) into a java byte[]
*/
@@ -25,8 +26,36 @@
{
if (s == null)
return null;
- int slength = s.length;
- byte[] buf = new byte[slength];
+ final int slength = s.length;
+ byte[] buf = null;
+ int correctSize = slength;
+ if (slength > MAX_3_BUFF_SIZE)
+ {
+// long l = System.currentTimeMillis();
+ // count / * //
+ for (int i = 0; i < slength; ++i)
+ {
+ byte current = s[i];
+ if (current == '\\')
+ {
+ byte next = s[ ++i ];
+ if (next == '\\')
+ {
+ --correctSize;
+ }
+ else
+ {
+ correctSize -= 3;
+ }
+ }
+ }
+//System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON
SECONDBUFFER" );
+ buf = new byte[correctSize];
+ }
+ else
+ {
+ buf = new byte[slength];
+ }
int bufpos = 0;
int thebyte;
byte nextbyte;
@@ -55,6 +84,11 @@
buf[bufpos++] = nextbyte;
}
}
+ if (bufpos == correctSize)
+ {
+//System.out.println( "SKIPPED LAST BUFFER" );
+ return buf;
+ }
byte[] l_return = new byte[bufpos];
System.arraycopy(buf, 0, l_return, 0, bufpos);
return l_return;
@@ -103,6 +137,4 @@
}
return l_strbuf.toString();
}
-
-
}
pgsql-jdbc by date: