new String(byte[]) performance - Mailing list pgsql-jdbc

From	Teofilis Martisius
Subject	new String(byte[]) performance
Date	September 12, 2002 06:50:46
Msg-id	20020911095735.GA6185@teohome.lzua.lt
List	pgsql-jdbc

Hello,

While reading through the PostgreSQL JDBC driver sources and profiling,
I noticed that the driver calls new String(byte[]) a lot while iterating
over a ResultSet, and that this String constructor takes a lot of time.
I wrote a custom byte[] -> String conversion method for UTF-8 that makes
iterating over a ResultSet at least 2 times faster. I have a patch for
the PostgreSQL JDBC driver, but it is a workaround and I am not sure it
will be accepted. It does speed things up by a noticeable amount.

Hmm, maybe decodeUTF8() should synchronize access to cdata, or maybe
cdata should be allocated on each call, since the static buffer is shared
between threads. The static cdata version was faster.
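
The two alternatives mentioned above can be sketched as follows. This is
only an illustration, not part of the patch: the class name Utf8Buffers,
both method names and the helper decodeInto are hypothetical, and the
decoder handles 1- to 3-byte sequences without validation, like the patch
does. It locks on a separate object rather than on cdata itself, because
the buffer reference is replaced whenever the buffer grows.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: two ways to make a scratch-buffer decoder thread-safe.
final class Utf8Buffers {
    private static final Object LOCK = new Object();
    private static char[] shared = new char[50];   // shared scratch buffer, as in the patch

    // Variant 1: keep the shared buffer, but serialize access to it.
    static String decodeSynchronized(byte[] data, int offset, int length) {
        synchronized (LOCK) {
            if (shared.length < length) {
                shared = new char[length];         // a char count never exceeds the byte count
            }
            int n = decodeInto(shared, data, offset, length);
            return new String(shared, 0, n);
        }
    }

    // Variant 2: allocate a fresh buffer per call; no locking needed.
    static String decodePerCall(byte[] data, int offset, int length) {
        char[] buf = new char[length];
        int n = decodeInto(buf, data, offset, length);
        return new String(buf, 0, n);
    }

    // Decodes length bytes starting at offset; returns the number of chars written.
    // Assumes well-formed input with sequences of at most 3 bytes.
    private static int decodeInto(char[] out, byte[] data, int offset, int length) {
        int i = offset, end = offset + length, j = 0;
        while (i < end) {
            int z = data[i] & 0xFF;
            if (z < 0x80) {                        // 1 byte
                out[j++] = (char) z;
                i += 1;
            } else if (z >= 0xE0) {                // 3 bytes
                out[j++] = (char) (((z & 0x0F) << 12)
                        | ((data[i + 1] & 0x3F) << 6) | (data[i + 2] & 0x3F));
                i += 3;
            } else {                               // 2 bytes
                out[j++] = (char) (((z & 0x1F) << 6) | (data[i + 1] & 0x3F));
                i += 2;
            }
        }
        return j;
    }

    public static void main(String[] args) {
        String expected = "h\u00e9llo \u20ac";
        byte[] bytes = expected.getBytes(StandardCharsets.UTF_8);
        String a = decodeSynchronized(bytes, 0, bytes.length);
        String b = decodePerCall(bytes, 0, bytes.length);
        System.out.println(a.equals(expected) && b.equals(expected)); // prints true
    }
}
```

Which variant is faster depends on the JVM and on lock contention, so it
would have to be measured rather than assumed.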

By the way, what should a JDBC driver do when, for example,
ResultSet.getInt() is called on a VARCHAR field? I would suggest
converting the byte arrays to Strings, or even to more precisely typed
values (Integers, Doubles and so on), in QueryExecutor.execute(). This
should save some memory allocation for receiveTuple, because currently
memory gets allocated several times: once for the byte[], a second time
for the String, and a third time for the Integer or other object in
getObject(). Memory allocation takes a considerable amount of time. But
this stronger typing would remove some of the flexibility to call any
getXXX on a field of any SQL type. And it would probably make the
querying itself (QueryExecutor.execute()) slower, I don't know :/
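
As an illustration of skipping the intermediate String, here is a minimal
sketch that parses an int straight from the received bytes. It assumes
the value arrives as ASCII decimal text; the class name AsciiNumbers and
the method parseInt are hypothetical and do not exist in the driver.
Throwing NumberFormatException for non-numeric input is also just one
possible answer to the getInt()-on-VARCHAR question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: byte[] -> int without allocating a String first.
final class AsciiNumbers {
    // Parses length bytes starting at offset as a signed decimal int.
    static int parseInt(byte[] data, int offset, int length) {
        if (length <= 0) {
            throw new NumberFormatException("empty value");
        }
        int i = offset;
        int end = offset + length;
        boolean negative = false;
        if (data[i] == '-') {
            negative = true;
            i++;
        } else if (data[i] == '+') {
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("sign without digits");
        }
        long value = 0;                            // long, so overflow can be detected
        for (; i < end; i++) {
            int digit = data[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            value = value * 10 + digit;
            if (value > (long) Integer.MAX_VALUE + 1) {
                throw new NumberFormatException("out of int range");
            }
        }
        if (negative) {
            value = -value;
        }
        if (value > Integer.MAX_VALUE) {
            throw new NumberFormatException("out of int range");
        }
        return (int) value;
    }

    public static void main(String[] args) {
        byte[] field = "-1234".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parseInt(field, 0, field.length)); // prints -1234
    }
}
```

The only allocation left on this path is the boxing into an Integer, if
the caller asks for an object through getObject().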

Teofilis Martisius

Anyway, here is the patch to fix string decoding:

diff -r -u ./org/postgresql/core/Encoding.java /usr/src/postgresql-7.2.2fixed/src/interfaces/jdbc/org/postgresql/core/Encoding.java
--- ./org/postgresql/core/Encoding.java    2001-11-20 00:33:37.000000000 +0200
+++ /usr/src/postgresql-7.2.2fixed/src/interfaces/jdbc/org/postgresql/core/Encoding.java    2002-09-11 15:56:10.000000000 +0200
@@ -155,6 +155,9 @@
             }
             else
             {
+                if (encoding.equals("UTF-8")) {
+                    return decodeUTF8(encodedString, offset, length);
+                }
                 return new String(encodedString, offset, length, encoding);
             }
         }
@@ -163,6 +166,43 @@
             throw new PSQLException("postgresql.stream.encoding", e);
         }
     }
+    /**
+     * custom byte[] -> String conversion routine, 3x-10x faster than standard new String(byte[])
+     */
+    static final int pow2_6 = 64;        // 2^6
+    static final int pow2_12 = 4096;    // 2^12
+    static char cdata[] = new char[50];
+
+    public static final String decodeUTF8(byte data[], int offset, int length) {
+        if (cdata.length < length) {
+            cdata = new char[length];
+        }
+        int i = offset, end = offset + length;    // length is a byte count, not an end index
+        int j = 0;
+        int z, y, x, val;
+        while (i < end) {
+            z = data[i] & 0xFF;
+            if (z < 0x80) {
+                cdata[j++] = (char)data[i];
+                i++;
+            } else if (z >= 0xE0) {        // length == 3
+                y = data[i+1] & 0xFF;
+                x = data[i+2] & 0xFF;
+                val = (z-0xE0)*pow2_12 + (y-0x80)*pow2_6 + (x-0x80);
+                cdata[j++] = (char) val;
+                i+= 3;
+            } else {        // length == 2 (maybe add checking for length > 3, throw exception if it is
+                y = data[i+1] & 0xFF;
+                val = (z - 0xC0)* (pow2_6)+(y-0x80);
+                cdata[j++] = (char) val;
+                i+=2;
+            }
+        }
+
+        String s = new String(cdata, 0, j);
+        return s;
+    }
+

     /*
      * Decode an array of bytes into a string.
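
The arithmetic in decodeUTF8() can be checked in isolation. The sketch
below is not part of the patch; the class name Utf8Arithmetic and its
method names are hypothetical. It repeats the patch's subtraction
formulas, shows the equivalent mask-and-shift form, and adds the kind of
lead-byte check that the comment in the 2-byte branch hints at. The two
forms agree only for well-formed sequences.

```java
// Hypothetical sketch: the code point arithmetic used by the patch, on its own.
final class Utf8Arithmetic {
    // Two-byte formula as written in the patch (64 = 2^6).
    static int twoByteBySubtraction(int z, int y) {
        return (z - 0xC0) * 64 + (y - 0x80);
    }

    // Three-byte formula as written in the patch (4096 = 2^12).
    static int threeByteBySubtraction(int z, int y, int x) {
        return (z - 0xE0) * 4096 + (y - 0x80) * 64 + (x - 0x80);
    }

    // The same value using masks and shifts, valid for well-formed input.
    static int threeByteByMask(int z, int y, int x) {
        return ((z & 0x0F) << 12) | ((y & 0x3F) << 6) | (x & 0x3F);
    }

    // Sequence length implied by a lead byte (0..255). Rejects the cases
    // that the patch does not handle instead of decoding them incorrectly.
    static int sequenceLength(int lead) {
        if (lead < 0x80) {
            return 1;
        }
        if (lead < 0xC0) {
            throw new IllegalArgumentException("continuation byte in lead position");
        }
        if (lead < 0xE0) {
            return 2;
        }
        if (lead < 0xF0) {
            return 3;
        }
        throw new IllegalArgumentException("sequence longer than 3 bytes");
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) is encoded as E2 82 AC:
        // (0xE2 - 0xE0) * 4096 + (0x82 - 0x80) * 64 + (0xAC - 0x80) = 8192 + 128 + 44
        int euro = threeByteBySubtraction(0xE2, 0x82, 0xAC);
        System.out.println(Integer.toHexString(euro));                   // prints 20ac
        System.out.println(euro == threeByteByMask(0xE2, 0x82, 0xAC));   // prints true
    }
}
```

A lead byte of 0xF0 or above starts a 4-byte sequence, whose code point
does not fit in a single Java char and needs a surrogate pair; the patch
would treat it as a 3-byte sequence.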
