Re: Unicode database + JDBC driver performance - Mailing list pgsql-general

From	Jan Ploski
Subject	Re: Unicode database + JDBC driver performance
Date	January 1, 2003 14:07:25
Msg-id	3939074.1041448037948.JavaMail.jpl@remotejava
In response to	Unicode database + JDBC driver performance (Jan Ploski <jpljpl@gmx.de>)
List	pgsql-general

Hello,

Here is the UTF8Encoder class I mentioned on pgsql-general.

It should be placed in org/postgresql/core. You will also need to
patch Encoding.java so that it uses this class:

    if (encoding.equals("UTF-8")) {
        return UTF8Encoder.encode2(s);
    }
    else {
        return s.getBytes(encoding);
    }

UTF8Encoder has two public utility methods, encode1 and encode2.
They differ in how they determine the size of the output buffer.
Performance-wise they seem very similar (encode2 being a bit
slower), but I favor encode2 because it does less memory
allocation and copying.
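
To illustrate what "two approaches to sizing the output buffer" can
look like, here is a minimal sketch. It is not the code from the
attachment: the class name Utf8SizingSketch and the methods
encodeWorstCase and encodeTwoPass are hypothetical, and which of them
(if either) corresponds to encode1 or encode2 is an assumption I cannot
confirm from this message. One variant allocates for the worst case and
trims; the other counts the bytes first and allocates exactly once.

```java
import java.util.Arrays;

// Hypothetical sketch, not the attached UTF8Encoder.
final class Utf8SizingSketch {

    // Variant A: allocate the worst case (3 bytes per char), then copy
    // the used prefix into an exactly sized array. Two allocations.
    static byte[] encodeWorstCase(String s) {
        byte[] buf = new byte[s.length() * 3];
        int used = write(s, buf);
        return Arrays.copyOf(buf, used);
    }

    // Variant B: one extra pass to count bytes, then a single exact
    // allocation and no trailing copy.
    static byte[] encodeTwoPass(String s) {
        byte[] out = new byte[encodedLength(s)];
        write(s, out);
        return out;
    }

    // True if chars i and i+1 form a valid surrogate pair.
    private static boolean isPair(String s, int i) {
        return Character.isHighSurrogate(s.charAt(i))
                && i + 1 < s.length()
                && Character.isLowSurrogate(s.charAt(i + 1));
    }

    // Number of UTF-8 bytes that write() will produce for s.
    static int encodedLength(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                n += 1;
            } else if (c < 0x800) {
                n += 2;
            } else if (isPair(s, i)) {
                n += 4;
                i++; // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                n += 1; // unpaired surrogate is written as '?'
            } else {
                n += 3;
            }
        }
        return n;
    }

    // Encodes s into out and returns the number of bytes written.
    private static int write(String s, byte[] out) {
        int p = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out[p++] = (byte) c;
            } else if (c < 0x800) {
                out[p++] = (byte) (0xC0 | (c >> 6));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            } else if (isPair(s, i)) {
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out[p++] = (byte) (0xF0 | (cp >> 18));
                out[p++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                out[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            } else if (Character.isSurrogate(c)) {
                out[p++] = (byte) '?';
            } else {
                out[p++] = (byte) (0xE0 | (c >> 12));
                out[p++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return p;
    }
}
```

The trade-off in this sketch is CPU against memory: the counting
variant walks the string twice but never allocates more than the final
array, while the worst-case variant walks it once but temporarily holds
up to three times the needed space and pays for a copy.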

Note that I did not use a shared buffer, so that multiple threads
need no synchronization (as I understand it, the class Encoding
must ensure thread safety itself). This may be an unnecessary
concern after all... I don't know.

UTF8Encoder can be used as is or made into a private static inner
class of Encoding.java, whichever you prefer.

UTF8Encoder.main contains some tests that assert it stays compatible
with Java's built-in encoder. It might be nicer to move them into
a JUnit test case; you decide.
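
A compatibility check of this kind could be sketched roughly as below.
This is not the test code from the attachment: EncoderCompatCheck and
firstMismatch are hypothetical names, and the encoder under test is
passed in as a function so the sketch does not depend on UTF8Encoder's
actual signatures. It simply compares each result byte for byte with
what String.getBytes produces for UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical sketch of a compatibility check against the JDK encoder.
final class EncoderCompatCheck {

    // Returns the index of the first sample whose encoding differs from
    // the JDK's UTF-8 output, or -1 if every sample matches.
    static int firstMismatch(Function<String, byte[]> encoder, String... samples) {
        for (int i = 0; i < samples.length; i++) {
            byte[] expected = samples[i].getBytes(StandardCharsets.UTF_8);
            byte[] actual = encoder.apply(samples[i]);
            if (!Arrays.equals(expected, actual)) {
                return i;
            }
        }
        return -1;
    }
}
```

Inside a JUnit test case, the returned index could be asserted to be
-1, with samples covering one-, two-, and three-byte characters as well
as the empty string.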

Take care -
JPL

Attachment

UTF8Encoder.tar.gz
