Postgresql JDBC UTF8 Conversion Throughput - Mailing list pgsql-jdbc

From	Paul Lindner
Subject	Postgresql JDBC UTF8 Conversion Throughput
Date	June 2, 2008 06:03:56
Msg-id	20080602085737.GA29477@inuus.com
Responses	Re: Postgresql JDBC UTF8 Conversion Throughput
List	pgsql-jdbc

Hi,

On a heavily trafficked web site we found hundreds of threads stuck
looking up character set names.  This was traced back to the
encodeUTF8() method of the org.postgresql.core.Utils class.

It turns out that using more than two character sets in your Java
application causes very poor throughput because of synchronization
overhead in the charset name lookup.  I wrote about this here:

  http://paul.vox.com/library/post/the-mysteries-of-java-character-set-performance.html

In a web application you can easily find yourself in this situation:
  * ISO-8859-1 is often the default character set
  * UTF-8 is used for DBs and more
  * Your web container might request 'utf-8' or other aliased
    character sets while processing web requests.
  * Web browsers sometimes request the strangest encodings.

In Java 1.6 there's an easy way to fix this charset lookup problem.
Just create a static Charset for UTF-8 and pass that to getBytes(...)
instead of the string constant "UTF-8".

   private static final Charset UTF8_CHARSET = Charset.forName("UTF-8");
   ...
   return str.getBytes(UTF8_CHARSET);
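
As a self-contained illustration of the fragment above, here is a minimal
sketch.  The class name Utf8Encoder is hypothetical and chosen only for this
example; it is not the driver's actual code.  The point is that the charset
is resolved by name once, at class initialization, so each later call skips
the name lookup.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of org.postgresql.core.Utils.
final class Utf8Encoder {
    // Resolved once when the class is initialized.
    private static final Charset UTF8_CHARSET = Charset.forName("UTF-8");

    private Utf8Encoder() {}

    static byte[] encodeUTF8(String str) {
        // String.getBytes(Charset) requires Java 1.6 or later.
        return str.getBytes(UTF8_CHARSET);
    }
}
```

A caller would then write Utf8Encoder.encodeUTF8(value) wherever it
previously called value.getBytes("UTF-8").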

For backwards compatibility with Java 1.4 you can use the attached
patch instead.  It uses java.nio classes to convert the string to
UTF-8 bytes.
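
The patch itself is not reproduced in this message, so the following is
only a rough sketch of what a java.nio-based conversion could look like
using APIs available since Java 1.4.  The class name NioUtf8Encoder is
hypothetical, and the real patch may well be structured differently (for
example, by managing a CharsetEncoder directly).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not the code from the attached patch.
final class NioUtf8Encoder {
    // Resolved once, so encoding never repeats the lookup by name.
    private static final Charset UTF8_CHARSET = Charset.forName("UTF-8");

    private NioUtf8Encoder() {}

    static byte[] encodeUTF8(String str) {
        // Charset.encode(CharBuffer) exists since Java 1.4.
        ByteBuffer buf = UTF8_CHARSET.encode(CharBuffer.wrap(str));
        // Copy only the encoded bytes; the backing array may be larger.
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }
}
```

Compared with getBytes(Charset), this costs one extra copy out of the
ByteBuffer, but it still avoids the by-name lookup on every call.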

You may want to consider applying this patch.  If not, at least
this message will be in the archives.

Comments/Suggestions welcome...

--
Paul Lindner        ||||| | | | |  |  |  |   |   |
lindner@inuus.com
