Unicode database + JDBC driver performance - Mailing list pgsql-general

From Jan Ploski
Subject Unicode database + JDBC driver performance
Date
Msg-id 13799530.1040481150216.JavaMail.jpl@remotejava
List pgsql-general
Hello,

I have some questions about how PostgreSQL handles Unicode databases
and how they perform. I am using version 7.2.1 and running two benchmarks
against a database set up with LATIN1 encoding and against the same
database set up with UNICODE encoding. The database consists of a single
"test" table:

 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 txt    | text    | not null
Primary key: test_pkey

The client is written in Java, relies on the official JDBC driver,
and runs on the same machine as the database.

Benchmark 1:

Insert 10,000 rows (in 10 transactions, 1000 rows per transaction)
into table "test". Each row contains 674 characters, most of which
are ASCII.

Benchmark 2:

Run "select * from test" 10 times in a loop.
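
For reference, the two benchmarks above could be sketched roughly as follows. This is a minimal sketch, not the client that produced the timings in this post: the class name BenchmarkSketch, the helper makeRowText, the mix of ASCII and non-ASCII characters, and the command-line arguments (JDBC URL, user, password) are all assumptions made for illustration. Only the table name, the column names, and the row and transaction counts come from the description above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class BenchmarkSketch {
    static final int ROW_LENGTH = 674;    // characters per row, as in the post
    static final int TRANSACTIONS = 10;   // Benchmark 1: 10 transactions
    static final int ROWS_PER_TX = 1000;  // Benchmark 1: 1,000 rows each
    static final int SELECT_LOOPS = 10;   // Benchmark 2: 10 repetitions

    // Builds a mostly-ASCII string; every 50th character is a non-ASCII
    // Latin-1 letter. The exact mix is an assumption, not from the post.
    static String makeRowText(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(i % 50 == 49 ? '\u00e4' : (char) ('a' + i % 26));
        }
        return sb.toString();
    }

    // Benchmark 1: insert the rows, committing once per transaction.
    static void insertRows(Connection con) throws SQLException {
        con.setAutoCommit(false);
        String txt = makeRowText(ROW_LENGTH);
        String sql = "INSERT INTO test (id, txt) VALUES (?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            int id = 0;
            for (int tx = 0; tx < TRANSACTIONS; tx++) {
                for (int r = 0; r < ROWS_PER_TX; r++) {
                    ps.setInt(1, id++);
                    ps.setString(2, txt);
                    ps.executeUpdate();
                }
                con.commit();
            }
        }
    }

    // Benchmark 2: read the whole table repeatedly; returns rows fetched.
    static long selectAll(Connection con, int repetitions) throws SQLException {
        long rows = 0;
        try (Statement st = con.createStatement()) {
            for (int i = 0; i < repetitions; i++) {
                try (ResultSet rs = st.executeQuery("SELECT * FROM test")) {
                    while (rs.next()) {
                        rs.getInt(1);
                        rs.getString(2); // forces the driver to decode the text
                        rows++;
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        // args: JDBC URL, user, password (all supplied by the caller)
        try (Connection con = DriverManager.getConnection(args[0], args[1], args[2])) {
            long t0 = System.nanoTime();
            insertRows(con);
            long t1 = System.nanoTime();
            long rows = selectAll(con, SELECT_LOOPS);
            con.commit();
            long t2 = System.nanoTime();
            System.out.println("inserts: " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("selects: " + (t2 - t1) / 1_000_000 + " ms, rows: " + rows);
        }
    }
}
```

Timing inside the program, as this sketch does, leaves out JVM startup, which the 'time' command includes.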


I am measuring the disk space taken by the database in each case
(LATIN1 vs UNICODE) and the time it takes to run the benchmarks.
I don't understand the results:

Disk space change (after inserts and vacuumdb -f):
LATIN1      UNICODE
764K        640K

I would have expected the Unicode database to take more space,
perhaps even twice as much. Apparently it does not (and that's nice).
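
A note on the size expectation: PostgreSQL's UNICODE encoding stores text as UTF-8, in which every ASCII character takes one byte, the same as in LATIN1. Only the non-ASCII Latin-1 characters grow to two bytes. The sketch below compares the encoded sizes on the Java side; the class name EncodedSize and the sample strings are made up for illustration. It shows why mostly-ASCII text does not double in size, but it does not explain why the UNICODE database came out smaller.

```java
import java.nio.charset.StandardCharsets;

final class EncodedSize {
    // Number of bytes the string occupies when encoded as ISO-8859-1 (LATIN1).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "plain ASCII text";     // 16 ASCII characters
        String mixed = "na\u00efve caf\u00e9"; // 10 characters, 2 of them non-ASCII
        System.out.println(latin1Bytes(ascii) + " vs " + utf8Bytes(ascii)); // 16 vs 16
        System.out.println(latin1Bytes(mixed) + " vs " + utf8Bytes(mixed)); // 10 vs 12
    }
}
```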

Avg. Benchmark execution times (obtained with the 'time' command, repeatedly):
Benchmark 1:
LATIN1      UNICODE
11.5s       14.5s

Benchmark 2:
LATIN1      UNICODE
4.7s        8.6s

The Unicode database is slower on INSERTs (by about 26%) and especially
on SELECTs (by about 83%). I am wondering why. Since Java uses Unicode
internally, shouldn't it actually be more efficient to store and retrieve
character data in that format, with no recoding? Maybe it is an issue
with the JDBC driver? Or is handling Unicode inherently much slower on
the backend side?
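
One way to narrow this down: Java strings are sequences of UTF-16 code units, while a UNICODE database holds UTF-8, so the client has to convert in both cases. The sketch below times only that in-process conversion for each charset, which may help separate the client-side recoding cost from whatever the driver and the backend add. The class name RecodingCost, the sample text, and the iteration counts are assumptions for illustration, and a loop like this gives rough numbers at best; it makes no claim about where the time in the benchmarks above actually goes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RecodingCost {
    // Written to on every run so the JIT cannot discard the timed loop.
    static volatile long sink;

    // Encodes the string to bytes and decodes it again with the same charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Returns the elapsed nanoseconds for the given number of round trips.
    static long timeRoundTrips(String s, Charset cs, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += roundTrip(s, cs).length();
        }
        long elapsed = System.nanoTime() - start;
        sink = total;
        return elapsed;
    }

    public static void main(String[] args) {
        // 674 characters, mostly ASCII, similar in shape to the benchmark rows.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 674; i++) {
            sb.append(i % 50 == 49 ? '\u00e4' : (char) ('a' + i % 26));
        }
        String row = sb.toString();

        // Warm up both code paths before measuring.
        timeRoundTrips(row, StandardCharsets.ISO_8859_1, 20_000);
        timeRoundTrips(row, StandardCharsets.UTF_8, 20_000);

        long latin1 = timeRoundTrips(row, StandardCharsets.ISO_8859_1, 100_000);
        long utf8 = timeRoundTrips(row, StandardCharsets.UTF_8, 100_000);
        System.out.println("ISO-8859-1: " + latin1 / 1_000_000 + " ms");
        System.out.println("UTF-8:      " + utf8 / 1_000_000 + " ms");
    }
}
```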

Take care -
JPL

