Thread: [Fwd: Bug#139389: Unicode problems after update to 7.2]

From: Oliver Elphick

I have tried running Torsten's script, and get the same result.  I have
no experience of using non-English locales.

Is there anyone used to working with Unicode and the de_DE@utf8 locale
who can look for the cause of this problem?  Could someone confirm
whether it is specific to the Debian package (a library problem,
perhaps?) or universal?

The Debian PostgreSQL package is configured with
--enable-unicode-conversion --enable-recode --enable-multibyte
--enable-locale.

-----Forwarded Message-----

From: Torsten Hilbrich <email@myrkr.in-berlin.de>
To: submit@bugs.debian.org
Subject: Bug#139389: Unicode problems after update to 7.2
Date: 21 Mar 2002 21:31:00 +0100

Package: postgresql
Version: 7.2-5

Hello,

I recently updated my postgresql from version 7.1.3 to 7.2-5.  After
fixing a few small problems caused by the new character encoding
checks (I had a text column with latin1 contents in a unicode
database), I completed the upgrade of my database.

However, I just noticed that the unicode support seems to have
problems in this new version.  In the old version I was able to search
case-insensitively using the ~* pattern matching operator.  In 7.2
even the upper() function no longer works correctly on non-ASCII
unicode characters.  If I call (correct environment used, see
footnote [1]):

~$ psql template1
Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

template1=# create database test with encoding='UNICODE';
CREATE DATABASE
template1=# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 cd        | postgres | UNICODE
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | postgres | UNICODE
(4 rows)

template1=# \c test
You are now connected to database test.
test=# \encoding
UNICODE
test=# select upper('ö'); -- o with diaeresis
 upper
-------
 ö
(1 row)

In 7.1.3 this call used to return the uppercase Ö (O with diaeresis).

I will append a small perl script (using DBI) which demonstrates this
problem too.

        Torsten

Footnotes:
[1]  I'm using a "xterm -u8" and LC_ALL=de_DE@utf8 for running psql.
     I have similar problems with perl and the DBI/DBD modules.

----------------- test script ------------------------------
#!/usr/bin/perl -w

use DBI;

my $dsn = "DBI:Pg:dbname=test;host=localhost";
my ($user, $password) = ('user', 'password'); # must be modified

my $dbh = DBI->connect($dsn, $user, $password,
                       { PrintError => 1,
                         RaiseError => 1,
                         AutoCommit => 1 });

my $sth = $dbh->prepare("select upper(?)");

my $test = "\xc3\xb6"; # lowercase o with diaeresis in utf-8, u+00f6

$sth->execute($test);
my $result = ($sth->fetchrow_array)[0];

if($result ne "\xc3\x96") { # uppercase O with diaeresis, u+00d6
    print "Result $result is wrong\n";
}

$sth->finish;
$dbh->disconnect;
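
As a cross-check on the byte values used in the test script, here is a
small sketch that does not touch PostgreSQL at all.  It is written in
Python rather than Perl, and the helper name utf8_upper is made up for
this illustration.  It relies on Python's built-in Unicode case mapping,
not on the C library locale, so it only confirms that "\xc3\xb6" and
"\xc3\x96" are the UTF-8 encodings of U+00F6 and U+00D6 and that the
second is the uppercase form of the first.  It says nothing about why
upper() behaves differently in 7.2.

```python
# Sanity check of the byte values used in the Perl test script.
# Assumption: Python's own Unicode case mapping is used here, not
# PostgreSQL and not the C library locale.

def utf8_upper(raw: bytes) -> bytes:
    """Decode UTF-8 bytes, uppercase the text, and re-encode as UTF-8."""
    return raw.decode("utf-8").upper().encode("utf-8")

lower = b"\xc3\xb6"      # value the test script sends (U+00F6)
expected = b"\xc3\x96"   # value the test script expects back (U+00D6)

# Both byte strings decode to the code points named in the script comments.
assert lower.decode("utf-8") == "\u00f6"
assert expected.decode("utf-8") == "\u00d6"

# Uppercasing the first yields exactly the second.
assert utf8_upper(lower) == expected
print("byte values in the test script are consistent")
```

If the assertions pass, the expectation in the test script is sound, and
a different result from the database points at the server side or its
locale setup rather than at the script.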