I have tried running Torsten's script, and get the same result. I have
no experience of using non-English locales.
Is there anyone who is used to operating with Unicode and locale
de_DE@utf8 who can look for the reason for this problem? Could someone
confirm whether it is specific to the Debian package (a library problem,
perhaps?) or is universal?
Debian PostgreSQL is configured with --enable-unicode-conversion
--enable-recode --enable-multibyte --enable-locale
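
For anyone picking this up, the byte values involved can be checked
outside the database. The following is only a sketch in Python (it is
not part of the report, and it is not a diagnosis of what PostgreSQL
does internally). It compares a character-aware upper-casing of U+00F6
with a byte-by-byte, ASCII-only one. The byte-wise variant is an assumed
stand-in for a case conversion that runs without a usable locale; all
variable names are mine.

```python
# Test character from the forwarded report: o with diaeresis.
lower = "\u00f6"  # lowercase, U+00F6
upper = "\u00d6"  # uppercase, U+00D6

lower_utf8 = lower.encode("utf-8")  # bytes c3 b6
upper_utf8 = upper.encode("utf-8")  # bytes c3 96

# Character-aware conversion: upper-case the character, then encode it.
char_aware = lower.upper().encode("utf-8")

# Byte-wise conversion: bytes.upper() only maps ASCII a-z, so the two
# non-ASCII bytes of the UTF-8 sequence pass through unchanged.
# ASSUMPTION: this merely imitates a locale-less, per-byte conversion.
byte_wise = lower_utf8.upper()

print("input      :", lower_utf8.hex())
print("expected   :", upper_utf8.hex())
print("char-aware :", char_aware.hex())
print("byte-wise  :", byte_wise.hex())
```

The byte-wise result is identical to the input, which matches the
symptom in the report, but that alone does not show where the fault
lies.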
-----Forwarded Message-----
From: Torsten Hilbrich <email@myrkr.in-berlin.de>
To: submit@bugs.debian.org
Subject: Bug#139389: Unicode problems after update to 7.2
Date: 21 Mar 2002 21:31:00 +0100
Package: postgresql
Version: 7.2-5
Hello,
I recently updated my postgresql from version 7.1.3 to 7.2-5. After
fixing a few small problems raised by the new character encoding
checking (caused by using a text column with latin1 contents in a
unicode database), I completed upgrading my database.
However, I just noticed that the unicode support seems to have
problems in this new version. In the old version I was able to search
case-insensitively using the ~* pattern matching. In 7.2 even the
upper() function does not work correctly on unicode characters. If I
call (correct environment used, see footnote [1]):
~$ psql template1
Welcome to psql, the PostgreSQL interactive terminal.
Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit
template1=# create database test with encoding='UNICODE';
CREATE DATABASE
template1=# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 cd        | postgres | UNICODE
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | postgres | UNICODE
(4 rows)
template1=# \c test
You are now connected to database test.
test=# \encoding
UNICODE
test=# select upper('ö'); -- o with diaeresis
 upper
-------
 ö
(1 row)
In 7.1.3 this call used to return the uppercase Ö (O with diaeresis).
I will append a small perl script (using DBI) which demonstrates this
problem too.
Torsten
Footnotes:
[1] I'm using a "xterm -u8" and LC_ALL=de_DE@utf8 for running psql.
I have similar problems with perl and the DBI/DBD modules.
----------------- test script ------------------------------
#!/usr/bin/perl -w
use strict;
use DBI;

my $dsn = "DBI:Pg:dbname=test;host=localhost";
my ($user, $password) = ('user', 'password'); # must be modified

my $dbh = DBI->connect($dsn, $user, $password,
                       { PrintError => 1,
                         RaiseError => 1,
                         AutoCommit => 1 });

my $sth = $dbh->prepare("select upper(?)");
my $test = "\xc3\xb6"; # lowercase o with diaeresis in UTF-8, U+00F6
$sth->execute($test);
my $result = ($sth->fetchrow_array)[0];
if ($result ne "\xc3\x96") { # uppercase O with diaeresis, U+00D6
    print "Result $result is wrong\n";
}
$sth->finish;
$dbh->disconnect;