Re: Yet another problem with ILIKE and UTF-8 - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Yet another problem with ILIKE and UTF-8
Date
Msg-id 17636.1193330108@sss.pgh.pa.us
In response to Re: Yet another problem with ILIKE and UTF-8  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-bugs
Gregory Stark <stark@enterprisedb.com> writes:
> "Gergely Bor" <borg42@gmail.com> writes:
>> Environment B: Debian lenny/sid, kernel version 2.6.20.1, glibc
>> 2.6.1-5, psql 8.2.5, lc_* is hu_HU, all encodings (client, server,
>> DB) are UTF-8.

> I'm not sure this is the right answer but what happens if you initdb a
> database on the Debian box with lc_* set to hu_HU.UTF-8 ?

On my Fedora Core 6 machine, the encoding implied by LANG=hu_HU
seems to be LATIN2, not UTF8.  It's possible that Debian's glibc
does this differently than Fedora's, but not real likely.
So I think Greg has probably identified the problem correctly:
you have a locale-vs-encoding mismatch on the Debian setup.

FWIW, 8.3 will reject this sort of misconfiguration:

$ LANG=hu_HU initdb -E utf8
The files belonging to this database system will be owned by user "tgl".
This user must also own the server process.

The database cluster will be initialized with locale hu_HU.
initdb: encoding mismatch
The encoding you selected (UTF8) and the encoding that the
selected locale uses (LATIN2) do not match.  This would lead to
misbehavior in various character string processing functions.
Rerun initdb and either do not specify an encoding explicitly,
or choose a matching combination.
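
For anyone who wants to test an existing box for the same kind of
mismatch, here is a rough sketch in Python. It is not how initdb
does the check; it only asks the C library which codeset a locale
implies and compares that with the encoding you intend to use. The
function names are made up for illustration, nl_langinfo() is
Unix-only, and codec aliases are resolved by Python's own codec
table, which may not know every name glibc can report (an unknown
name raises LookupError).

```python
import codecs
import locale


def canonical(name):
    """Return Python's canonical codec name, so that aliases such as
    UTF8 / utf-8 or LATIN2 / ISO-8859-2 compare equal."""
    return codecs.lookup(name).name


def encodings_match(requested, locale_codeset):
    """True if the requested encoding and the locale's codeset are the
    same codec under different spellings."""
    return canonical(requested) == canonical(locale_codeset)


def locale_codeset(locale_name):
    """Return the codeset implied by locale_name, or None if that
    locale is not installed on this machine (Unix only)."""
    saved = locale.setlocale(locale.LC_CTYPE)  # remember current setting
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore


if __name__ == "__main__":
    requested = "UTF8"
    for name in ("hu_HU", "hu_HU.UTF-8"):
        codeset = locale_codeset(name)
        if codeset is None:
            print(f"{name}: locale not installed here")
        elif encodings_match(requested, codeset):
            print(f"{name}: codeset {codeset} matches {requested}")
        else:
            print(f"{name}: codeset {codeset} does NOT match {requested}")
```

What it prints depends on which locales are installed and on how
the platform defines them, so treat the result as a hint and not
as a verdict.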

            regards, tom lane
