Thread: reproducible bug in I don't know what component
bug=# select * from example_objects where name = 'Модемы';
 object_id |  name
-----------+--------
         2 | Мебель
         2 | Модемы
(записей: 2)

bug=# select version();
                                                             version
---------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 7.4.2 on i386-redhat-linux-gnu, compiled by GCC i386-redhat-linux-gcc (GCC) 3.3.3 20040216 (Red Hat Linux 3.3.3-2.1)
(1 запись)

Do the following in an installation initdb'd in ru_RU.KOI8-R (it doesn't
happen if you initdb'd with UTF-8). You need to run psql in a locale
that is capable of Russian letters, namely a UTF-8 locale or a KOI8-R
locale. Then:

    CREATE DATABASE bug WITH ENCODING='unicode';
    \c bug
    \i dump.sql
    -- here you have to set client_encoding if you chose ru_RU.KOI8-R as the locale for psql
    -- set client_encoding to koi8r;
    select * from example_objects where name = 'Модемы';

dump.sql is attached; the select statement is included in UTF-8.

Let me know if anything is missing.

--
Markus Bertheau <twanger@bluetwanger.de>
On Friday, 23 July 2004 11:49, Markus Bertheau wrote:
> Do the following in an installation initdb'd in ru_RU.KOI8-R (it doesn't
> happen if you initdb'd with UTF-8). You need to run psql in a locale
> that is capable of Russian letters, namely a UTF-8 locale or a KOI8-R
> locale. Then:
>
> CREATE DATABASE bug WITH ENCODING='unicode';

That's your problem. Your locale doesn't match your encoding. You need to
use a compatible combination.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
On Fri, 23.07.2004 at 14:02, Peter Eisentraut wrote:
> On Friday, 23 July 2004 11:49, Markus Bertheau wrote:
> > Do the following in an installation initdb'd in ru_RU.KOI8-R (it doesn't
> > happen if you initdb'd with UTF-8). You need to run psql in a locale
> > that is capable of Russian letters, namely a UTF-8 locale or a KOI8-R
> > locale. Then:
> >
> > CREATE DATABASE bug WITH ENCODING='unicode';
>
> That's your problem. Your locale doesn't match your encoding. You need to
> use a compatible combination.

What is happening in the server that this is required?

--
Markus Bertheau <twanger@bluetwanger.de>
Markus Bertheau <twanger@bluetwanger.de> writes:
> Do the following in an installation initdb'd in ru_RU.KOI8-R (It doesn't
> happen if you initdb'd with UTF-8).

If this is a bug, it's a bug in the ru_RU.KOI8-R locale definition.
You can prove that the locale considers the strings equal without
Postgres at all:

    [tgl@rh1 tgl]$ cat ru_data
    root
    root
    Мебель
    Модемы
    [tgl@rh1 tgl]$ sort -u ru_data
    root
    Мебель
    Модемы
    [tgl@rh1 tgl]$ LC_ALL=ru_RU.KOI8-R sort -u ru_data
    root
    Мебель
    [tgl@rh1 tgl]$

(The above is on an RHL 8.0 platform.)

regards, tom lane
On Friday, 23 July 2004 15:30, Markus Bertheau wrote:
> > That's your problem. Your locale doesn't match your encoding. You need
> > to use a compatible combination.
>
> What is happening in the server that this is required?

When you ask locale-aware functions to compare strings, convert to
lower-case, or whatever the case may be, these functions expect the strings
to have a certain encoding (after all, they just receive a stream of bytes,
so they cannot check the encoding themselves). So if the function thinks
it's comparing two KOI8-R strings and you are actually passing UTF-8
strings, the results are going to be close to comparing garbage.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
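
The byte-level mismatch described above can be illustrated outside the server. What follows is only a minimal Python sketch, not what PostgreSQL or the C library actually executes: it encodes the two names from the report as UTF-8 and then decodes the very same bytes as KOI8-R, which is what a KOI8-R locale is effectively told to assume. The helper names `misread_as_koi8r` and `letters_only` are made up for this illustration.

```python
def misread_as_koi8r(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were KOI8-R.

    KOI8-R assigns a character to every byte value, so this never fails;
    it just silently yields different characters.
    """
    return text.encode("utf-8").decode("koi8_r")


def letters_only(text: str) -> str:
    """Keep only alphabetic characters, dropping symbols and box-drawing."""
    return "".join(ch for ch in text if ch.isalpha())


if __name__ == "__main__":
    for word in ("Мебель", "Модемы"):
        raw = word.encode("utf-8")
        seen = misread_as_koi8r(word)
        print(word)
        print("  UTF-8 bytes      :", " ".join(f"{b:02x}" for b in raw))
        print("  read as KOI8-R   :", repr(seen))
        print("  letters that stay:", repr(letters_only(seen)))
```

Each six-letter name turns into twelve single-byte characters: the UTF-8 lead bytes 0xD0 and 0xD1 become Cyrillic letters in KOI8-R, while the trailing bytes, which are the only part that differs between the two names, become box-drawing and other symbol characters. If the locale's collation rules give those symbols no weight (an assumption; nothing in this thread confirms it), the two names would compare equal, which would be consistent with the `sort -u` result Tom shows.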