Re: Case Conversion Fix for MB Chars - Mailing list pgsql-patches

From Volkan YAZICI
Subject Re: Case Conversion Fix for MB Chars
Date
Msg-id 7104a7370511280649p72f4f302p406a57ce105b0365@mail.gmail.com
In response to Re: Case Conversion Fix for MB Chars  (Volkan YAZICI <volkan.yazici@gmail.com>)
Responses Re: Case Conversion Fix for MB Chars  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-patches
On 11/27/05, Volkan YAZICI <volkan.yazici@gmail.com> wrote:
> Tests made on an i686 with a
> 2.6.12.5 kernel. Here's a short list of cases I tried with both latin5
> and unicode charsets:
> - lower/upper functions with Turkish characters.
> - ILIKE matches with both lower and upper case Turkish characters.
> (The above tests succeeded for non-Turkish characters too.)

I reread the paragraph above and realized it isn't very useful as
written. Here's a revised version:

Tests were run on Debian GNU/Linux 3.1 (stable) after patching
src/backend/utils/adt/like.c (r1.62) and
src/backend/utils/adt/oracle_compat.c (r1.64). Related software
versions:
  - gcc-3.3 [3.3.5-13]
  - libc6-dev [2.3.2.ds1-22]
  - locales [2.3.2.ds1-22]

Test cases tried against the patched CVS HEAD:

[For Latin5]
$ usr/bin/initdb -D var/data
$ LANG="tr_TR.ISO-8859-9" usr/bin/postmaster -D var/data
$ usr/bin/createdb -E latin5 test_latin5
$ usr/bin/psql test_latin5
Welcome to psql 8.2devel, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

test_latin5=# SHOW client_encoding;
 client_encoding
-----------------
 LATIN5
(1 row)

test_latin5=# SELECT upper('abcdefgğhıijklmnoöprsştuüvyz qwx 0123456789');
                   upper
-------------------------------------------
 ABCDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ QWX 0123456789
(1 row)

test_latin5=# SELECT
test_latin5-# lower('ABCDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ QWX 0123456789');
                    lower
---------------------------------------------
 abcdefgğhıijklmnoöprsştuüvyz qwx 0123456789
(1 row)

test_latin5=# BEGIN;
BEGIN
test_latin5=# CREATE TEMP TABLE t (v varchar);
CREATE TABLE
test_latin5=# COPY t FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> ı123
>> I123
>> i123
>> İ123
>> \.
test_latin5=# SELECT v FROM t;
  v
------
 ı123
 I123
 i123
 İ123
(4 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'ı%';
  v
------
 ı123
 I123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'I%';
  v
------
 ı123
 I123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'i%';
  v
------
 i123
 İ123
(2 rows)

test_latin5=# SELECT v FROM t WHERE v ILIKE 'İ%';
  v
------
 i123
 İ123
(2 rows)

test_latin5=# ROLLBACK;
ROLLBACK
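
For readers who want to check the expected results outside the database,
here is a minimal Python sketch of the Turkish case rules the session
above exercises: "ı" pairs with "I", and "i" pairs with "İ". The helper
names (tr_lower, tr_upper, ilike_prefix) are hypothetical and exist only
for this illustration; they are not part of the patch or of PostgreSQL,
and the sketch models only the prefix patterns ('x%') used in the test.

```python
# Hypothetical helpers for illustration only; not taken from the patch.
# Turkish has two distinct letter pairs: dotless (ı / I) and dotted (i / İ).
TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})
TR_UPPER = str.maketrans({"ı": "I", "i": "İ"})


def tr_lower(s):
    """Lowercase s using the Turkish pairing for I and İ."""
    # Map the two special letters first, then let str.lower() handle the rest.
    return s.translate(TR_LOWER).lower()


def tr_upper(s):
    """Uppercase s using the Turkish pairing for ı and i."""
    return s.translate(TR_UPPER).upper()


def ilike_prefix(rows, prefix):
    """Model of `v ILIKE 'prefix%'`: case-insensitive prefix match."""
    folded = tr_lower(prefix)
    return [v for v in rows if tr_lower(v).startswith(folded)]


rows = ["ı123", "I123", "i123", "İ123"]

if __name__ == "__main__":
    print(tr_upper("abcdefgğhıijklmnoöprsştuüvyz qwx 0123456789"))
    print(tr_lower("ABCDEFGĞHIİJKLMNOÖPRSŞTUÜVYZ QWX 0123456789"))
    for p in ["ı", "I", "i", "İ"]:
        print(p, ilike_prefix(rows, p))
```

The translation tables are applied before the generic conversion because
Python's built-in str.upper() and str.lower() are not locale-aware and
would otherwise pair "i" with "I".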

[For UNICODE]
The same steps as above, with LANG="tr_TR.UTF-8" and both the database
and client encoding set to UNICODE.
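
As background on why the two encodings are tested separately, the short
Python sketch below shows that the Turkish-specific letters occupy one
byte each in ISO-8859-9 (Latin5) but two bytes each in UTF-8. The helper
name byte_lengths is hypothetical, and the sketch says nothing about how
the patched backend code handles these bytes internally.

```python
# Hypothetical helper for illustration only.
def byte_lengths(ch):
    """Return (latin5_length, utf8_length) in bytes for one character."""
    return len(ch.encode("iso8859_9")), len(ch.encode("utf-8"))


TURKISH_LETTERS = "ğĞıİşŞ"

if __name__ == "__main__":
    for ch in TURKISH_LETTERS:
        latin5_len, utf8_len = byte_lengths(ch)
        print(ch, "latin5 bytes:", latin5_len, "utf-8 bytes:", utf8_len)
```

A routine that converts case one byte at a time can therefore only work
for the single-byte encoding, which is presumably why both setups are
worth exercising.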

Hope these tests help.


Regards.
