Thread: Problem while ordering Turkish chars

Problem while ordering Turkish chars

From
Devrim GUNDUZ
Date:
Hi,

I'm experiencing some problems while ordering Turkish characters.

Here is that I mean:

Let's say we have two records:

Onder <- There are two dots on O. (ASCII CODE: 153)
Ozan

In Turkish alphabet, O (dotless) comes before O (with dots). So, Ozan
should be listed before Onder, but it's listed after Onder; since
PostgreSQL thinks that those letters are the same...

To begin with:
operdb=# SELECT version();                                                version
---------------------------------------------------------------------------------------------------------PostgreSQL
7.3.2on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2  
20020903 (Red Hat Linux 8.0 3.2-7)

(installed from rpm)

Here is an simple example: Lines 1, 10,11,12 have O with dots as the first
letter.
==========SELECT oper_uzun_adi FROM operler ORDER BY oper_uzun_adi;
...
Ömer KORKMAZOner BINBASOnur BAŞOnur TURHANOsman Selçuk SARIOĞLUOsman TOPANOzan GÜLENOzan SARIOzge YAZICIOGLUÖzgür
ÇAKICIÖzgürÖZDEMİRÖzlem BAL 
...
==========

But if I apply a select query and use LIKE, PostgreSQL gives me correct
solutions:

operdb=# SELECT oper_uzun_adi FROM operler WHERE oper_uzun_adi ILIKE 'O%';             oper_uzun_adi
------------------------------------------Osman TOPANOner BINBASOzge YAZICIOGLUOnur BAŞOnur TURHANOzan GÜLENOsman
SelçukSARIOĞLUOzan SARI 
(8 rows)

Now for O with dots:
operdb=# SELECT oper_uzun_adi FROM operler WHERE oper_uzun_adi ILIKE 'Ö%';             oper_uzun_adi
------------------------------------------Özgür ÇAKICIÖzlem BALÖmer KORKMAZÖzgür ÖZDEMİR


So, LIKE understands that they are different chars; but if I simply order
them; I do not get the correct result.

Here is the locale setting:

#       Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

If I change it to tr_TR, nothing changes.

Could anyone help me?

Best regards,
--
Devrim GUNDUZ
devrim@gunduz.org                devrim.gunduz@linux.org.tr         http://www.gunduz.org



Re: Problem while ordering Turkish chars

From
Tom Lane
Date:
Devrim GUNDUZ <devrim@gunduz.org> writes:
> Here is the locale setting:
> LC_MESSAGES = 'en_US'
> LC_MONETARY = 'en_US'
> LC_NUMERIC = 'en_US'
> LC_TIME = 'en_US'

> If I change it to tr_TR, nothing changes. 

LC_COLLATE is what determines sort order, and you can't change it on the
fly; it's determined at initdb time.  You'll need to re-initdb in tr_TR
locale.  (See pg_controldata printout.)
        regards, tom lane