Re: LC_COLLATE=es_MX in PgSQL 7.3.2 - Mailing list pgsql-general

From Octavio Alvarez
Subject Re: LC_COLLATE=es_MX in PgSQL 7.3.2
Msg-id 1702.63.84.67.3.1055459431.squirrel@doogie.ods.org
In response to LC_COLLATE=es_MX in PgSQL 7.3.2  ("Octavio Alvarez" <alvarezp@octavio.ods.org>)
Responses Re: LC_COLLATE=es_MX in PgSQL 7.3.2  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: LC_COLLATE=es_MX in PgSQL 7.3.2  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-general
Tom Lane said:
>> I'm using PGSQL 7.3.2 under Redhat Linux 8.0. The database was
>> initialized
>> with --lc-collate=es_MX.
>
> How about --lc-ctype?  I think that accent handling would be driven by
> LC_CTYPE not LC_COLLATE.

Maybe it's not the accents after all. I ran the following tests without
accents.

Okay. Now, I tried several combinations, including --locale=es_MX and
--lc-collate=es_MX --lc-ctype=es_MX, and got the same result.

I would like to point out something: (still PG 7.3.2)

I tried the following with --locale=es_MX, with --locale=en_US, and with
--locale=en_US.UTF-8:

alvarezp=# select * from t order by p asc, m asc;
   p   |   m
-------+-------
 octav | alvar
 OCTAV | ALVAA
 OCTAV | ALVAZ
 octia | alvra
 OCTIa | ALVAa
 OCTIb | ALVZa
 OCTIb | ALVZa
 octic | alvra
 OCTIc | ALVAa
 octvi | alvra
 OCTVI | ALVAa
 OCTVI | ALVZa
(12 rows)

No accents here. I would have expected:
   p   |   m
-------+-------
 OCTAV | ALVAA
 octav | alvar
 OCTAV | ALVAZ
 OCTIa | ALVAa
 octia | alvra
 OCTIb | ALVZa
 OCTIb | ALVZa
 OCTIc | ALVAa
 octic | alvra
 OCTVI | ALVAa
 octvi | alvra
 OCTVI | ALVZa
(12 rows)
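
For reference, the ordering I expected is what you get when both columns
are compared with case ignored. A minimal Python sketch, assuming the
twelve rows of t shown in the query output (ROWS and fold_key are names
made up for illustration); in SQL the rough equivalent would be
ORDER BY lower(p), lower(m):

```python
# Rows of table t, in the order the query returned them.
ROWS = [
    ("octav", "alvar"), ("OCTAV", "ALVAA"), ("OCTAV", "ALVAZ"),
    ("octia", "alvra"), ("OCTIa", "ALVAa"), ("OCTIb", "ALVZa"),
    ("OCTIb", "ALVZa"), ("octic", "alvra"), ("OCTIc", "ALVAa"),
    ("octvi", "alvra"), ("OCTVI", "ALVAa"), ("OCTVI", "ALVZa"),
]


def fold_key(row):
    """Sort key that ignores case in both columns (p first, then m)."""
    p, m = row
    return (p.lower(), m.lower())


expected = sorted(ROWS, key=fold_key)

for p, m in expected:
    print(f" {p} | {m}")
```

With this key, octav and OCTAV count as equal in p, so the m column
decides the order inside that group.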


--locale=C gives:
   p   |   m
-------+-------
 OCTAV | ALVAA
 OCTAV | ALVAZ
 OCTIa | ALVAa
 OCTIb | ALVZa
 OCTIb | ALVZa
 OCTIc | ALVAa
 OCTVI | ALVAa
 OCTVI | ALVZa
 octav | alvar
 octia | alvra
 octic | alvra
 octvi | alvra
(12 rows)

which I think is correct for that locale. Well, whatever.
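
The --locale=C result is consistent with a plain byte-by-byte comparison,
in which every uppercase ASCII letter sorts before every lowercase one.
A small Python sketch, assuming the same twelve rows as in the query
output (ROWS and byte_key are illustrative names, not anything from
PostgreSQL):

```python
# Rows of table t, in the order the es_MX / en_US query returned them.
ROWS = [
    ("octav", "alvar"), ("OCTAV", "ALVAA"), ("OCTAV", "ALVAZ"),
    ("octia", "alvra"), ("OCTIa", "ALVAa"), ("OCTIb", "ALVZa"),
    ("OCTIb", "ALVZa"), ("octic", "alvra"), ("OCTIc", "ALVAa"),
    ("octvi", "alvra"), ("OCTVI", "ALVAa"), ("OCTVI", "ALVZa"),
]


def byte_key(row):
    """Compare both columns by raw byte values, as the C locale does."""
    p, m = row
    return (p.encode("ascii"), m.encode("ascii"))


by_bytes = sorted(ROWS, key=byte_key)

for p, m in by_bytes:
    print(f" {p} | {m}")
```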

> In any case, this is not a Postgres bug unless
> you can show that other programs using the same LC_foo settings behave
> differently.  We punt pretty much all locale-related processing to
> subroutines in libc.

How could I test that? I tried the following. Notice how the "octav"
values are correctly sorted, but I don't know whether sort is actually
separating the fields or treating the whole line as one key.

[alvarezp@pgsql alvarezp]$ sort -t : < o
OCTAV:ALVAA
octav:alvar
OCTAV:ALVAZ
OCTIa:ALVAa
octia:alvra
OCTIb:ALVZa
OCTIb:ALVZa
OCTIc:ALVAa
octic:alvra
OCTVI:ALVAa
octvi:alvra
OCTVI:ALVZa
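
A note on the command above: sort -t only sets the field separator, and
without a -k option the whole line is still the sort key. Another way to
ask libc directly, without going through sort, is to call strcoll()
yourself. A hedged Python sketch follows; locale.strcoll is Python's
wrapper around the C function, libc_compare is a helper name invented
here, and whether es_MX or en_US is installed depends on the machine:

```python
import locale


def libc_compare(a, b, name="C"):
    """Return strcoll(a, b) under locale `name`, then restore the old one.

    Negative means a sorts before b, zero means libc considers them equal.
    Raises locale.Error if the locale is not installed.
    """
    old = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return locale.strcoll(a, b)
    finally:
        locale.setlocale(locale.LC_COLLATE, old)


if __name__ == "__main__":
    for name in ("C", "es_MX", "en_US"):
        try:
            print(name, libc_compare("octav", "OCTAV", name))
        except locale.Error:
            print(name, "is not installed on this machine")
```

If the result for es_MX is nonzero, libc itself does not treat octav and
OCTAV as equal keys in that locale.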

Whatever. Take a look at this one:

[alvarezp@pgsql alvarezp]$ sort -k 1,1 < o
octav alvar
OCTAV ALVAA
OCTAV ALVAZ
octia alvra
OCTIa ALVAa
OCTIb ALVZa
OCTIb ALVZa
octic alvra
OCTIc ALVAa
octvi alvra
OCTVI ALVAa
OCTVI ALVZa

I don't know whether deciding which keys are equal (in this case
octav=OCTAV=OCTAV) should be done by PostgreSQL or by libc. I also don't
know whether I am wrong to assume octav=OCTAV. For alphabetic sorting, the
comparison should be case-insensitive.
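
One possible explanation of the es_MX / en_US output, offered as an
assumption and not as a description of how PostgreSQL or glibc is
actually implemented: the locale compares letters with case ignored
first, and only breaks a tie by case, lowercase first. Under such a
rule octav and OCTAV are never equal in p, so the m column is consulted
only for rows whose p is identical. A toy Python model (two_level and
ROWS are illustrative names; real collation tables are more elaborate):

```python
# Rows of table t, in the order --locale=C returned them.
ROWS = [
    ("OCTAV", "ALVAA"), ("OCTAV", "ALVAZ"), ("OCTIa", "ALVAa"),
    ("OCTIb", "ALVZa"), ("OCTIb", "ALVZa"), ("OCTIc", "ALVAa"),
    ("OCTVI", "ALVAa"), ("OCTVI", "ALVZa"), ("octav", "alvar"),
    ("octia", "alvra"), ("octic", "alvra"), ("octvi", "alvra"),
]


def two_level(s):
    """Toy two-level key: letters with case ignored, then case as tie-break.

    swapcase() turns lowercase into uppercase, which has the smaller code
    point, so an all-lowercase string wins the tie over its uppercase twin.
    """
    return (s.lower(), s.swapcase())


observed = sorted(ROWS, key=lambda r: (two_level(r[0]), two_level(r[1])))

for p, m in observed:
    print(f" {p} | {m}")
```

This toy key reproduces both the query output and the sort -k 1,1 output
for these twelve rows, which suggests the two programs agree with each
other.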

Octavio.
