Thread: locale & glibc 2.2.2

locale & glibc 2.2.2

From
Pimenov Yuri
Date:
Greeting!

it seems like locale support with glibc 2.2.2 is completely broken...
i got huge differences when performing tests in src/test/locale/koi8-r
i am running PG7.1 from cvs.

PS. locale itself isn't broken for sure

Re: locale & glibc 2.2.2

From
Lamar Owen
Date:
Pimenov Yuri wrote:
> Greeting!
> it seems like locale support with glibc 2.2.2 is completely broken...
> i got huge differences when performing tests in src/test/locale/koi8-r
> i am running PG7.1 from cvs.

> PS. locale itself isn't broken for sure

Locale collation and other issues were substantially changed for glibc
2.2.2. It is a glibc 2.2.2 issue, not a PostgreSQL one -- as the same
code in PostgreSQL (strcoll) is being used. We will see how things pan
out with locale on glibc 2.2.2, as, like I said, there are big changes,
apparently.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
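
For readers following along, the strcoll point can be sketched in Python, whose locale.strcoll wraps the C library routine. This is only an illustration, not PostgreSQL code: sort_with_strcoll is a hypothetical helper, and "en_US.UTF-8" is an assumed locale name that may not be installed, in which case the helper returns None.

```python
import functools
import locale

words = ["Ad", "ae", "ac"]

def sort_with_strcoll(items, collate_locale):
    """Sort items with the C library's strcoll under the given LC_COLLATE.

    Returns None if the locale is not installed on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, collate_locale)
    except locale.Error:
        return None
    try:
        # The comparison code is identical for every locale; only the
        # LC_COLLATE setting (and the C library behind it) changes.
        return sorted(items, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)

c_order = sort_with_strcoll(words, "C")
print("C:", c_order)

# "en_US.UTF-8" is an assumed locale name; it may be missing here.
print("en_US.UTF-8:", sort_with_strcoll(words, "en_US.UTF-8"))
```

In the C locale the uppercase "Ad" sorts ahead of both lowercase words. What any other locale prints depends on the installed C library and its locale data, which is the point made above.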

Re: locale & glibc 2.2.2

From
teg@redhat.com (Trond Eivind Glomsrød)
Date:
Lamar Owen <lamar.owen@wgcr.org> writes:

> Pimenov Yuri wrote:
> > Greeting!
> > it seems like locale support with glibc 2.2.2 is completely broken...
> > i got huge differences when performing tests in src/test/locale/koi8-r
> > i am running PG7.1 from cvs.
>
> > PS. locale itself isn't broken for sure
>
> Locale collation and other issues were substantially changed for glibc
> 2.2.2. It is a glibc 2.2.2 issue, not a PostgreSQL one -- as the same
> code in PostgreSQL (strcoll) is being used.

What exactly is claimed to be broken? If this is the case sensitivity
issue, that would count as a postgresql bug.


--
Trond Eivind Glomsrød
Red Hat, Inc.

Re: locale & glibc 2.2.2

From
Lamar Owen
Date:
Trond Eivind Glomsrød wrote:
> > > it seems like locale support with glibc 2.2.2 is completely broken...
> What exactly is claimed to be broken? If this is the case sensitivity
> issue, that would count as a postgresql bug.

Whatever the sentence above your question means.  We'll have to wait on
the reporter to elaborate what was meant by '..the locale
support....broken...'.

Which case-sensitivity issue?  The one about table and column names?  Or
a different one? (sitting confused in Pisgah Forest)
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

Re: locale & glibc 2.2.2

From
teg@redhat.com (Trond Eivind Glomsrød)
Date:
Lamar Owen <lamar.owen@wgcr.org> writes:

> Trond Eivind Glomsrød wrote:
> > > Which case-sensitivity issue?  The one about table and column names?  Or
> > > a different one? (sitting confused in Pisgah Forest)
>
> > I remember there were some issues about someone claiming glibc was broken
> > (with LANG set to anything but C/POSIX, because it will sort this way
>
> Oh, ok.  That goes as far back as glibc 2.1, and first reared its head
> with Red Hat 6.1.  While I've not tested it, I had heard that glibc
> 2.2.2 'fixed' this

Of course not, it's not a bug - if this is a problem, it's a bug in
Postgresql:

[teg@halden teg]$ cat foo2.txt
Ad
ae
ac
[teg@halden teg]$ sort foo2.txt
ac
Ad
ae
[teg@halden teg]$

> The initscript now explicitly sets locale to C/POSIX
> for the initdb and the postmaster startup for the RPM, as the locale
> setting can cause other problems with indexes and the LIKE
> optimization

I built it into our trees with a release number < 1 until I've
confirmed that this doesn't break other languages. Sorting in the "C"
order isn't acceptable for non-English languages as the order is wrong.

--
Trond Eivind Glomsrød
Red Hat, Inc.

Re: locale & glibc 2.2.2

From
Lamar Owen
Date:
Trond Eivind Glomsrød wrote:
> > Which case-sensitivity issue?  The one about table and column names?  Or
> > a different one? (sitting confused in Pisgah Forest)

> I remember there were some issues about someone claiming glibc was broken
> (with LANG set to anything but C/POSIX, because it will sort this way

Oh, ok.  That goes as far back as glibc 2.1, and first reared its head
with Red Hat 6.1.  While I've not tested it, I had heard that glibc
2.2.2 'fixed' this. The initscript now explicitly sets locale to C/POSIX
for the initdb and the postmaster startup for the RPM, as the locale
setting can cause other problems with indexes and the LIKE optimization
(although I think Tom made it where the backend would do the Right Thing
and not optimize in the presence of a non-POSIX locale, but I don't
remember the details).  Time for me to revisit the issue -- after I
finish getting my office and servers moved this week and next.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
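
The LIKE optimization mentioned here turns a prefix pattern such as LIKE 'a %' into an index range scan. A rough Python sketch of why that relies on plain byte-order comparison follows. Everything in it is an assumption for illustration: prefix_range is a hypothetical helper, not the backend's code, and toy_key is a made-up collation (letters and digits first, ignoring case and punctuation) standing in for a non-C locale, not glibc's actual rules.

```python
def prefix_range(prefix):
    """Bounds [lo, hi) containing every string that starts with prefix
    when strings compare code point by code point (byte order for ASCII)."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

def toy_key(s):
    """Hypothetical non-C collation: compare letters and digits first,
    ignoring case and punctuation, then fall back to the raw string."""
    return ("".join(ch for ch in s if ch.isalnum()).lower(), s)

rows = ["a b", "a z", "ab", "b"]
prefix = "a "
lo, hi = prefix_range(prefix)

# What LIKE 'a %' should return.
matches = [r for r in rows if r.startswith(prefix)]

# Range scan under byte order: finds exactly the matching rows.
c_scan = [r for r in rows if lo <= r < hi]

# Same range under the toy collation: the matching rows sort outside
# [lo, hi), so a scan limited to that range misses them.
toy_scan = [r for r in rows if toy_key(lo) <= toy_key(r) < toy_key(hi)]

print("matches: ", matches)
print("C scan:  ", c_scan)
print("toy scan:", toy_scan)
```

Under byte order the range scan and the pattern agree. Under the toy collation the range comes back empty even though two rows match, which is the kind of wrong answer that makes skipping the optimization in a non-POSIX locale the safe choice.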

Re: locale & glibc 2.2.2

From
Trond Eivind Glomsrød
Date:
On Thu, 19 Apr 2001, Lamar Owen wrote:

> Trond Eivind Glomsrød wrote:
> > > > it seems like locale support with glibc 2.2.2 is completely broken...
> > What exactly is claimed to be broken? If this is the case sensitivity
> > issue, that would count as a postgresql bug.
>
> Whatever the sentence above your question means.  We'll have to wait on
> the reporter to elaborate what was meant by '..the locale
> support....broken...'.
>
> Which case-sensitivity issue?  The one about table and column names?  Or
> a different one? (sitting confused in Pisgah Forest)

I remember there were some issues about someone claiming glibc was broken
(with LANG set to anything but C/POSIX, because it will sort this way

ab
Ac
ad

instead of

Ac
ab
ad

like some expected.
--
Trond Eivind Glomsrød
Red Hat, Inc.


Re: locale & glibc 2.2.2

From
Tom Lane
Date:
teg@redhat.com (Trond Eivind Glomsrød) writes:
> Of course not, it's not a bug - if this is a problem, it's a bug in
> Postgresql:

If glibc 2.2.2 sorts that way in C locale, then glibc is broken.
But I assume you meant this is the behavior in some other locale.

Postgres as such can cope just fine with non-C sort orders, but it
seems quite possible that the koi8 regression test sample outputs were
constructed using C-locale sort rules.  Since the original complainant
merely asserted those tests were broken without defining what he meant
by broken, we're pretty much wasting our time speculating...

            regards, tom lane

Re: locale & glibc 2.2.2

From
teg@redhat.com (Trond Eivind Glomsrød)
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> teg@redhat.com (Trond Eivind Glomsrød) writes:
> > Of course not, it's not a bug - if this is a problem, it's a bug in
> > Postgresql:
>
> If glibc 2.2.2 sorts that way in C locale, then glibc is broken.
> But I assume you meant this is the behavior in some other locale.

[teg@halden teg]$ LC_COLLATE=C sort foo2.txt
Ad
ac
ae
[teg@halden teg]$

I agree that the above is far from ideal, but this is the traditional
C way. The standard locales (used everywhere, in US en_US is used
which does give you the correct order) don't have this problem, they
sort correctly:

[teg@halden teg]$ LC_COLLATE=en_US sort foo2.txt
ac
Ad
ae
[teg@halden teg]$

--
Trond Eivind Glomsrød
Red Hat, Inc.

Re: locale & glibc 2.2.2

From
Pimenov Yuri
Date:
Tom Lane wrote:

> teg@redhat.com (Trond Eivind Glomsrød) writes:
>> Of course not, it's not a bug - if this is a problem, it's a bug in
>> Postgresql:
>
> If glibc 2.2.2 sorts that way in C locale, then glibc is broken.
> But I assume you meant this is the behavior in some other locale.
>
> Postgres as such can cope just fine with non-C sort orders, but it
> seems quite possible that the koi8 regression test sample outputs were
> constructed using C-locale sort rules.  Since the original complainant
> merely asserted those tests were broken without defining what he meant
> by broken, we're pretty much wasting our time speculating...
>
> regards, tom lane

i'm really sorry!!!

the problem was caused by the glibc upgrade.
there is no "ru_RU.KOI8-R" locale in glibc 2.2.2 and i changed it to "ru".
the database was initdb'ed with the former locale and thus PG was unable to use
ru_RU.KOI8-R....

a re-initdb solved the problem

sorry once again!
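
A quick check of whether a locale name still exists after a C library upgrade might look like the Python sketch below. locale_available is a hypothetical helper, not part of PostgreSQL. The two Russian locale names are the ones mentioned in this message, and whether either is accepted depends entirely on the system it runs on.

```python
import locale

def locale_available(name):
    """Return True if the C library accepts name for LC_COLLATE."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Always restore the original setting, whatever happened above.
        locale.setlocale(locale.LC_COLLATE, saved)

for name in ("C", "ru_RU.KOI8-R", "ru"):
    print(name, locale_available(name))
```

Running something like this before and after the upgrade would show a locale name disappearing, which is a hint that a cluster initialized under the old name needs attention.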