Thread: locale & glibc 2.2.2
Greeting!

it seems like locale support with glibc 2.2.2 is completely broken...
i got huge differences when performing tests in src/test/locale/koi8-r
i am running PG7.1 from cvs.

PS. locale itself isn't broken for sure
Pimenov Yuri wrote:
> Greeting!
> it seems like locale support with glibc 2.2.2 is completely broken...
> i got huge differences when performing tests in src/test/locale/koi8-r
> i am running PG7.1 from cvs.
> PS. locale itself isn't broken for sure

Locale collation and other issues were substantially changed for glibc 2.2.2. It is a glibc 2.2.2 issue, not a PostgreSQL one -- as the same code in PostgreSQL (strcoll) is being used.

We will see how things pan out with locale on glibc 2.2.2, as, like I said, there are big changes, apparently.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:

> Pimenov Yuri wrote:
> > Greeting!
> > it seems like locale support with glibc 2.2.2 is completely broken...
> > i got huge differences when performing tests in src/test/locale/koi8-r
> > i am running PG7.1 from cvs.
> > PS. locale itself isn't broken for sure
>
> Locale collation and other issues were substantially changed for glibc
> 2.2.2. It is a glibc 2.2.2 issue, not a PostgreSQL one -- as the same
> code in PostgreSQL (strcoll) is being used.

What exactly is claimed to be broken? If this is the case sensitivity issue, that would count as a PostgreSQL bug.

--
Trond Eivind Glomsrød
Red Hat, Inc.
Trond Eivind Glomsrød wrote:
> > > it seems like locale support with glibc 2.2.2 is completely broken...
>
> What exactly is claimed to be broken? If this is the case sensitivity
> issue, that would count as a postgresql bug.

Whatever the sentence above your question means. We'll have to wait on the reporter to elaborate what was meant by '..the locale support....broken...'.

Which case-sensitivity issue? The one about table and column names? Or a different one? (sitting confused in Pisgah Forest)

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:

> Trond Eivind Glomsrød wrote:
> > > Which case-sensitivity issue? The one about table and column names? Or
> > > a different one? (sitting confused in Pisgah Forest)
> >
> > I remember there were some issues about someone claiming glibc was broken
> > (with LANG set to anything but C/POSIX, because it will sort this way
>
> Oh, ok. That goes as far back as glibc 2.1, and first reared its head
> with Red Hat 6.1. While I've not tested it, I had heard that glibc
> 2.2.2 'fixed' this

Of course not, it's not a bug - if this is a problem, it's a bug in PostgreSQL:

[teg@halden teg]$ cat foo2.txt
Ad
ae
ac
[teg@halden teg]$ sort foo2.txt
ac
Ad
ae
[teg@halden teg]$

> The initscript now explicitly sets locale to C/POSIX
> for the initdb and the postmaster startup for the RPM, as the locale
> setting can cause other problems with indexes and the LIKE
> optimization

I built it into our trees with a release number < 1 until I've confirmed that this doesn't break other languages. Sorting in the "C" order isn't acceptable for non-English languages, as the order is wrong.

--
Trond Eivind Glomsrød
Red Hat, Inc.
Trond Eivind Glomsrød wrote:
> > Which case-sensitivity issue? The one about table and column names? Or
> > a different one? (sitting confused in Pisgah Forest)
>
> I remember there were some issues about someone claiming glibc was broken
> (with LANG set to anything but C/POSIX, because it will sort this way

Oh, ok. That goes as far back as glibc 2.1, and first reared its head with Red Hat 6.1. While I've not tested it, I had heard that glibc 2.2.2 'fixed' this.

The initscript now explicitly sets locale to C/POSIX for the initdb and the postmaster startup for the RPM, as the locale setting can cause other problems with indexes and the LIKE optimization (although I think Tom made it where the backend would do the Right Thing and not optimize in the presence of a non-POSIX locale, but I don't remember the details).

Time for me to revisit the issue -- after I finish getting my office and servers moved this week and next.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
On Thu, 19 Apr 2001, Lamar Owen wrote:

> Trond Eivind Glomsrød wrote:
> > > > it seems like locale support with glibc 2.2.2 is completely broken...
> > What exactly is claimed to be broken? If this is the case sensitivity
> > issue, that would count as a postgresql bug.
>
> Whatever the sentence above your question means. We'll have to wait on
> the reporter to elaborate what was meant by '..the locale
> support....broken...'.
>
> Which case-sensitivity issue? The one about table and column names? Or
> a different one? (sitting confused in Pisgah Forest)

I remember there were some issues about someone claiming glibc was broken (with LANG set to anything but C/POSIX, because it will sort this way

ab
Ac
ad

instead of

Ac
ab
ad

like some expected).

--
Trond Eivind Glomsrød
Red Hat, Inc.
teg@redhat.com (Trond Eivind Glomsrød) writes:
> Of course not, it's not a bug - if this is a problem, it's a bug in
> Postgresql:

If glibc 2.2.2 sorts that way in C locale, then glibc is broken. But I assume you meant this is the behavior in some other locale.

Postgres as such can cope just fine with non-C sort orders, but it seems quite possible that the koi8 regression test sample outputs were constructed using C-locale sort rules. Since the original complainant merely asserted those tests were broken without defining what he meant by broken, we're pretty much wasting our time speculating...

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:

> teg@redhat.com (Trond Eivind Glomsrød) writes:
> > Of course not, it's not a bug - if this is a problem, it's a bug in
> > Postgresql:
>
> If glibc 2.2.2 sorts that way in C locale, then glibc is broken.
> But I assume you meant this is the behavior in some other locale.

[teg@halden teg]$ LC_COLLATE=C sort foo2.txt
Ad
ac
ae
[teg@halden teg]$

I agree that the above is far from ideal, but this is the traditional C way. The standard locales (used everywhere; in the US, en_US is used, which does give you the correct order) don't have this problem; they sort correctly:

[teg@halden teg]$ LC_COLLATE=en_US sort foo2.txt
ac
Ad
ae
[teg@halden teg]$

--
Trond Eivind Glomsrød
Red Hat, Inc.
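For anyone who wants to reproduce the two orderings without the sort utility, here is a minimal Python sketch. It is only an illustration of the difference discussed above, not anything from the PostgreSQL tree: the helper name sort_with_collate is made up for this example, and the locale name "en_US.UTF-8" is an assumption that may not be installed on a given system (the helper returns None in that case). Only the "C" result is guaranteed everywhere.

```python
import locale

# Same three words as foo2.txt in the transcript above.
words = ["Ad", "ae", "ac"]


def sort_with_collate(items, locale_name):
    """Sort items using LC_COLLATE rules of locale_name.

    Hypothetical helper for illustration. Returns None when the
    requested locale cannot be set on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm transforms each string so that plain comparison of the
        # results matches strcoll-style comparison in the active locale.
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore


c_order = sort_with_collate(words, "C")
print("C:", c_order)  # byte order: uppercase sorts before lowercase

# Assumed locale name; prints None if it is not installed.
print("en_US.UTF-8:", sort_with_collate(words, "en_US.UTF-8"))
```

In the "C" locale the result is byte order, so "Ad" comes first, matching the LC_COLLATE=C transcript. What the second call prints depends on the locale data the system actually ships.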
Tom Lane wrote:
> teg@redhat.com (Trond Eivind Glomsrød) writes:
>> Of course not, it's not a bug - if this is a problem, it's a bug in
>> Postgresql:
>
> If glibc 2.2.2 sorts that way in C locale, then glibc is broken.
> But I assume you meant this is the behavior in some other locale.
>
> Postgres as such can cope just fine with non-C sort orders, but it
> seems quite possible that the koi8 regression test sample outputs were
> constructed using C-locale sort rules. Since the original complainant
> merely asserted those tests were broken without defining what he meant
> by broken, we're pretty much wasting our time speculating...
>
> regards, tom lane

i'm really sorry!!!
the problem was caused by the glibc upgrade. there is no "ru_RU.KOI8-R" locale in glibc 2.2.2 and i changed it to "ru". the database was initdb'ed with the former locale and thus PG was unable to use ru_RU.KOI8-R....
re-initdb just solved the problem

sorry once again!
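The root cause reported above (a locale name that disappears after a glibc upgrade) can be checked ahead of time. The following is a rough Python sketch under stated assumptions: locale_available is a hypothetical helper name, and it only asks the C library whether a name can be set. It says nothing about which locale an existing database cluster was initialized with.

```python
import locale


def locale_available(name, category=locale.LC_COLLATE):
    """Return True if the C library accepts `name` for `category`.

    Hypothetical helper for illustration; always restores the
    previous setting before returning.
    """
    saved = locale.setlocale(category)  # query current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)


# The two names from the message above, plus "C" as a baseline.
for name in ("C", "ru_RU.KOI8-R", "ru"):
    print(name, locale_available(name))
```

Running this before and after a C library upgrade would show whether a name such as "ru_RU.KOI8-R" has gone missing; the results for the two Russian names depend entirely on the installed locale data.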