Thread: The server's LC_CTYPE locale
Hello,

I got the following error when the query string was one of the Hebrew chars:

SELECT upper('ש');
ERROR: invalid multibyte character for locale
HINT: The server's LC_CTYPE locale is probably incompatible with the database encoding.

After a few minutes, while gathering info, I stopped getting the previous error and started to get:

# SELECT lower('ש');
ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

# SELECT upper('ש');
ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

# SELECT version();
PostgreSQL 8.1.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.3 (Debian 4.0.3-1)

# show lc_ctype;
he_IL.utf8

# SHOW SERVER_ENCODING;
UTF8

Any ideas what the problem is?

--
--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------
Michael Ben-Nes <miki@canaan.co.il> writes:
> Im got the following error when the query string was one of the Hebrew
> chars:
> SELECT upper('ש');
> ERROR: invalid multibyte character for locale
> HINT: The server's LC_CTYPE locale is probably incompatible with the
> database encoding.

Hmph. I can't reproduce that here (using Fedora 4's version of he_IL.utf8 anyway). I assume your client_encoding was also UTF8? The troublesome character came through in your email as \327\251 (D7 A9) ... is that what you were actually entering? The reference to F9 in the other error message makes me think the character got munged somewhere in the email chain ...

regards, tom lane
Tom Lane wrote:
> Michael Ben-Nes <miki@canaan.co.il> writes:
>> Im got the following error when the query string was one of the Hebrew
>> chars:
>> SELECT upper('ש');
>> ERROR: invalid multibyte character for locale
>> HINT: The server's LC_CTYPE locale is probably incompatible with the
>> database encoding.
>
> Hmph. I can't reproduce that here (using Fedora 4's version of he_IL.utf8
> anyway). I assume your client_encoding was also UTF8? The troublesome
> character came through in your email as \327\251 (D7 A9) ... is that
> what you were actually entering? The reference to F9 in the other error
> message makes me think the character got munged somewhere in the email
> chain ...

The client encoding is UTF8.

Strangely, I no longer get the second error:

ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

The first error has returned:

# SELECT lower('ש');
ERROR: invalid multibyte character for locale
HINT: The server's LC_CTYPE locale is probably incompatible with the database encoding.

The character that I sent is:
[ש] U+05E9 ש HEBREW LETTER SHIN

I'm out of ideas. What else should I check?

> regards, tom lane
Michael Ben-Nes <miki@canaan.co.il> writes:
> The character that I sent is:
> [ש] U+05E9 ש HEBREW LETTER SHIN

Well, that does work out to D7 A9 in UTF8, if I'm doing the arithmetic correctly.

I can't replicate any problem in either 8.1.4 or HEAD. It's possible that this is a bug that's been fixed since 8.1.3, but I don't recall any change in that area. I think more likely the difference is between the he_IL.utf8 locale definitions in Fedora 4 and Debian. Perhaps you should check for available updates to the locale.

regards, tom lane
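The arithmetic above can be checked with a short Python sketch. It encodes U+05E9 with Python's built-in codecs and also by hand using the two-byte UTF-8 bit layout. The helper names `hex_bytes` and `utf8_two_byte` are made up for this illustration and do not come from the thread. The sketch also shows that the single-byte ISO-8859-8 encoding of the same letter is F9, which matches the byte named in the second error message; that match is only suggestive and does not by itself establish what the server was doing.

```python
# Sketch: byte representations of U+05E9 (HEBREW LETTER SHIN).
# hex_bytes and utf8_two_byte are hypothetical helper names.

SHIN = "\u05e9"


def hex_bytes(text, encoding):
    """Return the encoded bytes of text as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


def utf8_two_byte(code_point):
    """Hand-encode a code point in the range U+0080..U+07FF as UTF-8."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    lead = 0xC0 | (code_point >> 6)     # 110xxxxx carries the top 5 bits
    trail = 0x80 | (code_point & 0x3F)  # 10xxxxxx carries the low 6 bits
    return bytes([lead, trail])


utf8_hex = hex_bytes(SHIN, "utf-8")
iso_hex = hex_bytes(SHIN, "iso8859_8")
manual = utf8_two_byte(ord(SHIN))

print("UTF-8:      ", utf8_hex)
print("ISO-8859-8: ", iso_hex)
print("by hand:    ", manual.hex(" ").upper())
```

The octal escapes \327\251 seen in the email are the same two bytes as the UTF-8 result, written in base 8.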
For the record:

These are the entries in my locale.gen:

# cat /etc/locale.gen.old
en_US ISO-8859-1
he_IL UTF-8
he_IL ISO-8859-8

I found out that removing "he_IL ISO-8859-8" fixed the problem.

Why? I have no idea (maybe a collision because of the duplicate he_IL?).

Cheers

Michael Ben-Nes wrote:
> Im got the following error when the query string was one of the Hebrew
> chars:
> [...]
On Tue, Sep 05, 2006 at 02:56:21PM +0300, Michael Ben-Nes wrote:
> For the record:
>
> Those are the records in my locale.gen
>
> # cat /etc/locale.gen.old
> en_US ISO-8859-1
> he_IL UTF-8
> he_IL ISO-8859-8

Yeah, that's wrong. The first column is the identifier, so the last entry should be something like:

he_IL.ISO-8859-8 ISO-8859-8

> Why ? i have no idea ( maybe some collisions because the double he_IL ? ).

You can't do that: two entries with the same identifier cannot both be generated under that one name.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
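A check for the mistake described above can be sketched in Python. The function name `duplicate_locale_names` is hypothetical, and the sketch assumes the simple two-column "identifier charset" layout shown in the thread, with `#` starting a comment. It works on text passed in as a string rather than reading /etc/locale.gen directly, so it makes no claim about how the locale generation tool itself handles such a file.

```python
# Sketch: report locale identifiers that appear more than once in
# locale.gen-style text. Assumes "identifier charset" per line.
from collections import defaultdict


def duplicate_locale_names(locale_gen_text):
    """Map each repeated identifier to the charsets listed for it."""
    charsets_by_name = defaultdict(list)
    for raw_line in locale_gen_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or unexpected line: ignored in this sketch
        name, charset = fields
        charsets_by_name[name].append(charset)
    return {n: c for n, c in charsets_by_name.items() if len(c) > 1}


original = """\
en_US ISO-8859-1
he_IL UTF-8
he_IL ISO-8859-8
"""

corrected = """\
en_US ISO-8859-1
he_IL UTF-8
he_IL.ISO-8859-8 ISO-8859-8
"""

print(duplicate_locale_names(original))
print(duplicate_locale_names(corrected))
```

With the entries from the thread, the first call flags he_IL as listed under two charsets, and the second call finds nothing once the last entry uses the distinct identifier suggested above.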