Thread: UTF-8 Problem ?
Hi Listers,

I want to insert some German-specific characters (umlauts) into a table, but I am getting the following error message:

postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
ERROR: invalid UTF-8 byte sequence detected near byte 0xfc

Or

postgres=# EXECUTE stmt (1, 1 , 1 , 'MAßßtab') ;
ERROR: invalid UTF-8 byte sequence detected near byte 0xdf

Here are my object/statement definitions:

A)
PREPARE stmt( int, int, int, varchar) as insert INTO part values ($1,$2,$3,$4);

B)
postgres=# \d+ part
                      Table "public.part"
 Column |          Type          | Modifiers | Description
--------+------------------------+-----------+-------------
 id1    | integer                | not null  |
 id2    | integer                | not null  |
 id3    | integer                | not null  |
 filler | character varying(200) |           |

C)
postgres=# \l
        List of databases
    Name    | Owner | Encoding
------------+-------+-----------
 db1        | user1 | SQL_ASCII
 postgres   | pg    | UTF8
 template0  | pg    | UTF8
 template1  | pg    | UTF8

How can I solve my problem?

Best regards,
Milen
On Thu, Jun 15, 2006 at 01:01:56PM +0200, Milen Kulev wrote:
> postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xfc
>
> Or
>
> postgres=# EXECUTE stmt (1, 1 , 1 , 'MAßßtab') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xdf

Sounds like your client is sending something other than UTF-8. Is it?

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
    --George Orwell
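The two bytes named in the error messages are consistent with that diagnosis: 0xfc and 0xdf are the single-byte ISO 8859-1 (Latin-1) codes for "ü" and "ß". A minimal Python sketch, assuming the terminal was sending Latin-1, reproduces the same offending bytes. The helper name `first_invalid_utf8_byte` is hypothetical and only illustrates the check; it is not how PostgreSQL validates input.

```python
def first_invalid_utf8_byte(raw):
    """Return the first byte that breaks UTF-8 decoding, or None if raw is valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte the decoder rejected.
        return raw[err.start]
    return None


if __name__ == "__main__":
    # Assumption: the client encodes the literals as Latin-1 before sending them.
    for text in ("Grün", "MAßßtab"):
        raw = text.encode("latin-1")
        bad = first_invalid_utf8_byte(raw)
        print(raw, "-> invalid UTF-8 near byte", hex(bad))
```

Run as a script, this reports 0xfc for the first string and 0xdf for the second, matching the bytes in the two error messages. The same strings encoded as UTF-8 pass the check.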
On Jun 15 01:01, Milen Kulev wrote:
> I want to insert some german specific characters (umlaut characters)
> into a table, but I am getting the following
> Error message:
> postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xfc
> ...
> postgres=# \l
>         List of databases
>     Name    | Owner | Encoding
> ------------+-------+-----------
>  db1        | user1 | SQL_ASCII
>  postgres   | pg    | UTF8
>  template0  | pg    | UTF8
>  template1  | pg    | UTF8

Did you set your client_encoding properly too? (This also assumes that your terminal really uses the encoding you declare as the client encoding.)

Regards.
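One way to find out which encoding the terminal session uses is to ask the locale and translate the answer into a PostgreSQL encoding name. The following is a rough sketch only: the `PG_NAMES` table and the `pg_client_encoding` helper are hypothetical, cover just the three encodings mentioned in this thread, and rely on Python's standard `codecs` and `locale` modules.

```python
import codecs
import locale

# Hypothetical lookup table: canonical Python codec name -> PostgreSQL encoding name.
# Only the encodings discussed in this thread are listed.
PG_NAMES = {
    "utf-8": "UTF8",
    "iso8859-1": "LATIN1",
    "iso8859-15": "LATIN9",
}


def pg_client_encoding(python_encoding):
    """Map an encoding label to a PostgreSQL encoding name, or None if unknown."""
    try:
        canonical = codecs.lookup(python_encoding).name
    except LookupError:
        return None
    return PG_NAMES.get(canonical)


if __name__ == "__main__":
    # Encoding the current locale (and usually the terminal) is configured for.
    term = locale.getpreferredencoding(False)
    print(term, "->", pg_client_encoding(term))
```

If the helper returns LATIN1 while the database expects UTF8 input, the session needs `SET client_encoding = 'LATIN1';` (or the psql `\encoding` command) so the server knows how to interpret the incoming bytes.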
Hi Thomas,

What does the compile option --enable-recode actually do? I certainly haven't compiled PG with this option (perhaps it is on by default?), but your advice helped me:

postgres=# \encoding
UTF8
postgres=# \encoding
UTF8
postgres=# SET client_encoding = 'LATIN1';
SET
postgres=# \encoding
LATIN1
postgres=# PREPARE stmt( int, int, int, varchar) as insert INTO part values ($1,$2,$3,$4);
PREPARE
postgres=# EXECUTE stmt (1, 1 , 1 , 'MAßßtab') ;
INSERT 0 0
postgres=# EXECUTE stmt (1, 1 , 1 , 'MAßßtab') ;
INSERT 0 0
postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
INSERT 0 0
postgres=#
postgres=# SELECT filler from part where filler like 'MA%' or filler like 'Gr%' ;
 filler
---------
 MAßßtab
 MAßßtab
 Grün
(3 rows)

Regards.
Milen

-----Original Message-----
From: Thomas Beutin [mailto:psql@laokoon.IN-Berlin.DE]
Sent: Thursday, June 15, 2006 2:45 PM
To: pgsql-sql@postgresql.org
Cc: Milen Kulev
Subject: Re: [SQL] UTF-8 Problem ?

Hi Milen,

Milen Kulev wrote:
> [...]
> How to solve my problem ?

You should insert only correct utf8 strings or set the client encoding correctly:
SET client_encoding = 'LATIN1';
or
SET client_encoding = 'LATIN9';

IIRC postgresql must be compiled with --enable-recode to support this.

Regards,
-tb
Hi Milen,

Milen Kulev wrote:
> Hi Listers,
> I want to insert some german specific characters (umlaut characters) into a table, but I am getting the following
> Error message:
> postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xfc
>
> Or
>
> postgres=# EXECUTE stmt (1, 1 , 1 , 'MAßßtab') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xdf
>
> [...]
>
> How to solve my problem ?

You should insert only correct utf8 strings or set the client encoding correctly:
SET client_encoding = 'LATIN1';
or
SET client_encoding = 'LATIN9';

IIRC postgresql must be compiled with --enable-recode to support this.

Regards,
-tb
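Conceptually, declaring a client encoding tells the server how to interpret the incoming bytes before it stores them in the database encoding (UTF8 here). The Python sketch below imitates that step and also shows one place where the two suggested settings differ. The function `recode_to_utf8` is a hypothetical illustration, not PostgreSQL's actual conversion code, and the Python codec names `latin-1` and `iso8859-15` are assumed stand-ins for LATIN1 and LATIN9.

```python
def recode_to_utf8(raw, client_encoding):
    """Interpret client bytes in the declared encoding, then re-encode them as UTF-8."""
    return raw.decode(client_encoding).encode("utf-8")


# 'Grün' as a Latin-1 terminal would send it: the ü is the single byte 0xfc.
latin1_bytes = b"Gr\xfcn"
utf8_bytes = recode_to_utf8(latin1_bytes, "latin-1")   # b'Gr\xc3\xbcn'

# LATIN1 and LATIN9 agree on ü and ß but differ on a few code points.
# Byte 0xa4 is the currency sign in LATIN1 and the euro sign in LATIN9.
as_latin1 = b"\xa4".decode("latin-1")      # U+00A4
as_latin9 = b"\xa4".decode("iso8859-15")   # U+20AC

if __name__ == "__main__":
    print(utf8_bytes)
    print(hex(ord(as_latin1)), hex(ord(as_latin9)))
```

For the two strings in this thread either setting gives the same result, since "ü" and "ß" have the same byte values in both encodings. The choice matters only for the handful of characters, such as the euro sign, where the two differ.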
Hi Milen,

Milen Kulev wrote:
> What does the compile option --enable-recode actually do?

IIRC it enables the support for string recoding, but this might not be correct anymore ...

> I certainly haven't compiled PG with this option (perhaps it is on by
> default?), but your advice helped me: [...]

You're welcome :)

Regards,
-tb
Thomas Beutin <psql@laokoon.IN-Berlin.DE> writes:
> Milen Kulev wrote:
>> What does the compile option --enable-recode actually do?

> IIRC it enables the support for string recoding, but this might not be
> correct anymore ...

--enable-recode has been gone for a long time (a quick look shows it was last present in 7.3), and even then it didn't have anything to do with support for multibyte encodings like UTF8.

regards, tom lane
Hello,

A database encoding of LATIN1 works fine for me with German, Scandinavian, and other umlauted or accented characters, and even Cyrillic ones.

BR,
Aarni

On Thursday 15 June 2006 14:01, Milen Kulev wrote:
> Hi Listers,
> I want to insert some german specific characters (umlaut characters) into a
> table, but I am getting the following Error message:
> postgres=# EXECUTE stmt (1, 1 , 1 , 'Grün') ;
> ERROR: invalid UTF-8 byte sequence detected near byte 0xfc
>
> [...]
>
> How to solve my problem ?
>
> Best Regards. Milen

--
Aarni Ruuhimäki
Megative Tmi
Pääsintie 26
45100 Kouvola
Finland
+358-5-3755035
+358-50-4910037
www.kymi.com | cfm.kymi.com
--------------
This is a bugfree broadcast to you from **Kmail** on **Fedora Core** linux system
--------------
Linux is like a wigwam - no windows, no gates and a free apache inside.