Thread: UTF-8 Problem ?

UTF-8 Problem ?

From

"Milen Kulev"

Date:

15 June 2006, 08:02:07

Hi Listers,
I want to insert some German-specific characters (umlauts) into a table, but I am getting the following
error message:
postgres=# EXECUTE  stmt (1, 1 , 1 , 'Grün')  ;
ERROR:  invalid UTF-8 byte sequence detected near byte 0xfc

Or

postgres=# EXECUTE  stmt (1, 1 , 1 , 'MAßßtab') ;
ERROR:  invalid UTF-8 byte sequence detected near byte 0xdf

Here are my object/statement definitions :

A) PREPARE  stmt( int, int, int, varchar) as insert  INTO  part values ($1,$2,$3,$4);

B)
postgres=# \d+ part
                    Table "public.part"
 Column |          Type          | Modifiers | Description
--------+------------------------+-----------+-------------
 id1    | integer                | not null  |
 id2    | integer                | not null  |
 id3    | integer                | not null  |
 filler | character varying(200) |           |

C)

postgres=# l\l
       List of databases
    Name    | Owner | Encoding
------------+-------+-----------
 db1        | user1 | SQL_ASCII
 postgres   | pg    | UTF8
 template0  | pg    | UTF8
 template1  | pg    | UTF8


How can I solve this problem?

Best Regards. Milen

Re: UTF-8 Problem ?

From

Andrew Sullivan

Date:

15 June 2006, 08:07:45

On Thu, Jun 15, 2006 at 01:01:56PM +0200, Milen Kulev wrote:

> postgres=# EXECUTE  stmt (1, 1 , 1 , 'Grün')  ;
> ERROR:  invalid UTF-8 byte sequence detected near byte 0xfc
>
> Or
>
> postgres=# EXECUTE  stmt (1, 1 , 1 , 'MAßßtab') ;
> ERROR:  invalid UTF-8 byte sequence detected near byte 0xdf

Sounds like your client is sending something other than UTF-8.  Is
it?
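
The byte values in the two error messages are consistent with Latin-1 input. As a rough illustration only, here is a minimal Python sketch that assumes the terminal encoded the literals as ISO-8859-1 (an assumption, not something stated in the report); the helper name first_invalid_utf8_byte is made up for this example:

```python
def first_invalid_utf8_byte(raw: bytes):
    """Return the first byte that breaks UTF-8 decoding, or None if raw is valid."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first offending byte
        return raw[exc.start]
    return None


# Assumption: the client sent these literals encoded as Latin-1.
for word in ("Grün", "MAßßtab"):
    raw = word.encode("latin-1")
    bad = first_invalid_utf8_byte(raw)
    print(f"{word}: first invalid UTF-8 byte is 0x{bad:02x}")
```

Under that assumption the sketch reports 0xfc for 'Grün' and 0xdf for 'MAßßtab', the same bytes the server complained about.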

A


--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.    --George Orwell

Re: UTF-8 Problem ?

From

Volkan YAZICI

Date:

15 June 2006, 08:10:40

On Jun 15 01:01, Milen Kulev wrote:
> I want to insert some German-specific characters (umlauts)
> into a table, but I am getting the following
> error message:
> postgres=# EXECUTE  stmt (1, 1 , 1 , 'Grün')  ;
> ERROR:  invalid UTF-8 byte sequence detected near byte 0xfc
> ...
> postgres=# l\l
>        List of databases
>     Name    | Owner | Encoding
> ------------+-------+-----------
>  db1        | user1 | SQL_ASCII
>  postgres   | pg    | UTF8
>  template0  | pg    | UTF8
>  template1  | pg    | UTF8

Did you set your client_encoding properly too? (This also assumes that
your terminal actually uses that encoding.)


Regards.

Re: UTF-8 Problem ?

From

"Milen Kulev"

Date:

15 June 2006, 10:00:07

Hi Thomas,
What does the compile option --enable-recode actually do?
I certainly haven't compiled PG with this option (perhaps it is
on by default?), but your advice helped me:

postgres=# \encoding
UTF8
postgres=# \encoding
UTF8
postgres=# SET client_encoding = 'LATIN1';
SET
postgres=# \encoding
LATIN1
postgres=# PREPARE  stmt( int, int, int, varchar) as insert  INTO  part values ($1,$2,$3,$4);
PREPARE
postgres=#  EXECUTE  stmt (1, 1 , 1 , 'MAßßtab') ;
INSERT 0 0
postgres=#  EXECUTE  stmt (1, 1 , 1 , 'MAßßtab') ;
INSERT 0 0
postgres=#  EXECUTE  stmt (1, 1 , 1 , 'Grün')  ;
INSERT 0 0
postgres=#

postgres=#  SELECT filler from part where filler like 'MA%' or filler like 'Gr%' ;
 filler
---------
 MAßßtab
 MAßßtab
 Grün
(3 rows)
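
With client_encoding set to LATIN1 and a UTF8 database, the incoming bytes are treated as Latin-1 and converted to UTF-8 before being stored. The following Python sketch only mimics the effect of that conversion; it is not how the server implements it, and latin1_to_utf8 is a hypothetical helper name:

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Reinterpret Latin-1 bytes as text and re-encode that text as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


for word in ("Grün", "MAßßtab"):
    sent = word.encode("latin-1")    # what a Latin-1 terminal would send
    stored = latin1_to_utf8(sent)    # what a UTF8 database would hold
    print(f"{word}: sent {sent.hex(' ')} -> stored {stored.hex(' ')}")
```

Each single byte above 0x7f becomes a two-byte UTF-8 sequence, so the stored value is valid UTF-8 even though the client never sent any.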


Regards. Milen

Re: UTF-8 Problem ?

From

Thomas Beutin

Date:

15 June 2006, 10:19:37

Hi Milen,

You should insert only correct utf8 strings or set the client encoding
correctly:
SET client_encoding = 'LATIN1';
or
SET client_encoding = 'LATIN9';
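
Which of the two to pick depends on what the terminal really sends. A small Python sketch of where the two character sets differ, assuming Python's iso8859-1 and iso8859-15 codecs correspond to PostgreSQL's LATIN1 and LATIN9 (the function name differing_bytes is made up for this example):

```python
def differing_bytes():
    """Map each byte value that decodes differently to its (LATIN1, LATIN9) characters."""
    diffs = {}
    for value in range(0xA0, 0x100):
        raw = bytes([value])
        l1 = raw.decode("iso8859-1")
        l9 = raw.decode("iso8859-15")
        if l1 != l9:
            diffs[value] = (l1, l9)
    return diffs


for value, (l1, l9) in differing_bytes().items():
    print(f"0x{value:02x}: LATIN1 {l1!r}  LATIN9 {l9!r}")
```

The bytes from the error messages, 0xfc and 0xdf, do not appear in that list, so for 'Grün' and 'MAßßtab' either setting should give the same result; the choice matters for characters such as the euro sign at 0xa4.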

IIRC postgresql must be compiled with --enable-recode to support this.

Regards,
-tb

Re: UTF-8 Problem ?

From

Thomas Beutin

Date:

15 June 2006, 10:34:07

Hi Milen,

Milen Kulev wrote:
> What does the compile option --enable-recode actually do?
IIRC it enables the support for string recoding, but this might not be
correct anymore ...

> I certainly haven't compiled PG with this option (perhaps it is
> on by default?), but your advice helped me:
[...]
You're welcome :)

Regards,
-tb

Re: UTF-8 Problem ?

From

Tom Lane

Date:

15 June 2006, 10:42:37

Thomas Beutin <psql@laokoon.IN-Berlin.DE> writes:
> Milen Kulev wrote:
>> What does the compile option --enable-recode actually do?

> IIRC it enables the support for string recoding, but this might not be
> correct anymore ...

--enable-recode has been gone for a long time (a quick look shows it was
last present in 7.3), and even then it didn't have anything to do with
support for multibyte encodings like UTF8.
        regards, tom lane

Re: UTF-8 Problem ?

From

Aarni Ruuhimäki

Date:

15 June 2006, 16:45:31

Hello,

A database encoding of LATIN1 works fine for me with German, Scandinavian, other umlauted or
accented, and even Cyrillic characters.

BR,

Aarni

--
Aarni Ruuhimäki