Thread: UNICODE and SQL
Hello,

I need to use SQL to insert some language-specific characters into tables. In particular I am using German and Croatian characters. The database is created with UNICODE encoding.

For instance, when trying to run from psql:

INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');

I get the following error:

ERROR: Invalid UNICODE character sequence found (0xfc7220)

because of 'ü' and 'ä'.

How can I do this?

Thanks,
Marco Roda
On Mon, 5 May 2003, Marco Roda wrote:
> I get the following error:
>
> ERROR: Invalid UNICODE character sequence found (0xfc7220)
>
> How to do it?

If you want to type UTF-8 in psql, you must enable a UTF-8 capable keyboard in X11.

On modern UNIX systems (whose UTF-8 support is not so modern), you can often deal with this problem by using (or writing) an application, e.g. in Java, that has Unicode support.

As things stand, you must know or calculate the UTF-8 representation of each German or Croatian (hrvatski) ISO character, take both bytes of the resulting UTF-8 character, and put them in the INSERT statement. Not very handy :(

Also look for a tool like phpPgAdmin, and then set your Mozilla page encoding to UTF-8.

P.S. Does anyone know whether pgaccess supports Unicode at all?

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel: +30-210-8981112
fax: +30-210-8981877
email: achill@matrix.gatewaynet.com
       mantzios@softlab.ece.ntua.gr
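For readers who want to see what "calculating the UTF-8 representation" amounts to, here is a minimal sketch in Python. Python is not used anywhere in the thread, and the helper name utf8_hex and the sample letters are assumptions chosen only for illustration. It prints the UTF-8 bytes of a few German and Croatian letters, each of which takes two bytes.

```python
# Hypothetical helper, not from the thread: show the UTF-8 bytes of one character.
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


# Sample German and Croatian letters (chosen for illustration only).
SAMPLE_LETTERS = "\u00fc\u00e4\u00f6\u00df\u010d\u0107\u0111\u0161\u017e"  # ü ä ö ß č ć đ š ž

for ch in SAMPLE_LETTERS:
    print(ch, utf8_hex(ch))
```

Each of these letters lies below U+0800, which is why its UTF-8 form is exactly two bytes long.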
On Monday 05 May 2003 15:34, Marco Roda wrote:
> INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
>
> I get the following error:
>
> ERROR: Invalid UNICODE character sequence found (0xfc7220)
>
> because of 'ü' and 'ä'.

What is your psql client encoding set to? Possibly you need to set it to LATIN1.

Ian Barwick
barwick@gmx.net
On Mon, 5 May 2003, Ian Barwick wrote:
> What is your psql client encoding set to? Possibly you need
> to set it to LATIN1.

The UTF-8 version of Latin1 is Latin1 itself, but the German and ISO 8859-2 Serbo-Croatian characters are non-Latin1 (high ASCII) characters.

--
Achilleus Mantzios
On Monday 05 May 2003 22:24, Achilleus Mantzios wrote:
> On Mon, 5 May 2003, Ian Barwick wrote:
> > What is your psql client encoding set to? Possibly you need
> > to set it to LATIN1.

or LATIN2 (?) for the Croatian characters.

> The UTF8 version of Latin1 is Latin1 itself,
> but german and iso8859-2 serbocroatian are
> non Latin1 (high ASCII) chars.

In PostgreSQL "LATIN1" is ISO 8859-1, see:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=multibyte.html

Unicode Latin 1 (characters 160-255) happens to be the same as ISO 8859-1, but in UTF-8 each of these characters is represented as 2 bytes.

Setting psql's \encoding to LATIN1 will convert the client's 8-bit ISO 8859-1 characters (presuming this is what the client sends) to UTF-8.

unitest=# \encoding unicode
unitest=# INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
ERROR:  Invalid UNICODE character sequence found (0xfc7220)
unitest=# \encoding latin1
unitest=# INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
INSERT 19134 1
unitest=# select * from test;
 id |              val
----+-------------------------------
  1 | Urlaubslite für nächstes Jahr
(1 row)

unitest=# \encoding unicode
unitest=# select * from test;
 id |              val
----+-------------------------------
  1 | Urlaubslite für nächstes Jahr
(1 row)

Ian Barwick
barwick@gmx.net
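The bytes in the error message can be reproduced outside the database. The following Python sketch is an illustration only (Python is not part of the thread, and the variable names are assumptions). It encodes the test string as ISO 8859-1, shows that 'ü' followed by 'r' and a space is exactly 0xfc7220, confirms that those bytes are not valid UTF-8, and then performs the same kind of LATIN1-to-UTF-8 conversion that the explanation above attributes to the LATIN1 client encoding.

```python
text = "Urlaubslite f\u00fcr n\u00e4chstes Jahr"  # 'Urlaubslite für nächstes Jahr'

# What an ISO 8859-1 (LATIN1) client sends: one byte per character.
latin1_bytes = text.encode("latin-1")

# 'ü' is the single byte 0xfc, followed by 'r' (0x72) and ' ' (0x20).
start = latin1_bytes.index(b"\xfc")
offending = latin1_bytes[start:start + 3].hex()
print(offending)  # fc7220

# Treating the same bytes as UTF-8 fails, since 0xfc cannot start a valid sequence here.
try:
    latin1_bytes.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False
print(valid_as_utf8)  # False

# Converting from ISO 8859-1 to UTF-8 turns 'ü' and 'ä' into two bytes each.
utf8_bytes = latin1_bytes.decode("latin-1").encode("utf-8")
print(len(latin1_bytes), len(utf8_bytes))
```

The two extra bytes in the UTF-8 form come from 'ü' and 'ä', the only non-ASCII characters in the string.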
That's OK! I will use the SQL variable CLIENT_ENCODING:

SET CLIENT_ENCODING TO 'LATIN1'; /* for German */

or:

SET CLIENT_ENCODING TO 'LATIN2'; /* for Croatian */

which has the same effect as psql's \encoding.

Thanks a lot!
Marco Roda

-----Original Message-----
From: Ian Barwick [mailto:barwick@gmx.net]
Sent: Monday, May 05, 2003 6:49 PM
To: Achilleus Mantzios
Cc: Marco Roda; pgsql-sql@postgresql.org
Subject: Re: [SQL] UNICODE and SQL
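As a rough check on the choice between LATIN1 and LATIN2, the sketch below tests which sample strings can be encoded in ISO 8859-1 and ISO 8859-2. It is an illustration only: Python's codec names "latin-1" and "iso8859_2" stand in for PostgreSQL's LATIN1 and LATIN2, and the helper name fits and the Croatian sample letters are assumptions, not taken from the thread.

```python
# Hypothetical helper: can every character of text be encoded with this codec?
def fits(text: str, codec: str) -> bool:
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


german = "f\u00fcr n\u00e4chstes Jahr"            # 'für nächstes Jahr', from the thread
croatian = "\u010d\u0107\u0111\u0161\u017e"      # č ć đ š ž, sample letters (assumption)

for label, sample in (("german", german), ("croatian", croatian)):
    print(label, fits(sample, "latin-1"), fits(sample, "iso8859_2"))
```

Under these assumptions the German sample fits both encodings, because 'ü' and 'ä' sit at the same byte values in ISO 8859-1 and ISO 8859-2, while the Croatian letters fit only ISO 8859-2.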