Thread: UNICODE and SQL

UNICODE and SQL

From
"Marco Roda"
Date:
Hello,

I need to use SQL to insert some language specific characters into tables.
In particular I am using German and Croatian specific characters. The
database is created with UNICODE encoding.
For instance, when trying to run from psql:

INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');

I get the following error:

ERROR:  Invalid UNICODE character sequence found (0xfc7220)

because of 'ü' and 'ä'.

How can I do this?
Thanks,
Marco Roda



Re: UNICODE and SQL

From
Achilleus Mantzios
Date:
On Mon, 5 May 2003, Marco Roda wrote:

> Hallo,
>
> I need to use SQL to insert some language specific characters into tables.
> In particular I am using German and Croatian specific characters. The
> database is created with UNICODE encoding.
> For instance, when trying to run from psql:
>
> INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
>
> I get the following error:
>
> ERROR:  Invalid UNICODE character sequence found (0xfc7220)
>
> because of 'ü' and 'ä'.
>
> How to do it?

If you want to type UTF-8 directly into psql,
you need a UTF-8 capable keyboard setup in X11.

On modern UNIX systems, whose UTF-8 support is
often not so modern, you can usually work around
this by using (or writing) an application with
Unicode support, e.g. in Java.

With your current setup, you have to
know or calculate the UTF-8 representation
of each German or Croatian (hrvatski) ISO
character, take both bytes of the resulting
UTF-8 sequence, and put them in the INSERT
statement.

Not very Handy :(
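
As a rough illustration of the by-hand calculation described above, the following Python sketch derives the two UTF-8 bytes for a character in the range U+0080 to U+07FF, which covers the German and Croatian letters in question. The helper name utf8_two_byte is made up for this example and is not part of any library; Python's built-in str.encode is used only to cross-check the result.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as its two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only two-byte UTF-8 sequences are handled here")
    lead = 0xC0 | (code_point >> 6)    # 110xxxxx carries the upper 5 bits
    cont = 0x80 | (code_point & 0x3F)  # 10xxxxxx carries the lower 6 bits
    return bytes([lead, cont])


if __name__ == "__main__":
    for ch in "üäšč":
        encoded = utf8_two_byte(ord(ch))
        # Cross-check the manual calculation against Python's own encoder.
        assert encoded == ch.encode("utf-8")
        print(f"{ch}  U+{ord(ch):04X}  ->  {encoded.hex()}")
```

For 'ü' (U+00FC) this gives the bytes 0xC3 0xBC, which is what a UTF-8 client would have to send instead of the single Latin-1 byte 0xFC.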

Also look for a web-based tool such as
phpPgAdmin, and then set the page encoding
in Mozilla to UTF-8.

P.S.
Does anyone know if pgaccess supports Unicode at all?

> Thanks,
> Marco Roda
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

--
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:    +30-210-8981112
fax:    +30-210-8981877
email:  achill@matrix.gatewaynet.com       mantzios@softlab.ece.ntua.gr



Re: UNICODE and SQL

From
Ian Barwick
Date:
On Monday 05 May 2003 15:34, Marco Roda wrote:
> Hallo,
>
> I need to use SQL to insert some language specific characters into tables.
> In particular I am using German and Croatian specific characters. The
> database is created with UNICODE encoding.
> For instance, when trying to run from psql:
>
> INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
>
> I get the following error:
>
> ERROR:  Invalid UNICODE character sequence found (0xfc7220)
>
> because of 'ü' and 'ä'.

What is your psql client encoding set to? Possibly you need
to set it to LATIN1.

Ian Barwick
barwick@gmx.net



Re: UNICODE and SQL

From
Achilleus Mantzios
Date:
On Mon, 5 May 2003, Ian Barwick wrote:

> On Monday 05 May 2003 15:34, Marco Roda wrote:
> > Hallo,
> >
> > I need to use SQL to insert some language specific characters into tables.
> > In particular I am using German and Croatian specific characters. The
> > database is created with UNICODE encoding.
> > For instance, when trying to run from psql:
> >
> > INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
> >
> > I get the following error:
> >
> > ERROR:  Invalid UNICODE character sequence found (0xfc7220)
> >
> > because of 'ü' and 'ä'.
>
> What is your psql client encoding set to? Possibly you need
> to set it to LATIN1.

The UTF8 version of Latin1 is Latin1 itself,
but german and iso8859-2 serbocroatian are
non Latin1 (high ASCII) chars.

>
> Ian Barwick
> barwick@gmx.net
>




Re: UNICODE and SQL

From
Ian Barwick
Date:
On Monday 05 May 2003 22:24, Achilleus Mantzios wrote:
> On Mon, 5 May 2003, Ian Barwick wrote:
> > On Monday 05 May 2003 15:34, Marco Roda wrote:
> > > Hallo,
> > >
> > > I need to use SQL to insert some language specific characters into
> > > tables. In particular I am using German and Croatian specific
> > > characters. The database is created with UNICODE encoding.
> > > For instance, when trying to run from psql:
> > >
> > > INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
> > >
> > > I get the following error:
> > >
> > > ERROR:  Invalid UNICODE character sequence found (0xfc7220)
> > >
> > > because of 'ü' and 'ä'.
> >
> > What is your psql client encoding set to? Possibly you need
> > to set it to LATIN1.

or LATIN2 (?) for the Croatian characters.

> The UTF8 version of Latin1 is Latin1 itself,
> but german and iso8859-2 serbocroatian are
> non Latin1 (high ASCII) chars.

In PostgreSQL "LATIN1" is ISO 8859-1, see:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=multibyte.html
The Unicode Latin-1 range (characters 160-255) happens to coincide with
ISO 8859-1, but in UTF-8 each of those characters is represented as 2 bytes.

Setting psql's \encoding to LATIN1 declares that the client sends 8-bit
ISO 8859-1 characters (presuming this is the case), so that they are
converted to UTF-8 before being stored.

unitest=# \encoding unicode
unitest=# INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
ERROR:  Invalid UNICODE character sequence found (0xfc7220)
unitest=# \encoding latin1
unitest=# INSERT INTO test VALUES (1,'Urlaubslite für nächstes Jahr');
INSERT 19134 1
unitest=# select * from test;
 id |              val
----+-------------------------------
  1 | Urlaubslite für nächstes Jahr
(1 row)
unitest=# \encoding unicode
unitest=# select * from test;
 id |              val
----+-------------------------------
  1 | Urlaubslite für nächstes Jahr
(1 row)
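
The session above can be mimicked outside the database to see why the first INSERT fails. The Python sketch below is only an approximation of the check and conversion, not PostgreSQL's actual code, and the function names is_valid_utf8 and latin1_to_utf8 are invented for this example. It shows that the ISO 8859-1 bytes of the test string contain the sequence 0xfc 0x72 0x20 from the error message ('ü', 'r', space), that these bytes are not valid UTF-8 because 0xfc is followed by plain ASCII rather than continuation bytes, and that re-encoding from ISO 8859-1 produces valid UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO 8859-1 bytes as text and re-encode them as UTF-8."""
    return data.decode("iso8859-1").encode("utf-8")


if __name__ == "__main__":
    text = "Urlaubslite für nächstes Jahr"
    raw = text.encode("iso8859-1")  # what an 8-bit Latin-1 client sends

    # The three bytes quoted in the error message appear in the raw input.
    print("contains fc7220:", bytes.fromhex("fc7220") in raw)
    print("raw is valid UTF-8:", is_valid_utf8(raw))

    converted = latin1_to_utf8(raw)
    print("converted is valid UTF-8:", is_valid_utf8(converted))
    print("byte lengths:", len(raw), "->", len(converted))
```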


Ian Barwick
barwick@gmx.net



Re: UNICODE and SQL

From
"Marco Roda"
Date:
That's OK!
I will use the SQL variable CLIENT_ENCODING.

SET CLIENT_ENCODING TO 'LATIN1';    /* for German   */
or:
SET CLIENT_ENCODING TO 'LATIN2';    /* for Croatian */

that is the same as psql's \encoding.
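
To illustrate why the choice between LATIN1 and LATIN2 matters, here is a small Python sketch of how the same client bytes end up as different characters depending on the declared client encoding. It only imitates the conversion; the mapping from PostgreSQL encoding names to Python codec names and the function to_server_utf8 are assumptions made for this example, not an actual PostgreSQL interface.

```python
# Assumed mapping from PostgreSQL encoding names to Python codec names.
PG_TO_PYTHON_CODEC = {
    "LATIN1": "iso8859-1",
    "LATIN2": "iso8859-2",
    "UNICODE": "utf-8",
}


def to_server_utf8(raw: bytes, client_encoding: str) -> bytes:
    """Interpret raw client bytes in the declared encoding, return UTF-8."""
    codec = PG_TO_PYTHON_CODEC[client_encoding.upper()]
    return raw.decode(codec).encode("utf-8")


if __name__ == "__main__":
    # The Croatian word "čaša" as typed on an ISO 8859-2 client.
    raw = "čaša".encode("iso8859-2")

    for name in ("LATIN1", "LATIN2"):
        stored = to_server_utf8(raw, name)
        # Show what a UTF-8 database would hold for each declared encoding.
        print(name, "->", stored.decode("utf-8"))
```

With LATIN2 declared, the bytes come out as the intended Croatian text; with LATIN1 declared, the same bytes are silently stored as different characters, so the setting has to match what the client really sends.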

Thanks a lot!
Marco Roda
