
Re: UTF-8 question. - Mailing list pgsql-general

From	Pierre-Frédéric Caillaud
Subject	Re: UTF-8 question.
Date	September 17, 2004 07:40:56
Msg-id	opsegkwdckcq72hf@musicbox
In response to	UTF-8 question. ("Richard Connamacher" <rich.n1@indieimage.com>)
List	pgsql-general


=> show client_encoding ;
  client_encoding
-----------------
  UNICODE
(1 row)
=> select char_length('a'), bit_length('a');
  char_length | bit_length
-------------+------------
            1 |          8
(1 row)


# that's an accented "e"
=> select char_length('é'), bit_length('é');
  char_length | bit_length
-------------+------------
            1 |         16        <= two bytes
(1 row)


    pg does not just store UTF-8 data, it also understands it, provided you
set your encodings correctly (i.e. initdb with the UNICODE encoding, and
client_encoding set to UNICODE too, so that data doesn't get mangled on the
way to the db). It will also refuse invalid UTF-8 byte sequences.
    Once you try Unicode, the whole codepage mess starts to look old...
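
    For what it's worth, the same arithmetic can be checked outside the database. The snippet below is only a sketch in plain Python, not PostgreSQL code; the helper names utf8_lengths and is_valid_utf8 are made up for this illustration. It shows that "a" takes one byte and "é" takes two in UTF-8, and that a lone Latin-1 byte is not a valid UTF-8 sequence, which is the kind of input the server would be expected to reject.

```python
# Illustration only: plain Python, not PostgreSQL.
# Helper names are hypothetical and exist just for this example.

def utf8_lengths(text: str) -> tuple:
    """Return (character count, bit count of the UTF-8 encoding)."""
    encoded = text.encode("utf-8")
    return len(text), len(encoded) * 8


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_lengths("a"))            # (1, 8)
print(utf8_lengths("\u00e9"))       # (1, 16): accented "e", two bytes
print(is_valid_utf8(b"\xc3\xa9"))   # True: UTF-8 encoding of the accented "e"
print(is_valid_utf8(b"\xe9"))       # False: Latin-1 byte on its own
```

    The second and fourth calls are the interesting ones: the same character is one byte (0xE9) in Latin-1 and two bytes (0xC3 0xA9) in UTF-8, so sending Latin-1 data over a connection declared as UNICODE is exactly the mangling mentioned above.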

On Thu, 16 Sep 2004 20:39:48 -0400, Richard Connamacher
<rich.n1@indieimage.com> wrote:

> I'm new to PostgreSQL, and from the looks of it, it's a great database,
> and I'll be using more of it in the future.
>
> I had a quick question if anyone could clear this up. The documentation
> for PostgreSQL (version 7.1, the version this server is using) says that
> it supports multibyte character encodings like Unicode (which implies
> UTF-16 encoding). Later on, the same page says that Unicode is
> represented using UTF-8 encoding. UTF-8 is the 8-bit version of Unicode.
> The multibyte version of Unicode is UTF-16.
>
> So, which is it? If I create a database using Unicode as the encoding,
> will the encoding be UTF-8 (singlebyte) or UTF-16 (multibyte)?
>
> Thanks!
> Rich
