I do not object to changing UNICODE->UTF-8, but all these discussions
sound a little bit funny to me.
If you want to blame UNICODE, you should blame LATIN1 etc. as
well. LATIN1 (ISO-8859-1) is actually a character set name, not an
encoding name. ISO-8859-1 can be encoded as an 8-bit single-byte
stream, but it can be encoded in 7 bits too. So when we refer to
LATIN1 (ISO-8859-1), it's not clear whether it's encoded in 7 or 8 bits.
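
To illustrate the character set vs. encoding distinction, here is a
small Python sketch (not PostgreSQL code). It takes one character that
exists in both ISO-8859-1 and Unicode and shows that its byte form
depends on the encoding chosen. The quoted-printable step is only my
stand-in for "some 7-bit representation"; it is a transfer encoding and
an assumption on my part, not necessarily the 7-bit scheme meant above.

```python
import quopri

# One character, code point 233 in both ISO-8859-1 and Unicode.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Same character set position, different byte sequences per encoding.
latin1_bytes = ch.encode("iso-8859-1")  # single 8-bit byte
utf8_bytes = ch.encode("utf-8")         # two bytes
utf16_bytes = ch.encode("utf-16-be")    # two bytes, different values

# Assumption for illustration: quoted-printable as a 7-bit-safe form
# of the ISO-8859-1 byte stream.
seven_bit_bytes = quopri.encodestring(latin1_bytes)

print("ISO-8859-1 (8-bit):", latin1_bytes.hex())
print("UTF-8:             ", utf8_bytes.hex())
print("UTF-16BE:          ", utf16_bytes.hex())
print("7-bit safe form:   ", seven_bit_bytes)
```

The name of the character set alone does not tell you which of these
byte streams you will get.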
--
Tatsuo Ishii
From: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Re: [HACKERS] UTF8 or Unicode
Date: Mon, 21 Feb 2005 22:08:25 -0500 (EST)
Message-ID: <200502220308.j1M38PV03238@candle.pha.pa.us>
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > I think we just need to _favor_ UTF8.
> >
> > I agree.
> >
> > > The question is where are we
> > > favoring Unicode rather than UTF8?
> >
> > It's the canonical name of the encoding, both in the code and the docs.
> >
> > regression=# create database e encoding 'utf-8';
> > CREATE DATABASE
> > regression=# \l
> >         List of databases
> >     Name    |  Owner   | Encoding
> > ------------+----------+-----------
> >  e          | postgres | UNICODE
> >  regression | postgres | SQL_ASCII
> >  template0  | postgres | SQL_ASCII
> >  template1  | postgres | SQL_ASCII
> > (5 rows)
> >
> > As soon as we decide whether the canonical name is "UTF8" or "UTF-8"
> > ;-) we can fix it.
>
> I checked and it looks like "UTF-8" is the correct usage:
>
> http://www.unicode.org/glossary/
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman@candle.pha.pa.us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073