Re: [HACKERS] UTF8 or Unicode - Mailing list pgsql-patches

From	Tom Lane
Subject	Re: [HACKERS] UTF8 or Unicode
Date	March 2, 2005 20:54:39
Msg-id	11919.1109786060@sss.pgh.pa.us
In response to	Re: [HACKERS] UTF8 or Unicode (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses	Re: [HACKERS] UTF8 or Unicode
List	pgsql-patches

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> The correct encoding name is "UTF-8".

> True, but Peter says the ANSI standard calls it UTF8 so that's what I
> used.

What SQL99 actually says is

         -  UTF8 specifies the name of a character repertoire that consists
            of every character represented by The Unicode Standard Version
            2.0 and by ISO/IEC 10646 UTF-8, where each character is encoded
            using the UTF-8 encoding, occupying from 1 (one) through 6
            octets.

That is, "UTF8" is an identifier chosen to refer to an encoding which
they know perfectly well is really called UTF-8.  We should probably
follow the same convention of using UTF8 in code identifiers and UTF-8
in documentation.  In particular, UTF_8 with an underscore is sanctioned
by nobody and should be avoided.

            regards, tom lane
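
As an illustration of the convention proposed above (UTF8 in code identifiers, UTF-8 in documentation, UTF_8 nowhere), here is a minimal sketch. The function and constant names are hypothetical and are not part of PostgreSQL; the choice to reject the underscore spelling outright is an assumption made for the example, not something the message prescribes.

```python
# Hypothetical helpers illustrating the naming convention; these names
# are invented for this sketch and do not come from PostgreSQL.

CODE_IDENTIFIER = "UTF8"   # spelling used in code identifiers
DOC_NAME = "UTF-8"         # spelling used in documentation and messages


def normalize_encoding_name(name: str) -> str:
    """Return the identifier form "UTF8" for an accepted spelling.

    Accepts "UTF8" and "UTF-8" in any letter case. The underscore
    spelling "UTF_8" is rejected (an assumption of this sketch).
    """
    folded = name.strip().upper()
    if folded in ("UTF8", "UTF-8"):
        return CODE_IDENTIFIER
    raise ValueError(f"unrecognized encoding name: {name!r}")


def display_name(identifier: str) -> str:
    """Return the documentation spelling for a code identifier."""
    if identifier == CODE_IDENTIFIER:
        return DOC_NAME
    raise ValueError(f"unknown encoding identifier: {identifier!r}")


if __name__ == "__main__":
    ident = normalize_encoding_name("utf-8")
    print(ident, display_name(ident))
```

The point of the split is that the identifier never contains a hyphen, so it is safe wherever a bare name is required, while text shown to readers keeps the encoding's real name.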
