Re: UTF16 surrogate pairs in UTF8 encoding - Mailing list pgsql-hackers

From	Marko Kreen
Subject	Re: UTF16 surrogate pairs in UTF8 encoding
Date	September 8, 2010 04:18:45
Msg-id	AANLkTikiWsunoVFqb0mceH59LvSQf1vt7-QCZeJL5ZGY@mail.gmail.com
In response to	Re: UTF16 surrogate pairs in UTF8 encoding (Peter Eisentraut <peter_e@gmx.net>)
Responses	Re: UTF16 surrogate pairs in UTF8 encoding
List	pgsql-hackers

On 9/7/10, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
>  > > We combine the surrogate pair components to a single code point and
>  > > encode that in UTF-8.  We don't encode the components separately;
>  > > that would be wrong.
>  >
>  > Oh, OK.  Should the docs make that a bit clearer?
>
>
> Done.

This is confusing:
(When surrogate pairs are used when the server encoding is
<literal>UTF8</>, they are first combined into a single code point
that is then encoded in UTF-8.)

So something else happens if encoding is not UTF8?

I think this part can simply be removed; it does not add anything.

Or say that surrogate pairs are only allowed in UTF8 encoding.
The reason is that you cannot encode the 0..7F code points with them,
and in other server encodings only those are allowed to be given
numerically.  But this is already mentioned before.
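
For reference, the "combine, then encode" step from the quoted doc text
can be sketched like this.  It is only an illustration in Python, not the
server's actual code; the function name combine_surrogates and the sample
pair D83D/DE00 are made up for the example.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("first value is not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("second value is not a low surrogate")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Hypothetical sample pair, chosen only for illustration.
code_point = combine_surrogates(0xD83D, 0xDE00)

# Encode the combined code point once: a single 4-byte UTF-8 sequence.
utf8 = chr(code_point).encode("utf-8")

print(hex(code_point), utf8.hex())
```

Since every combined value is at least 0x10000, a pair can never stand
for something in 0..7F, which is the point made above.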

--
marko
