Re: Reducing the overhead of NUMERIC data - Mailing list pgsql-hackers

From Gregory Maxwell
Subject Re: Reducing the overhead of NUMERIC data
Msg-id e692861c0511041615y3d7d7e13n81a004dcbafd67c7@mail.gmail.com
In response to Re: Reducing the overhead of NUMERIC data  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Reducing the overhead of NUMERIC data  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On 11/4/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
[snip]
> : ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not
> : support surrogates, and UTF-16 does support surrogates. This means
> : that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The
> : notion of UCS-2 is deprecated and dead. Unicode 2.0 in 1996 changed
> : its default encoding to UTF-16.
> <snip>

So it's fine: ICU's use of UTF-16 will not break our support for all
of Unicode. Conversion to and from UTF-16 isn't cheap, however, if
you're doing it all the time. Storing ASCII in UTF-16 is pretty lame,
since it doubles the size. Widespread use of UTF-16 tends to hide bugs
in the handling of non-BMP characters. ...  I would be somewhat
surprised to see a substantial performance difference in working with
UTF-16 data over UTF-8, but then again ... they'd know and I wouldn't.
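
To make the size and surrogate points concrete, here is a minimal Python
sketch. The sample strings and the helper names (encoded_sizes,
utf16_code_units) are made up for illustration and have nothing to do with
ICU or the backend code; it only shows what the encodings themselves do.

```python
# Illustrative only: compare encoded sizes and show a surrogate pair.
ascii_text = "hello"
non_bmp = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, a character outside the BMP


def encoded_sizes(s):
    """Return (utf8_bytes, utf16_bytes) for s, with no byte order mark."""
    return len(s.encode("utf-8")), len(s.encode("utf-16-le"))


def utf16_code_units(s):
    """Return the 16-bit UTF-16 code units of s as a list of ints."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


print(encoded_sizes(ascii_text))                    # ASCII: UTF-16 is twice as big
print(encoded_sizes(non_bmp))                       # non-BMP: 4 bytes either way
print([hex(u) for u in utf16_code_units(non_bmp)])  # one character, two code units
```

Code that treats one 16-bit unit as one character works for everything in
the BMP and only goes wrong on input like the second string, which is why
such bugs tend to stay hidden.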

Another lame aspect of using Unicode encodings other than UTF-8
internally is that it's harder to figure out what is text in GDB
output and such, which can make debugging more difficult.
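
A rough sketch of why, again in Python: the dump helper below is a
hypothetical stand-in for a byte-oriented memory view (it is not GDB's
actual output format), and the sample string is arbitrary. ASCII text
stored as UTF-8 reads straight off the bytes, while the same text in
UTF-16 is interleaved with NUL bytes, so anything that treats it as a C
string stops after the first character.

```python
def dump(raw):
    """Show printable ASCII bytes as-is and every other byte as '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def as_c_string(raw):
    """Return the bytes a NUL-terminated char* reader would see."""
    return raw.split(b"\x00", 1)[0]


text = "SELECT 1"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(dump(utf8))          # readable as-is
print(dump(utf16))         # every other byte is a NUL
print(as_c_string(utf16))  # cut off at the first NUL byte
```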

