Re: Encoding, Unicode, locales, etc. - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Encoding, Unicode, locales, etc.
Msg-id 20061101195030.GA4445@svana.org
In response to Re: Encoding, Unicode, locales, etc.  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Responses Re: Encoding, Unicode, locales, etc.
List pgsql-general
On Wed, Nov 01, 2006 at 11:41:43AM +0100, Karsten Hilbert wrote:
> Could this paragraph be put into the docs and/or the FAQ,
> please ? Along with the recommendation that if you require
> multiple encodings for your databases you better had your OS
> locale configured properly for UTF8 and use UNICODE
> databases or do initdb with the C-locale.

Err, multiple encodings don't work, full stop. Any particular locale (as
defined by POSIX) is only really designed to work with one encoding.
The fact that the C locale produces an order when sorting UTF8 text is
really just luck.
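
To make the C-locale point concrete, here is a rough sketch in Python. It
assumes that C-locale comparison can be modelled as a plain byte-by-byte
comparison of the encoded strings; the helper name c_locale_sort and the
sample words are made up for illustration and are not anything PostgreSQL
provides.

```python
# Sketch only: approximate C-locale ordering as raw byte comparison.
# Assumption: the C locale compares strings byte by byte (strcmp-like).

def c_locale_sort(strings):
    """Sort strings by their UTF-8 encoded bytes (hypothetical helper)."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))

# Made-up sample data mixing ASCII and non-ASCII initial letters.
words = ["zebra", "Äpfel", "apple", "Zebra", "éclair"]

by_bytes = c_locale_sort(words)   # order a byte-wise comparison gives
by_codepoint = sorted(words)      # Python compares str by code point

print(by_bytes)
print(by_bytes == by_codepoint)
```

The byte order of UTF8 happens to match code point order, so the result is
at least stable and well defined. It is not a linguistic order though:
uppercase sorts before lowercase, and accented letters land after all of
ASCII instead of next to their base letters.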

In hindsight the people in POSIX who decided to tie locale and encoding
into one variable should probably be shot, but it's a bit late now.

> > This stuff is certainly far from ideal, but the amount of work involved
> > to fix it is daunting; see many past pg-hackers discussions.
>
> Here are a few data points from my Debian/Testing system in
> favour of not worrying too much about installed ICU size as
> it is being used by other packages anyways:

We'd need a suitable patch first before we start worrying about that. I
think disk space is less of an issue now. There are discussions going on
about the clog and the xlog taking up dozens of megabytes. At the
end of the day I don't think 10MB for the Unicode data is going to be
that big a deal, *if* the patch solves all the problems in this area in
a reasonably clean way...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
