Re: different sort order in windows and linux version - Mailing list pgsql-general

From	Agent M
Subject	Re: different sort order in windows and linux version
Date	July 2, 2006 13:25:59
Msg-id	910202e1b6b4f76ae4f871e165d4e01f@themactionfaction.com
In response to	Re: different sort order in windows and linux version (Martijn van Oosterhout <kleptog@svana.org>)
Responses	Re: different sort order in windows and linux version Re: different sort order in windows and linux version
List	pgsql-general

On Jul 2, 2006, at 6:13 AM, Martijn van Oosterhout wrote:
> But I don't think anyone is actually considering importing ICU into the
> postgres source tree, are they?
Why not?

> Size - I'm not sure this is relevant since I don't think we want to
> incorporate it into postgres itself, just let people use it if they
> have it. In any case though, the default dataset is 8MB. This includes
> support for every locale and charset it knows about.
>
> If you drop the conversion stuff (because postgres already has that)
> you're down to about 4MB.
Why would you drop the ICU transcoding support instead of the existing
postgres functions? Why the duplicated effort?

>> Well, the Japanese think that UTF8 is not the solution to all their
>> worries, so they won't be happy with a UTF8-only solution.  Likewise,
>> those of us who only need single-byte character sets won't be very
>> happy
>> with being forced to accept multi-byte processing overhead.
>
> I've not quite understood the Japanese problem with Unicode. My
> understanding is that it was primarily due to widespread use of broken
> converters.

Certain Japanese characters cannot make a reliable round-trip through
Unicode. ICU uses UTF-16 as its store, so the Japanese folks won't be
happy with an ICU-only solution. However, it would still be of great
benefit to allow ICU to handle as much as possible, leaving the string
encodings to the encoding experts.

At the very least, it would be great to have ICU handle encoding on a
per-column basis (perhaps by extending the text datatype with encoding
info). Perhaps this would be a decent stopgap solution? The backend
protocol would also need a version bump, since it currently converts
all strings to a single encoding.

¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
agentm@themactionfaction.com
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
