I was reviewing this thread about the lack of collation support on FreeBSD.
As some of you may know, the PHP project is also currently working
heavily on Unicode support (for PHP 6).
I had the chance to ask Andrei Zmievski of the PHP project about their
support for Unicode. The key points are as follows:
<StormTide> with the new unicode support, is there any support for unicode
collation
<andrei> StormTide, there will be
<StormTide> is it imported by the platform or custom done for php
<StormTide> (cuz freebsd seems to have issues with its collation support)
<andrei> StormTide, not OS-dependent
<andrei> StormTide, uses CLDR
Should the PostgreSQL project also be looking at CLDR for cross-platform
Unicode support?
http://www.unicode.org/cldr/
Kevin McArthur
Digifonica Canada
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Magnus Hagander" <mha@sollentuna.net>
Cc: <pgsql-hackers@postgresql.org>; "Palle Girgensohn" <girgen@pingpong.net>
Sent: Tuesday, August 23, 2005 9:03 AM
Subject: Re: [HACKERS] Win32 unicode vs ICU
>I wrote:
>> (Just looking at it again, the code in convert_string_to_scalar is
>> pretty bogus for multibyte encodings in any case. Possibly we need to
>> rethink the whole approach.)
>
> After studying this some more, I think the code is really so bogus for
> any non-ASCII situation that it's probably not worth worrying about
> too much. It's effectively assuming that the output of strxfrm() is
> still in an ASCII-superset encoding ... but I don't see anything in
> strxfrm's API that guarantees any such thing.
>
> As long as strxfrm() doesn't fail completely for Windows Unicode,
> I'd recommend just leaving this alone. As previously noted, the
> worst that can happen is an estimation error that's bounded by the
> histogram bin size anyhow.
>
> regards, tom lane
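
To illustrate the strxfrm() point in the quoted message, here is a minimal
sketch in Python, whose locale module is (as far as I know) a thin wrapper
over the C library's collation routines. It is not PostgreSQL code. The
helper names xfrm_agrees_with_coll and describe are made up for this
example. The sketch checks the one property the API does promise, that
comparing transformed strings orders them the same way strcoll() does, and
dumps the transformed values so you can see that nothing requires them to
look like ASCII text.

```python
import locale


def xfrm_agrees_with_coll(a: str, b: str) -> bool:
    """Return True if comparing strxfrm() results orders a and b
    the same way strcoll() does (the documented contract)."""
    ta, tb = locale.strxfrm(a), locale.strxfrm(b)
    by_xfrm = (ta > tb) - (ta < tb)   # sign of the transformed comparison
    c = locale.strcoll(a, b)
    by_coll = (c > 0) - (c < 0)       # sign of the strcoll() result
    return by_xfrm == by_coll


def describe(s: str) -> str:
    """Render the transformed string as hex code points. Outside the
    "C" locale these values need not resemble the input at all."""
    return " ".join(f"{ord(ch):04x}" for ch in locale.strxfrm(s))


if __name__ == "__main__":
    try:
        # Use whatever collation locale the environment specifies.
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Fall back to the portable "C" locale if that one is unavailable.
        locale.setlocale(locale.LC_COLLATE, "C")
    for word in ["abc", "Abc", "abd"]:
        print(word, "->", describe(word))
    print("ordering agrees:", xfrm_agrees_with_coll("abc", "abd"))
```

The transformed values differ between platforms and locales, so code that
treats them as ordinary text is relying on something the interface does not
guarantee.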