Re: FW: Win32 unicode vs ICU - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: FW: Win32 unicode vs ICU |
Date | |
Msg-id | 200603210341.k2L3fx717718@candle.pha.pa.us Whole thread Raw |
In response to | FW: Win32 unicode vs ICU ("Magnus Hagander" <mha@sollentuna.net>) |
List | pgsql-patches |
Is this patch moving toward competion? --------------------------------------------------------------------------- Magnus Hagander wrote: > I just realised this mail didn't go through. Probably because it was too > large for -hackers. So: repost to -patches. Sorry about that. If it's a > duplicate, even more sorry, but I couldn't find it in the archives. > > (This may explain that nobody answered me :P) > > //Magnus > > > > -----Original Message----- > > From: Magnus Hagander > > Sent: Sunday, July 31, 2005 2:09 PM > > To: PostgreSQL-development > > Cc: pgsql-hackers-win32@postgresql.org > > Subject: Win32 unicode vs ICU > > > > Hi! > > > > I've been working with Palles ICU patch to make it work on > > win32, and I believe I have it done. While doing it I noticed > > that ICU basically converts to UTF16 and back - I previously > > thought it worked on UTF8 strings. Based on this I also tried > > out an implementation for the win32-unicode problem that does > > *not* require ICU. It uses the win32 native functions to map > > to utf16 and back, and then to process the text there. And I > > got through with much less code than the ICU version, while > > doing the same thing. > > > > I am unsure of how to proceed. As I see it there are three paths: > > 1) Use native win32 functionality only on win32 > > 2) Use ICU functionality only on win32 > > 3) Allow both ICU and native functionality, compile time > > switch --with-icu (same as unix with the ICU patch) > > > > > > The main downsides of ICU vs the native ones are: > > * ICU does not accept win32 locale names. When doing > > setlocale("sv_se"), for example, win32 will return this in > > later calls as "Swedish_Sweden.1252". To get around this in > > the ICU patch, I had to implement a lookup map that converts > > it back to sv_se for ICU. > > > > * ICU is yet another build and runtime dependency, and a > > large one (comes in at 11Mb for the DLL files alone in the > > win32 download) > > > > > > I guess that the main upside of it is that we'd get > > constistent behaviour - in case there are issues with either > > ICU or win32 native they'd otherwise differ. And only one new > > codepath. But we already live with the platform-inconsistency today... > > > > Another upside is that it handles more encodings in ICU - my > > native implementation does *only* UTF8 and relies on existing > > functionality to deal with other encodings. It could of > > course be extended if necessary, but from what I can tell > > UTF8 is the big one. > > > > > > > > I have attached both patches. For the native version, only > > win32_utf8.patch is required. For the ICU version, > > icu_win32.patch is needed and also the files > > localemap.c,localemap.pl, iso639 and iso3166 needs to go in > > src/backend/port/win32. (the localemap needs to be updated to > > do a better-than-linear search, but I wanted to include an example) > > > > > > Thoughts on the options? > > > > > > And anohter question - my native patch touches the same > > functions as the ICU patch. Can somebody who knows the > > internals confirm or deny that these are all the required > > locations, or do we need to modify more? > > > > (I have run simple tests in swedish locale and both behave > > the same and correct, but I'm unsure of exactly how much > > would be affected) > > > > Finally, the win32 patch also changes the normal path to use > > strncoll(). The comment above the function states that we'd > > like to use strncoll but it's not available. Well, on win32 > > it is, so it should provide a speedup on win32. It is > > currently not included in the ICU patch, but should probably > > be included whichever path we'd chose. > > > > > > //Magnus > > Content-Description: win32_utf8.patch [ Attachment, skipping... ] Content-Description: icu_win32.patch [ Attachment, skipping... ] Content-Description: localemap.pl [ Attachment, skipping... ] Content-Description: localemap.c [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
pgsql-patches by date: