Re: Win32 unicode vs ICU - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Win32 unicode vs ICU
Msg-id 24642.1124554667@sss.pgh.pa.us
List pgsql-hackers
[ moving to -hackers for wider discussion ]

"Magnus Hagander" <mha@sollentuna.net> wrote in
http://archives.postgresql.org/pgsql-patches/2005-08/msg00039.php

>> I've been working with Palle's ICU patch to make it work on 
>> win32, and I believe I have it done. While doing it I noticed 
>> that ICU basically converts to UTF16 and back - I previously 
>> thought it worked on UTF8 strings. Based on this I also tried 
>> out an implementation for the win32-unicode problem that does 
>> *not* require ICU. It uses the win32 native functions to map 
>> to utf16 and back, and then to process the text there. And I 
>> got through with much less code than the ICU version, while 
>> doing the same thing.
>>  
>> I am unsure of how to proceed. As I see it there are three paths:
>> 1) Use native win32 functionality only on win32
>> 2) Use ICU functionality only on win32
>> 3) Allow both ICU and native functionality, compile time 
>>    switch --with-icu (same as unix with the ICU patch)

We need to figure out what we're going to do about this.  Given where
we are in the release cycle, I am pretty strongly tempted to just apply
the smaller patch (just map utf8/utf16 using Windows native functions)
for PG 8.1.
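
For readers unfamiliar with the approach, here is a minimal sketch of what
"map utf8/utf16 using Windows native functions" could look like. It is not
the patch under discussion: the names utf8_coll_cmp() and to_wide() are
hypothetical, and the non-Windows branch (plain mbstowcs() in the current C
locale) is only there so the sketch compiles elsewhere. On Win32 it relies on
MultiByteToWideChar() with CP_UTF8 and then compares with wcscoll().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Convert a NUL-terminated string to a freshly malloc'd wide string.
 * Returns NULL on conversion or allocation failure; caller must free().
 * (Hypothetical helper, not a function from the patch.)
 */
static wchar_t *
to_wide(const char *s)
{
    wchar_t    *w;
#ifdef _WIN32
    /* First call asks for the required length, including the NUL. */
    int         n = MultiByteToWideChar(CP_UTF8, 0, s, -1, NULL, 0);

    if (n <= 0)
        return NULL;
    w = malloc((size_t) n * sizeof(wchar_t));
    if (w == NULL)
        return NULL;
    if (MultiByteToWideChar(CP_UTF8, 0, s, -1, w, n) <= 0)
    {
        free(w);
        return NULL;
    }
#else
    /* Fallback: each wide char consumes at least one input byte. */
    size_t      max = strlen(s) + 1;

    w = malloc(max * sizeof(wchar_t));
    if (w == NULL)
        return NULL;
    if (mbstowcs(w, s, max) == (size_t) -1)
    {
        free(w);
        return NULL;
    }
#endif
    return w;
}

/*
 * Compare two strings by converting both to wide characters and calling
 * wcscoll(). Falls back to strcmp() if either conversion fails.
 */
int
utf8_coll_cmp(const char *a, const char *b)
{
    wchar_t    *wa;
    wchar_t    *wb;
    int         result;

    assert(a != NULL && b != NULL);

    wa = to_wide(a);
    wb = to_wide(b);
    if (wa == NULL || wb == NULL)
        result = strcmp(a, b);
    else
        result = wcscoll(wa, wb);

    free(wa);
    free(wb);
    return result;
}
```

The result follows whatever locale the C runtime currently has set, so a
caller would normally have called setlocale() beforehand; without that, the
comparison is effectively by code unit.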

I think that ICU would be interesting as the base for a much larger
patch that gets us away from depending on libc's locale support at all
(in particular, getting rid of the "one locale per database" problem).
But it seems like a heck of a big dependency to incur for any lesser goal.

I feel it makes sense to apply the smaller patch in any case, so that
there's a Win32 solution not requiring ICU (i.e., I can't see an argument
for doing (2) rather than (3)).

Comments?

Also,

> And another question - my native patch touches the same 
> functions as the ICU patch. Can somebody who knows the 
> internals confirm or deny that these are all the required 
> locations, or do we need to modify more?

There is a strxfrm() call in src/backend/utils/adt/selfuncs.c,
which probably needs to be looked at too.
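
As a reminder of why strxfrm() is affected in the same way as strcoll():
both depend on libc's locale, and strcmp() on two strxfrm() results is
supposed to order the same way as strcoll() on the original strings. The
sketch below is a generic illustration in standard C, not the selfuncs.c
code; xfrm_dup() and xfrm_cmp() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of the strxfrm() image of s, or NULL if
 * allocation fails. Caller must free().
 */
char *
xfrm_dup(const char *s)
{
    size_t      len;
    char       *out;

    assert(s != NULL);

    /* With n == 0 the destination may be NULL; the result is the length needed. */
    len = strxfrm(NULL, s, 0);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    strxfrm(out, s, len + 1);
    return out;
}

/*
 * Compare two strings via their transformed images. In any given locale
 * the sign should agree with strcoll(a, b).
 */
int
xfrm_cmp(const char *a, const char *b)
{
    char       *xa = xfrm_dup(a);
    char       *xb = xfrm_dup(b);
    int         result;

    if (xa == NULL || xb == NULL)
        result = strcoll(a, b);     /* out of memory: compare directly */
    else
        result = strcmp(xa, xb);

    free(xa);
    free(xb);
    return result;
}
```

If strcoll() is replaced by a wide-character comparison on Win32, any caller
of strxfrm() would presumably need a matching wide-character transform
(wcsxfrm() is the standard C counterpart) to stay consistent.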
        regards, tom lane

