Using multi-locale support in glibc - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Using multi-locale support in glibc
Msg-id 20050901155741.GE28062@svana.org
Responses Re: Using multi-locale support in glibc
List pgsql-hackers
Browsing the glibc stuff for locales I noticed that glibc does actually
allow you to specify the collation order to strcoll and friends. The
feature is however marked with:
  Attention: all these functions are *not* standardized in any form.  This is a proof-of-concept implementation.

They do however work fine. I used my taggedtypes module to create a
type that binds the collation order to the text strings and the results
can be seen below.
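
For anyone who wants to try the interface directly, here is a minimal
sketch of a per-locale comparison. It assumes a C library that exposes the
calls under their POSIX.1-2008 names (newlocale, strcoll_l, freelocale), as
current glibc does. The helper name sorts_before is made up for
illustration and is not part of the taggedtypes module.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <string.h>

/* Hypothetical helper (not from taggedtypes): does a sort before b under
 * the collation rules of the named locale?
 * Returns 1 if a sorts before b, 0 if it does not, and -1 if the C
 * library cannot load the requested locale. */
int sorts_before(const char *a, const char *b, const char *locale_name)
{
    /* Build a locale object that carries only the collation category. */
    locale_t loc = newlocale(LC_COLLATE_MASK, locale_name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    /* strcoll_l takes the locale explicitly, so the process-wide locale
     * set by setlocale() is never touched. */
    int cmp = strcoll_l(a, b, loc);

    freelocale(loc);
    return cmp < 0;
}
```

Calling sorts_before("Test2", "test1", "C") should return 1, since the C
locale compares bytes and 'T' precedes 't'. A locale that is not installed
gives -1, which corresponds to the "not supported by library" error in the
test output.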

1. Is something supported only by glibc usable for us (re portability to
non-glibc platforms)?

2. Should we be trying to use an interface that's specifically marked
as unstable?

3. What's the plan to support multiple collate orders? There was a
message about it last year but I don't see much progress.

4. It makes some things more difficult. For example, my database is
UNICODE and the results didn't come out right until I specified a UTF8
locale. AFAIK the only easy way to determine whether a locale is UTF8
compatible is to run locale -k charmap; the C interface is hidden. It
should be possible to compile a list of locales and allow only the ones
matching the database encoding, or to convert the strings automatically,
since the conversion functions exist.

5. Maybe we should evaluate the interface and give feedback to the
glibc developers to see if it can be made more stable.
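
On point 4, a possible sketch of the charmap check done from C rather than
by running locale -k charmap. It assumes the C library provides
nl_langinfo_l with the CODESET item, which POSIX.1-2008 specifies and
current glibc implements. I have not checked whether the glibc of this
era exposed it publicly. The helper name locale_has_codeset is
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: does the named locale use the given codeset
 * (for example "UTF-8")?
 * Returns 1 on a match, 0 on a mismatch, and -1 if the C library
 * cannot load the locale. */
int locale_has_codeset(const char *locale_name, const char *codeset)
{
    /* The codeset belongs to the LC_CTYPE category. */
    locale_t loc = newlocale(LC_CTYPE_MASK, locale_name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    /* Compare before freeing: the returned string belongs to loc. */
    int match = strcmp(nl_langinfo_l(CODESET, loc), codeset) == 0;

    freelocale(loc);
    return match;
}
```

With something like this, a list of installed locale names could be
filtered down to the ones where locale_has_codeset(name, "UTF-8") returns
1 before they are offered for a UNICODE database. The exact codeset
spelling is whatever the C library reports; glibc uses "UTF-8" for its
UTF-8 locales.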

If you want to have a look to see what's available, use:
rgrep -3 locale_t /usr/include/ |less

Have a nice day,

PS. The code to test this can be found at:
http://svana.org/kleptog/pgsql/taggedtypes.html

--- TEST OUTPUT ---

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'C' );
 strings
---------
 Test2
 Tést1
 Tëst1
 test1
 tèst2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'en_US' );
 strings
---------
 Tëst1
 Tést1
 tèst2
 test1
 Test2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'nl_NL' );
ERROR:  Locale 'nl_NL' not supported by library
test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'en_AU.UTF-8' );
 strings
---------
 test1
 Tést1
 Tëst1
 Test2
 tèst2
(5 rows)
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
