Re: [SQL] Internationalisation: SELECT str (ignoring Umlauts/Accents) - Mailing list pgsql-sql

From Patrice Hédé
Subject Re: [SQL] Internationalisation: SELECT str (ignoring Umlauts/Accents)
Date
Msg-id Pine.LNX.3.96.980617180612.3678C-100000@paris.ivo.fr
In response to Re: [SQL] Internationalisation: SELECT str (ignoring Umlauts/Accents)  (Benedikt Eric Heinen <beh@icemark.ch>)
Responses Re: [SQL] Internationalisation: SELECT str (ignoring Umlauts/Accents)  (Benedikt Eric Heinen <beh@icemark.ch>)
List pgsql-sql
On Wed, 17 Jun 1998, Benedikt Eric Heinen wrote:

> > I don't know what you exactly looking for : a specific solution, or a
> > general one. If this is the second case, you have to take care that
> > different languages have different ways for dealing with crippled texts (
> > = without accents...).
>
> Oh well, let me extend the question then, what I am looking for is a
> solution that works for Switzerland, e.g. a country with 4 official
> languages (one of which basically gets ignored) and a 5th "major"
> language. So, I need a search function to look for German, French, Italian
> and English names (I am not doing Rumantsch [the 4th official language in
> Switzerland], as I don't know anything about the language except for that
> only a few thousand people in Switzerland are left actually using it).

Do you mean you have a single field containing German *and* French *and*
Italian *and* English words, and you want people, whether German-, French-,
Italian- or English-speaking, to be able to search this field without
typing the accents?

As I said earlier, you may run into problems, since `ae' doesn't mean `ä'
to most of these people (only to the German-speaking ones), and the others
may type `a' instead. As the rules differ between the languages, it's
difficult to have a single solution. However, you *need* a solution.
Maybe I, or others ;), can help. Some questions: what is your
interface language (if it's Perl, it can be much easier :) )? Can it be a
client-side solution, or do you absolutely need a server-side one (which
would then have to be a C function, I think)?

And then, what kind of conversions do you need? For example, for French,
I decided to treat all variants of each of a, e, i, o, u, y as equal,
which meant:

any of a,A,à,À,æ,Æ,å,Å,â,Â,á,Á,ä,Ä => a,A,à,À,æ,Æ,å,Å,â,Â,á,Á,ä,Ä
etc.
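
A minimal sketch of that kind of equivalence, in Python rather than the
Perl or C mentioned above. It assumes Unicode text; the names `fold`,
`same_ignoring_accents` and `NO_DECOMPOSITION` are hypothetical, made up
for this illustration. Letters are decomposed and their combining marks
dropped, and æ/Æ (which have no decomposition) are mapped by hand into
the `a' class, as in the list above.

```python
import unicodedata

# Letters without a canonical decomposition need an explicit mapping.
# Putting ae-ligatures in the "a" class follows the list in this message.
NO_DECOMPOSITION = {"æ": "a", "Æ": "A"}


def fold(text):
    """Return text with accents removed from every letter."""
    out = []
    for ch in text:
        ch = NO_DECOMPOSITION.get(ch, ch)
        # NFD splits e.g. "ä" into "a" plus a combining diaeresis.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


def same_ignoring_accents(a, b):
    """Compare two strings, ignoring accents and case."""
    return fold(a).lower() == fold(b).lower()
```

With this, same_ignoring_accents("Hédé", "hede") is true. The sketch
deliberately knows nothing about `ae'.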

Obviously, in your case, it will be more complex, since `ae' *may* have a
special meaning... (that's where it's getting difficult :( )...

You need to decide which rules you want (and whether they can differ
between people, between fields, and so on).
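
One possible way to combine two rule sets, sketched in Python under my
own assumptions: each string gets a set of search keys, one per rule,
and two strings match if they share a key. The names `search_keys`,
`matches` and `GERMAN_DIGRAPHS` are hypothetical, and the digraph table
is the usual German transliteration, not something taken from this
thread.

```python
import unicodedata

# Assumed German rule: umlauts and sharp s expand to two letters.
GERMAN_DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def strip_accents(text):
    """Drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def search_keys(text):
    """Return the keys for text under both rules (plain and German)."""
    lowered = text.lower()
    plain = strip_accents(lowered)
    german = strip_accents(
        "".join(GERMAN_DIGRAPHS.get(c, c) for c in lowered))
    return {plain, german}


def matches(stored, query):
    """True if stored and query share at least one search key."""
    return bool(search_keys(stored) & search_keys(query))
```

Here "Müller" is found by "Muller", "Mueller" and "müller". The price
is false positives: "Isräl" also matches "Israel", which is exactly the
`ae' ambiguity, so which rules to enable remains a per-application
decision.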

--------------------------
[HACKERS]
By the way, I've looked at the regex source, which has an interesting
concept: [ä[.ae.]] should match ä and ae equally, treating "ae" as a
single entity. However, this does not work, maybe because the server
doesn't use any locale. It would be really helpful if we could do
something about l10n for the next release! At least have something that
can set the locale from psql, e.g. "set locale to 'xx';" (that wouldn't
help for multi-locale queries, but it would be better than nothing).
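
To illustrate the intent of [ä[.ae.]] outside the server: Python's `re`
module has no collating elements either, so this sketch emulates the
idea with an explicit alternation. The function name `umlaut_pattern`
is hypothetical, and it assumes the query uses the precomposed ä
(NFC form).

```python
import re


def umlaut_pattern(query):
    """Build a regex in which each ä also accepts the digraph ae."""
    parts = []
    for ch in query:
        if ch == "ä":
            # Stand-in for the bracket expression [ä[.ae.]].
            parts.append("(?:ä|ae)")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))
```

umlaut_pattern("Jäger") then accepts "Jäger" and "Jaeger" in full, but
not "Jager".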

Patrice



