Re: to_ascii, or some other form of magic transliteration - Mailing list pgsql-general

From Ben
Subject Re: to_ascii, or some other form of magic transliteration
Msg-id FC637029-2586-42B7-888D-E4F3F519DB98@silentmedia.com
In response to Re: to_ascii, or some other form of magic transliteration  (Mike Rylander <mrylander@gmail.com>)
Responses Re: to_ascii, or some other form of magic transliteration
List pgsql-general
Hrm, I must be missing something, because I don't see how this will
transliterate to ASCII?

On Sep 10, 2005, at 5:30 AM, Mike Rylander wrote:

> On 9/9/05, Ben <bench@silentmedia.com> wrote:
>
>> I'm working on a problem that I imagine others have had, which
>> basically boils down to having nice unicode display text that users
>> are going to want to search against without typing it correctly....
>> e.g. let a search for "sma" match "små". It seems like the best way
>> to do this is to find a magic unicode transliteration mapping
>> function, and then save the ASCII transliterations for searching
>> against.
>
> The simplest solution to this that I've found is to maintain a
> separate column for an ASCII-ized version of your text.  The
> conversion can be done automatically using a trigger, and I have one
> in PL/PerlU that I use.  It basically boils down to:
>
> 1) transform the Unicode text to Normalization Form D (NFD)
> 2) strip the combining marks
>
> In modern Perls that looks like:
>
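> #--------------
> use Unicode::Normalize;
> # Decompose to Normalization Form D, so "å" becomes "a" + U+030A.
> my $txt = NFD(shift());
> # Strip every mark character (Unicode property M).
> $txt =~ s/\pM//g;
> return $txt;
> #--------------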
>
> Hope that helps!
>
>
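
For illustration, the two quoted steps can be sketched in Python rather
than Perl (a swap made only for this example; the function name
strip_marks is hypothetical and not part of the original trigger).  NFD
splits "å" into "a" plus a combining ring, and dropping the mark leaves
plain "a".  Characters with no canonical decomposition, such as "ø",
pass through unchanged, so the result is not guaranteed to be pure
ASCII.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop every character in a Mark category."""
    decomposed = unicodedata.normalize("NFD", text)
    # General categories Mn, Mc and Me all start with "M",
    # which mirrors what \pM matches in the Perl snippet.
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


if __name__ == "__main__":
    print(strip_marks("sm\u00e5"))  # "små" with the ring removed
    print(strip_marks("\u00f8"))    # "ø" has no decomposition
```

Storing the stripped form in a separate column, as suggested in the
quoted message, would then let a search for "sma" match "små".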
