Thread: RFC: i18n2ascii(TEXT) stored procedure
I've created the following stored procedure to allow me to do
international-insensitive text searches, e.g. a search for "Resume" would
match the text "Résumé".

I wanted to know:

a) am I missing any characters that need to be converted? My first (and only
   language) is English, so I'm in the dark when that is concerned;
b) is there a better and/or faster way of implementing this? I don't want
   searches to bog down (at least too badly) as a result of this.

CREATE OR REPLACE FUNCTION i18n2ascii (TEXT) RETURNS TEXT AS '
  my ($source) = @_;
  $source =~ tr/áàâäéèêëíìîïóòôöúùûüÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ/aaaaeeeeiiiioooouuuuAAAAEEEEIIIIOOOOUUUU/;
  return $source;
' LANGUAGE 'plperl';

--
/* Michael A. Nachbaur <mike@nachbaur.com>
 * http://nachbaur.com/pgpkey.asc
 */
"Ah, " said Arthur, "this is obviously some strange usage of the word safe
that I wasn't previously aware of. "

Michael A Nachbaur <mike@nachbaur.com> writes:

> b) is there a better and/or faster way of implementing this? I
> don't want searches to bog down (at least too badly) as a result of
> this.

Use to_ascii(text),

masm=# select to_ascii('áéíóú');
 to_ascii
----------
 aeiou
(1 row)

Regards,
Manuel.

On Thursday 25 September 2003 05:06 pm, Manuel Sugawara wrote:
> Michael A Nachbaur <mike@nachbaur.com> writes:
> > b) is there a better and/or faster way of implementing this? I
> > don't want searches to bog down (at least too badly) as a result of
> > this.
>
> Use to_ascii(text),
[snip]

D'oh! I guess that's what I get for not RTFM. :-)

--
/* Michael A. Nachbaur <mike@nachbaur.com>
 * http://nachbaur.com/pgpkey.asc
 */
"Oh no, not again."

Michael A Nachbaur writes:

> a) am I missing any characters that need to be converted?

In Unicode, any character can be dynamically combined with any number of
accent characters, so an enumerated list will never do.

--
Peter Eisentraut   peter_e@gmx.net

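To illustrate the point about combining accents, here is a minimal sketch in Python (not PL/Perl or SQL, and not something anyone in the thread posted). It assumes the text is available as a Unicode string, decomposes it with the standard `unicodedata` module, and drops the combining marks instead of enumerating accented letters. The name `strip_accents` is made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove accents without an enumerated list."""
    # NFD splits a precomposed letter such as U+00E9 into its base
    # letter followed by the combining accent (U+0065 U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks, so this
    # keeps base characters and drops every attached accent.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Résumé"))
# A base letter carrying two stacked combining accents is handled too.
print(strip_accents("e\u0301\u0302"))
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged, so this approach covers combined accents only and is not a general transliteration.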
On Thu, 25 Sep 2003, Michael A Nachbaur wrote:

> I've created the following stored procedure to allow me to do
> international-insensitive text searches, e.g. a search for "Resume" would
> match the text "Résumé".
>
> I wanted to know:
>
> a) am I missing any characters that need to be converted? My first (and only
> language) is English, so I'm in the dark when that is concerned;
> b) is there a better and/or faster way of implementing this? I don't want
> searches to bog down (at least too badly) as a result of this.
>
> CREATE OR REPLACE FUNCTION i18n2ascii (TEXT) RETURNS TEXT AS '
> my ($source) = @_;
> $source =~
> tr/áàâäéèêëíìîïóòôöúùûüÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ/aaaaeeeeiiiioooouuuuAAAAEEEEIIIIOOOOUUUU/;
> return $source;
> ' LANGUAGE 'plperl';

You could probably accomplish the same thing without using Perl via the
built-in function translate(). Look in functions-string.html in the 7.3.x
documentation. Also, the regex version of substring() is quite powerful.

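As a rough picture of what a character-for-character mapping like translate() does, the sketch below uses Python's `str.maketrans` and `str.translate` as a stand-in. It is an analogy only, not PostgreSQL code; the two character lists are copied from the original function, and the constant names are made up for this example.

```python
# Source and target lists copied from the tr/// in the original post.
# Position N in ACCENTED maps to position N in PLAIN.
ACCENTED = "áàâäéèêëíìîïóòôöúùûüÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ"
PLAIN = "aaaaeeeeiiiioooouuuuAAAAEEEEIIIIOOOOUUUU"

# maketrans raises ValueError if the two strings differ in length,
# which doubles as a sanity check on the lists.
TABLE = str.maketrans(ACCENTED, PLAIN)


def i18n2ascii(text: str) -> str:
    """Python analogue of the thread's function (illustration only)."""
    return text.translate(TABLE)


print(i18n2ascii("Résumé"))
# Characters missing from the list, such as "ñ" and "ç", are unchanged.
print(i18n2ascii("ñç"))
```

The second call shows the limitation raised in question (a): any accented character that is not in the enumerated list is left as it is.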