Thread: convert accented character to base character
I want to convert accented characters to the according base character, e.g. "Ü" or "Ú" to "U".
Is there a way to do this with pgsql functions?
postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII');
convert
----------
\303\234
did not work as I had hoped.
Any help would be appreciated -
Cheers, Joachim
On 28 June 2010 16:22, Neubert Joachim <J.Neubert@zbw.eu> wrote: > I want to convert accented characters to the according base character, e.g. > "Ü" or "Ú" to "U". > > > > Is there a way to do this with pgsql functions? > > > > > > postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII'); > > convert > > ---------- > > \303\234 > > > > did not work as I had hoped. > > > > Any help would be appreciated - > > > > Cheers, Joachim There's a function called unaccent coming in PostgreSQL 9.0: http://www.postgresql.org/docs/9.0/static/unaccent.html But in the meantime, you could try this: CREATE OR REPLACE FUNCTION unaccent_string(text) RETURNS text AS $$ DECLARE input_string text := $1; BEGIN input_string := translate(input_string, 'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ', 'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu'); return input_string; END; $$ LANGUAGE plpgsql; Then you can do: select unaccent_string('Ü'); Someone else may have a better suggestion though. Regards Thom
On 06/28/10 8:22 AM, Neubert Joachim wrote: > > I want to convert accented characters to the according base character, > e.g. "Ü" or "Ú" to "U". > > Is there a way to do this with pgsql functions? > > translate(somestring, 'ÙÚÛÜŨŪŬŮŰŲùúûüũūŭůűų', 'UUUUUUUUUUuuuuuuuuuu') of course, expand those with all the other characters you want translated.
On 28 June 2010 16:33, Thom Brown <thombrown@gmail.com> wrote: > On 28 June 2010 16:22, Neubert Joachim <J.Neubert@zbw.eu> wrote: >> I want to convert accented characters to the according base character, e.g. >> "Ü" or "Ú" to "U". >> >> >> >> Is there a way to do this with pgsql functions? >> >> >> >> >> >> postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII'); >> >> convert >> >> ---------- >> >> \303\234 >> >> >> >> did not work as I had hoped. >> >> >> >> Any help would be appreciated - >> >> >> >> Cheers, Joachim > > There's a function called unaccent coming in PostgreSQL 9.0: > http://www.postgresql.org/docs/9.0/static/unaccent.html > > But in the meantime, you could try this: > > CREATE OR REPLACE FUNCTION unaccent_string(text) RETURNS text AS $$ > DECLARE > input_string text := $1; > BEGIN > > input_string := translate(input_string, > 'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ', > 'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu'); That should actually be 'aaaaaaaAAAAAAAAeeeeeeeeeeEEEEEiiiiiiiiIIIIIIIIoooooooOOOOOOOOuuuuuuuuUUUUUUUU'); > > return input_string; > END; > $$ LANGUAGE plpgsql; > > Then you can do: > > select unaccent_string('Ü'); > > Someone else may have a better suggestion though. > > Regards > > Thom >
On 28/06/10 17:22, Neubert Joachim wrote:
CREATE OR REPLACE FUNCTION unaccent(TEXT) RETURNS TEXT AS
$$
use Text::Unaccent;
return unac_string('UTF8', $_[0]);
$$ LANGUAGE plperlu IMMUTABLE STRICT;
On debian you need the following package:
libtext-unaccent-perl
Kind regards,
Mathieu
I have the following:I want to convert accented characters to the according base character, e.g. "Ü" or "Ú" to "U".
CREATE OR REPLACE FUNCTION unaccent(TEXT) RETURNS TEXT AS
$$
use Text::Unaccent;
return unac_string('UTF8', $_[0]);
$$ LANGUAGE plperlu IMMUTABLE STRICT;
On debian you need the following package:
libtext-unaccent-perl
Kind regards,
Mathieu