Thread: convert accented character to base character

convert accented character to base character

From
"Neubert Joachim"
Date:

I want to convert accented characters to the according base character, e.g. "Ü" or "Ú" to "U".

 

Is there a way to do this with pgsql functions?

 

 

  postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII');

   convert

  ----------

   \303\234

 

did not work as I had hoped.

 

Any help would be appreciated -

 

Cheers, Joachim

Re: convert accented character to base character

From
Thom Brown
Date:
On 28 June 2010 16:22, Neubert Joachim <J.Neubert@zbw.eu> wrote:
> I want to convert accented characters to the according base character, e.g.
> "Ü" or "Ú" to "U".
>
>
>
> Is there a way to do this with pgsql functions?
>
>
>
>
>
>   postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII');
>
>    convert
>
>   ----------
>
>    \303\234
>
>
>
> did not work as I had hoped.
>
>
>
> Any help would be appreciated -
>
>
>
> Cheers, Joachim

There's a function called unaccent coming in PostgreSQL 9.0:
http://www.postgresql.org/docs/9.0/static/unaccent.html

But in the meantime, you could try this:

CREATE OR REPLACE FUNCTION unaccent_string(text) RETURNS text AS $$
DECLARE
    input_string text := $1;
BEGIN

input_string := translate(input_string,
'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ',
'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu');

return input_string;
END;
$$ LANGUAGE plpgsql;

Then you can do:

select unaccent_string('Ü');

Someone else may have a better suggestion though.

Regards

Thom

Re: convert accented character to base character

From
John R Pierce
Date:
On 06/28/10 8:22 AM, Neubert Joachim wrote:
>
> I want to convert accented characters to the according base character, 
> e.g. "Ü" or "Ú" to "U".
>
> Is there a way to do this with pgsql functions?
>
>


translate(somestring, 'ÙÚÛÜŨŪŬŮŰŲùúûüũūŭůűų', 'UUUUUUUUUUuuuuuuuuuu')

of course, expand those with all the other characters you want translated.





Re: convert accented character to base character

From
Thom Brown
Date:
On 28 June 2010 16:33, Thom Brown <thombrown@gmail.com> wrote:
> On 28 June 2010 16:22, Neubert Joachim <J.Neubert@zbw.eu> wrote:
>> I want to convert accented characters to the according base character, e.g.
>> "Ü" or "Ú" to "U".
>>
>>
>>
>> Is there a way to do this with pgsql functions?
>>
>>
>>
>>
>>
>>   postgres=# select convert('Ü', 'UTF8', 'SQL_ASCII');
>>
>>    convert
>>
>>   ----------
>>
>>    \303\234
>>
>>
>>
>> did not work as I had hoped.
>>
>>
>>
>> Any help would be appreciated -
>>
>>
>>
>> Cheers, Joachim
>
> There's a function called unaccent coming in PostgreSQL 9.0:
> http://www.postgresql.org/docs/9.0/static/unaccent.html
>
> But in the meantime, you could try this:
>
> CREATE OR REPLACE FUNCTION unaccent_string(text) RETURNS text AS $$
> DECLARE
>    input_string text := $1;
> BEGIN
>
> input_string := translate(input_string,
> 'âãäåāăąÁÂÃÄÅĀĂĄèééêëēĕėęěĒĔĖĘĚìíîïìĩīĭÌÍÎÏÌĨĪĬóôõöōŏőÒÓÔÕÖŌŎŐùúûüũūŭůÙÚÛÜŨŪŬŮ',
> 'aaaaaaaaaaaaaaaeeeeeeeeeeeeeeeiiiiiiiiiiiiiiiiooooooooooooooouuuuuuuuuuuuuuuu');

That should actually be
'aaaaaaaAAAAAAAAeeeeeeeeeeEEEEEiiiiiiiiIIIIIIIIoooooooOOOOOOOOuuuuuuuuUUUUUUUU');

>
> return input_string;
> END;
> $$ LANGUAGE plpgsql;
>
> Then you can do:
>
> select unaccent_string('Ü');
>
> Someone else may have a better suggestion though.
>
> Regards
>
> Thom
>

Re: convert accented character to base character

From
Mathieu De Zutter
Date:
On 28/06/10 17:22, Neubert Joachim wrote:

I want to convert accented characters to the according base character, e.g. "Ü" or "Ú" to "U".

I have the following:

CREATE OR REPLACE FUNCTION unaccent(TEXT) RETURNS TEXT AS
$$
    use Text::Unaccent;
    return unac_string('UTF8', $_[0]);
$$ LANGUAGE plperlu IMMUTABLE STRICT;

On debian you need the following package:
libtext-unaccent-perl

Kind regards,
Mathieu