Thread: character 0xe29986 of encoding "UTF8" has no equivalent in "LATIN2"

From: Andreas Kalsch
Date:
The function "convert_to(string text, dest_encoding name)" throws an
error, and so breaks my application, whenever the Unicode string
contains characters that the destination encoding does not support.
So what can I do
- to filter out characters that have no counterpart in the Latin code sets,
- or to simply ignore the unsupported characters?

Problem: Users will enter _any_ characters in my application and an
error really doesn't help in this case.

What I am looking for is a function that strips the diacritics from
special letters, reducing them to their plain base letters.

An example is provided at
http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Diacritic_removing
but it does not work for me, because the functions raise the same error
when I pass them arbitrary valid UTF8 characters.
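
For illustration, diacritic removal can also be done in the application
before the text reaches the database. The following is only a minimal
sketch, assuming a Python client; the function name strip_diacritics is
hypothetical and not part of PostgreSQL or of the linked example. It
decomposes each letter with Unicode normalization and drops the combining
marks.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks (accents, carons, rings) removed."""
    # NFD splits a precomposed letter such as U+0159 into "r" plus a caron.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns a non-zero class for combining marks; drop those.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into the usual precomposed form.
    return unicodedata.normalize("NFC", kept)


if __name__ == "__main__":
    print(strip_diacritics("Příliš žluťoučký kůň"))
```

Characters that have no decomposition, such as U+2646 (the 0xe29986 from
the error message), pass through unchanged, so they would still need to
be filtered separately before converting to LATIN2.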

Best,

Andi

Re: character 0xe29986 of encoding "UTF8" has no equivalent in "LATIN2"

From: Sam Mason
Date:
On Sun, Aug 02, 2009 at 08:45:52PM +0200, Andreas Kalsch wrote:
> Problem: Users will enter _any_ characters in my application and an
> error really doesn't help in this case.

Then why don't you stop converting to LATIN2?

> What I am looking for is a function that strips the diacritics from
> special letters, reducing them to their plain base letters.

It would be easy to write a regex to strip out the invalid characters if
that's what you want.
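
One way such a regex could look, sketched in Python on the application
side rather than in SQL. This is an assumption-laden illustration, not
something from this thread: the names build_strip_pattern and
strip_unsupported are hypothetical, Python's "iso8859_2" codec is assumed
to match PostgreSQL's LATIN2, and the approach only works for single-byte
encodings. The pattern is a negated character class built from every
character the target encoding can represent.

```python
import re


def build_strip_pattern(encoding="iso8859_2"):
    """Compile a regex matching any character the encoding cannot hold."""
    # Decode all 256 byte values; bytes without a mapping are skipped.
    allowed = bytes(range(256)).decode(encoding, errors="ignore")
    # Negated class: matches exactly the characters outside the encoding.
    return re.compile("[^" + re.escape(allowed) + "]")


LATIN2_UNSUPPORTED = build_strip_pattern()


def strip_unsupported(text, pattern=LATIN2_UNSUPPORTED):
    """Remove every character that has no equivalent in the encoding."""
    return pattern.sub("", text)


if __name__ == "__main__":
    cleaned = strip_unsupported("caf\u00e9 \u2646 k\u016f\u0148")
    # The cleaned string can now be encoded without raising an error.
    print(cleaned.encode("iso8859_2"))
```

Note that stripping is lossy: the unsupported characters are silently
dropped, so this only fits cases where losing them is acceptable.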

--
  Sam  http://samason.me.uk/