
Unicode and unaccent() - Mailing list pgsql-general

From	Mark Borins
Subject	Unicode and unaccent()
Date	May 5, 2005 16:37:43
Msg-id	111532185801@smtp-1.vancouver.ipapp.com
Responses	Re: Unicode and unaccent() Re: Unicode and unaccent()
List	pgsql-general


I am trying to write an unaccent function because I need to do some queries comparing data that has accents and data that does not.

The encoding on my DB is Unicode. So far I have found an unaccent() function by looking in the mail archives; it looks like the following:

CREATE FUNCTION unaccent(text) RETURNS text AS $$
BEGIN
    RETURN translate($1, '\342\347\350\351\352\364\373', 'aceeeou');
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

My problem is that values like \342 are for a LATIN1-type encoding. I have tried and failed to get this working using what I think is the Unicode escaping method, \u0032 for example.
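To make the encoding mismatch concrete, here is a minimal Python sketch, run outside the database, that reads each octal escape from the translate() call as a LATIN1 byte and shows the byte sequence the same character takes in UTF-8. The names latin1_escapes, utf8_octal and rows are illustrative only. Whether translate() on PostgreSQL 7.4 accepts the resulting multi-byte escapes is an assumption that has not been checked here.

```python
# Octal escapes taken from the translate() call, read as LATIN1 byte values.
latin1_escapes = ["342", "347", "350", "351", "352", "364", "373"]


def utf8_octal(ch: str) -> str:
    """Return the UTF-8 bytes of ch written as backslash-octal escapes."""
    return "".join("\\%03o" % b for b in ch.encode("utf-8"))


rows = []
for esc in latin1_escapes:
    byte_value = int(esc, 8)                    # e.g. octal 342 is 0xE2
    ch = bytes([byte_value]).decode("latin-1")  # the LATIN1 character for that byte
    rows.append((esc, ch, "U+%04X" % ord(ch), utf8_octal(ch)))

# Print only ASCII-safe columns: LATIN1 escape, code point, UTF-8 escapes.
for esc, ch, code_point, utf8_escapes in rows:
    print("\\%s  %s  %s" % (esc, code_point, utf8_escapes))
```

Each accented character in the list occupies one byte in LATIN1 but two bytes in UTF-8, which is why the single-byte escapes stop matching once the database encoding is Unicode.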

Even if someone could help me with the Unicode escaping method, that would be useful. For example, if I wanted to find the Unicode character 0x00E2 with a select statement, how would I do it?

Something like select * from table where field like '%\u00e2%';

This doesn't seem to work.

Does anyone have a good method for unaccenting characters in Unicode databases?

I am using PostgreSQL 7.4 on Fedora Core 2.

Thank you
