Re: Unicode and unaccent() - Mailing list pgsql-general

From	Mark Borins
Subject	Re: Unicode and unaccent()
Date	May 6, 2005 15:01:31
Msg-id	111540248301@smtp-2.vancouver.ipapp.com
In response to	Re: Unicode and unaccent() (Peter Eisentraut <peter_e@gmx.net>)
List	pgsql-general

I am not sure how I could encode the characters into UTF-8.

For example, I went to Unicode.org and looked in the specs: the code point
for, let's say, an â is 00E2.  If I wanted to search for all names with an â
in them, how would I do that?

00E2 in octal is:  342

So would I do:

Select * from table where name like '%\342%'

This leads to a greater question.
I am trying to convert a Unicode DB to Latin1 because I realized we have
absolutely no reason to be using Unicode.

When I try to restore the backup of a Unicode database into Latin1 I am
getting some conversion errors, as there are characters in Unicode that
cannot be converted automatically into Latin1.

These are erroneous characters and I would like to find them.  I am given the
hex value of the offending character, for example 0x00E2.  How would I
search for this character?
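
One way to locate such characters outside the database is to scan the exported text for anything that has no Latin1 equivalent. The following is a minimal sketch in Python, not something proposed in this thread; the function name find_non_latin1 is hypothetical, and it assumes the dump has already been read into a string as UTF-8.

```python
def find_non_latin1(text: str) -> list:
    """Return (offset, character, code point) for every character
    in `text` that cannot be encoded as Latin1 (ISO-8859-1)."""
    hits = []
    for offset, ch in enumerate(text):
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            # No Latin1 equivalent exists for this character.
            hits.append((offset, ch, f"U+{ord(ch):04X}"))
    return hits


if __name__ == "__main__":
    # Hypothetical sample data: the euro sign has no Latin1 equivalent.
    for hit in find_non_latin1("caf\u00e9 \u20ac"):
        print(hit)
```

Note that U+00E2 (â) itself does have a Latin1 equivalent, so a check like this would not flag it; only characters outside the Latin1 range are reported.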

Thanks in advance for any help.

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Peter Eisentraut
Sent: May 6, 2005 2:12 AM
To: Mark Borins
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Unicode and unaccent()

Mark Borins wrote:
> My problem is that the values like \342 are for LATIN1 type encoding.
>  I have tried and failed to get this working using what I think
> is the Unicode escaping method, \u0032 for example.

There is no Unicode escaping method.  You need to encode the characters
into UTF-8 yourself and write out the individual bytes using the octal
escape sequences.
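
The encoding step described above can be sketched as follows. This is an illustrative Python helper, not part of PostgreSQL; the name utf8_octal_escapes is hypothetical. It encodes a code point as UTF-8 and writes each resulting byte as a three-digit octal escape.

```python
def utf8_octal_escapes(code_point: int) -> str:
    """Encode a Unicode code point as UTF-8 and return its bytes
    written as backslash-octal escape sequences."""
    encoded = chr(code_point).encode("utf-8")
    return "".join(f"\\{byte:03o}" for byte in encoded)


if __name__ == "__main__":
    # U+00E2 (â) is the two bytes 0xC3 0xA2 in UTF-8.
    print(utf8_octal_escapes(0x00E2))  # prints \303\242
```

For â this gives \303\242 rather than \342, so the pattern in a Unicode database would be '%\303\242%', assuming the server interprets octal escapes in string literals as described in this thread.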

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

