Re: [SPAM]-D] How to find broken UTF-8 characters ? - Mailing list pgsql-sql

From Jasen Betts
Subject Re: [SPAM]-D] How to find broken UTF-8 characters ?
Msg-id hreauf$cma$1@reversiblemaps.ath.cx
In response to How to find broken UTF-8 characters ?  (Andreas <maps.on@gmx.net>)
List pgsql-sql
On 2010-04-29, Andreas <maps.on@gmx.net> wrote:
> Hi,
>
> while writing the reply below I found it sounds like being OT, but it's
> actually not.
> I just need a way to check if a column contains values that CANNOT be
> converted from UTF8 to LATIN1.
> I tried:
> Select convert_to (my_column::text, 'LATIN1') from my_table;
>
> It raises an error that says (translated):
> ERROR:  character 0xe28093 in encoding »UTF8« has no equivalent in »LATIN1«

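Use a regular expression.
ISO 8859-1 is easy: all of its characters are grouped together in Unicode
(code points 1 to 255, leaving out NUL, which a text value cannot hold), so
the regular expression consists of a single negated range class.
Any row it matches contains at least one character with no LATIN1 equivalent:
SELECT pkey FROM tabname WHERE ( textfield || textfield2 || textfield3 ) ~ ('[^'||chr(1)||'-'||chr(255)||']');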
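
One thing to watch with the concatenation: || yields NULL as soon as any
operand is NULL, so a row with a NULL in one of the columns would never be
reported. A minimal sketch of a NULL-safe variant follows. It assumes a UTF8
database (so that chr(255) means U+00FF) and reuses the placeholder names
tabname, pkey and textfield* from the query above; they are not a real schema.
The second statement is only a sanity check using the character from the
error message: 0xe28093 is the UTF-8 encoding of U+2013 (en dash), which is
chr(8211) in a UTF8 database.

```sql
-- Sketch only: tabname, pkey and the textfield columns are placeholders.
-- coalesce() turns NULL columns into empty strings so the concatenation
-- never collapses to NULL and hides a row.
SELECT pkey
FROM tabname
WHERE ( coalesce(textfield,  '')
     || coalesce(textfield2, '')
     || coalesce(textfield3, '') ) ~ ('[^' || chr(1) || '-' || chr(255) || ']');

-- Sanity check: a string containing an en dash (U+2013, outside LATIN1)
-- tested against the same negated range class.
SELECT ('a' || chr(8211) || 'b') ~ ('[^' || chr(1) || '-' || chr(255) || ']')
       AS has_non_latin1;
```

Once the offending rows are known, the original convert_to(my_column::text, 'LATIN1')
call should run cleanly on the remaining rows.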


