Re: Need magic to clean strings from unconvertible UTF8 - Mailing list pgsql-general

From Dimitri Fontaine
Subject Re: Need magic to clean strings from unconvertible UTF8
Date
Msg-id m2aalk6nkh.fsf@2ndQuadrant.fr
In response to Need magic to clean strings from unconvertible UTF8  (Andreas <maps.on@gmx.net>)
List pgsql-general
Andreas <maps.on@gmx.net> writes:
> I can find the problematic rows.
> How could I delete every char in a string that can't be converted to
> WIN1252?

  http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_1.html
  http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html

Those articles use a hand-crafted translate expression. You could also use
the recode library, which does a pretty good job. Maybe the easiest way
here would be a plpythonu procedure that uses librecode?

  http://packages.debian.org/sid/python-bibtex

Well, or the same in plperl… or, even easier, process the source files
before importing them?
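
As a rough sketch of the "process the source files before importing them"
route: the snippet below drops every character that cannot be encoded as
WIN1252. It is only an illustration under stated assumptions. It uses
Python's built-in cp1252 codec as a stand-in for PostgreSQL's WIN1252
conversion (the two may disagree on a few code points), it does not use
librecode, and the names strip_non_win1252 and clean_file are hypothetical.

```python
def strip_non_win1252(text: str) -> str:
    """Drop every character that Python's cp1252 codec cannot encode.

    Assumption: cp1252 is close enough to PostgreSQL's WIN1252 here.
    """
    # errors="ignore" silently skips characters with no cp1252 mapping;
    # decoding the bytes again yields a str with only the survivors.
    return text.encode("cp1252", errors="ignore").decode("cp1252")


def clean_file(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: filter a UTF-8 source file line by line."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(strip_non_win1252(line))
```

Note that this deletes characters outright rather than transliterating
them, so it loses information that a translate expression or recode could
preserve.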

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
