Re: How to remove non-UTF values from a table? - Mailing list pgsql-general

From Howard Cole
Subject Re: How to remove non-UTF values from a table?
Date
Msg-id 4B278E9E.1080907@selestial.com
In response to How to remove non-UTF values from a table?  (Phoenix Kiula <phoenix.kiula@gmail.com>)
Responses Re: How to remove non-UTF values from a table?  (Phoenix Kiula <phoenix.kiula@gmail.com>)
List pgsql-general
Phoenix Kiula wrote:
> An easy question for some I hope.
>
> I have a DB from 8.2 days that when I now dump and try to take into
> the 8.3.7, it gives me errors about utf-8 stuff.
>
> I tried searching this list's archives but could not come up with an answer.
>
> Google returns some sites like these:
> http://sniptools.com/databases/finding-non-utf8-values-in-postgresql -
> but I'm not clear on how to use them.
>
> Following the SQL on this site I could identify some columns that
> contain text like this:
>
>     "Évolution générale de la situation démographique"
>
> So my guess is that the non-English characters were originally not
> getting written in proper utf-8 variants.
>
> Is there any SQL possibility to find these columns and replace them
> with utf-8 equivalents using some postgresql commands? Couldn't find
> anything in the "Strings functions" (chapter 9 of manual).
>
> We're on CentOS.
>
> Thanks!
My recommendation would be to install the iconv utility and run it on a
plain-text backup (pg_dump -Fp), as suggested in the article Google
turned up, and then reimport the clean UTF-8 dump.
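
As a rough illustration of that clean-up step, here is a minimal Python
sketch. It is not iconv itself, only the same idea applied line by line to
a plain-text dump. The function names and the cp1252 fallback are
assumptions made for the example; use whatever encoding the bad rows were
really written in.

```python
def to_clean_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return `raw` as valid UTF-8 bytes.

    Bytes that already decode as UTF-8 are returned unchanged. Anything
    else is assumed (hypothetically) to be in `fallback` and is re-encoded;
    bytes undefined in the fallback encoding become U+FFFD.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace").encode("utf-8")


def clean_dump(src_path: str, dst_path: str, fallback: str = "cp1252") -> int:
    """Copy a plain-text dump, repairing invalid lines.

    Returns the number of lines that had to be re-encoded.
    """
    changed = 0
    # Binary mode: we must see the raw bytes, not Python's decoded text.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            fixed = to_clean_utf8(line, fallback)
            if fixed != line:
                changed += 1
            dst.write(fixed)
    return changed
```

Calling clean_dump("dump.sql", "dump_clean.sql") (both file names are
placeholders) writes a copy that should load as UTF-8. Note that this only
repairs bytes that are not valid UTF-8; text that was double-encoded but
is still valid UTF-8 passes through untouched and needs separate handling.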

I am surprised that you managed to load the original backup into 8.3 at
all, because 8.3 seems to be much stricter about encoding. Or is your
database perhaps not in UTF-8?

Howard
www.selestial.com


