Home > mailing lists

Re: finding rows with invalid characters - Mailing list pgsql-general

From	Dmitriy Igrishin
Subject	Re: finding rows with invalid characters
Date	November 21, 2010 14:56:01
Msg-id	AANLkTik9ApeYg+McXQd25Ag1A2rEbJWKod80WrghGMeG@mail.gmail.com Whole thread Raw
In response to	finding rows with invalid characters (Sim Zacks <sim@compulab.co.il>)
List	pgsql-general

Tree view

Hey Sim,

Maybe this helps:
http://blog.tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html

2010/11/21 Sim Zacks <sim@compulab.co.il>

I am using PG 8.2.17 with UTF8 encoding.
"PostgreSQL 8.2.17 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 (Gentoo 4.1.1)"

One of my tables somehow has invalid characters in it:
ERROR: invalid byte sequence for encoding "UTF8": 0xa9
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
I have already manually found a number of the bad rows by running queries with text functions (upper) between groups of IDs until I found the specific bad row.

1) Is there a quicker way to get a list of all rows with invalid characters
2) Shouldn't the database prevent these rows from being entered in the first place?
3) I have backups of this database (using -Fc) and I noticed that on restore, this table is not restored because of this error. Is there a way to fix the existing backups, or tell the restore to ignore bad rows instead of erroring out the whole table?

Thanks
Sim

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
// Dmitriy.

pgsql-general by date:

From: Trevor Talbot
Date: 21 November 2010, 14:17:41
Subject: Re: [pgsql-www] Forums at postgresql.com.au

From: Tom Lane
Date: 21 November 2010, 15:54:44
Subject: Re: SYSCONFDIR, initdb and postgresql.conf

Re: finding rows with invalid characters - Mailing list pgsql-general

Previous

Next