Re: evil characters #bfef cause dump failure - Mailing list pgsql-admin

From Tom Lane
Subject Re: evil characters #bfef cause dump failure
Msg-id 10656.1100552422@sss.pgh.pa.us
In response to Re: evil characters #bfef cause dump failure  (Christian Fowler <spider@viovio.com>)
Responses Re: evil characters #bfef cause dump failure
List pgsql-admin
Christian Fowler <spider@viovio.com> writes:
>   server_encoding
> -----------------
>   SQL_ASCII

> whoa! yikes, I bet this has a lot to do with it? I really wanted to keep
> everything UNICODE end-to-end.  I must have forgotten --encoding on my
> initdb? Anything I can do at this point?

Hmm ... the safe way would be dump-n-reload but that's not working for
you.  What you can try is to alter the pg_database.encoding value for
that database, then start fresh backends (any existing ones won't notice
the change).  Worst case if that doesn't make life good is to change it
back.

The real problem is that you've got invalid Unicode data in the database
(I'm not an expert, but I think that #bf is a UTF8 continuation byte that
is not valid on its own, and then #ef starts a 3-byte sequence, so if this
comes within 2 characters of end-of-line that would explain your dump
problem).  You had better
fix the data first before trying to lock down the encoding.  Once you
change the encoding, backend internal operations will start spitting up
on any stored bad data, whereas right now it's just passing it through
unchanged.
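
To make the byte-level reasoning concrete, here is a minimal Python sketch
that classifies single bytes by the standard UTF-8 bit layout and then
tries to decode the two-byte pair from the subject line.  The function
name classify_utf8_byte is made up for illustration and is not part of
PostgreSQL or of any library.

```python
def classify_utf8_byte(b: int) -> str:
    """Describe the role a single byte value can play in UTF-8."""
    if b <= 0x7F:
        return "ascii"          # 0xxxxxxx: a complete 1-byte character
    if b <= 0xBF:
        return "continuation"   # 10xxxxxx: only valid after a lead byte
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"        # never appears in well-formed UTF-8
    if b <= 0xDF:
        return "lead-2"         # 110xxxxx: starts a 2-byte sequence
    if b <= 0xEF:
        return "lead-3"         # 1110xxxx: starts a 3-byte sequence
    return "lead-4"             # 11110xxx: starts a 4-byte sequence


# The pair from the subject line, taken in isolation.
sample = b"\xbf\xef"
roles = [classify_utf8_byte(b) for b in sample]

try:
    sample.decode("utf-8")
    decodes = True
except UnicodeDecodeError:
    decodes = False
```

Under this layout 0xBF falls in the continuation range and 0xEF opens a
3-byte sequence, so the pair cannot decode as UTF-8 when it stands alone.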

The safest way might be a dump-n-reload in any case, since reloading
into a fresh UNICODE database will catch bad data.  If you try manual
repairs you're likely to miss some places :-(
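
If you can get a plain-text dump out at all, one way to locate the bad
data before reloading is to scan the dump for lines that are not valid
UTF-8.  The following is a rough Python sketch under that assumption; the
function name find_invalid_utf8_lines and the file name dump.sql in the
usage note are hypothetical, and it reports only the first problem on
each line.

```python
from typing import Iterable, List, Tuple


def find_invalid_utf8_lines(
    lines: Iterable[bytes],
) -> List[Tuple[int, int, bytes]]:
    """Return (line_number, byte_offset, offending_bytes) per bad line."""
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start/exc.end delimit the first undecodable span.
            problems.append((lineno, exc.start, raw[exc.start:exc.end]))
    return problems


# Small in-memory demonstration with one clean and one damaged line.
demo = [b"INSERT INTO t VALUES ('ok');\n", b"abc\xbf\xefdef\n"]
report = find_invalid_utf8_lines(demo)
```

Against a real file you would open it in binary mode, for example
`with open("dump.sql", "rb") as fh: find_invalid_utf8_lines(fh)`, and
then repair the rows behind the reported line numbers.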

            regards, tom lane
