Re: Upcoming PG re-releases - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Upcoming PG re-releases
Date
Msg-id 200512061926.jB6JQcS23646@candle.pha.pa.us
In response to Re: Upcoming PG re-releases  (Paul Lindner <lindner@inuus.com>)
Responses Re: Upcoming PG re-releases
List pgsql-hackers
I have added your suggestions to the 8.1.X release notes.

---------------------------------------------------------------------------

Paul Lindner wrote:
-- Start of PGP signed section.
> On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> > Neil Conway wrote:
> > > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > > It's been about a month since 8.1.0 was released, and we've found about
> > > > the usual number of bugs for a new release, so it seems like it's time
> > > > for 8.1.1.
> > > 
> > > I think one fix that should be made in time for 8.1.1 is adding a note
> > > to the "version migration" section of the 8.1 release notes describing
> > > the "invalid UTF-8 byte sequence" problems that some people have run
> > > into when upgrading from prior versions. I'm not familiar enough with
> > > the problem or its remedies to add the note myself, though.
> > 
> > Agreed, but I don't understand the problem well enough either.  Does
> > anyone?
> 
> There was a thread a couple of weeks back about this problem.  Here's
> my sample writeup -- I give my permission for anyone to use it as they
> see fit:
> 
> 
> Upgrading UNICODE databases to 8.1
> 
> Postgres 8.1 includes a number of bug-fixes and improvements to
> Unicode and UTF-8 character handling.  Unfortunately, previous releases
> would accept character sequences that were not valid UTF-8.  This
> may cause problems when upgrading your database using
> pg_dump/pg_restore, resulting in an error message like this:
> 
>   Invalid UNICODE byte sequence detected near byte ...
> 
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters.  One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:
> 
>   iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> 
> The -c flag tells iconv to omit invalid characters from output.
> 
> There is one problem with this.  Most versions of iconv try to read
> the entire input file into memory.  If your dump is quite large you
> will need to split the dump into multiple files and convert each one
> individually.  You must use the -l flag for split to ensure that the
> UTF-8 byte sequences are not split.
> 
>    split -l 10000 dump.sql
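
As an alternative to splitting the dump, the same cleanup can be done in one streaming pass so the file never has to fit in memory. This is only a minimal Python sketch, not part of the writeup above; the function name strip_invalid_utf8 and the chunk size are illustrative assumptions. Python's UTF-8 decoder may not reject exactly the same byte sequences as the 8.1 server, so treat the output as a first pass.

```python
import codecs


def strip_invalid_utf8(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path, dropping bytes that are not valid UTF-8.

    The file is processed in fixed-size chunks; the incremental decoder
    buffers a multi-byte sequence that straddles a chunk boundary, so
    valid characters are never cut in half.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(decoder.decode(chunk).encode("utf-8"))
        # Flush the decoder; a truncated sequence at end of file is dropped.
        dst.write(decoder.decode(b"", final=True).encode("utf-8"))
```

Calling strip_invalid_utf8("dump.sql", "fixed.sql") would play the same role as the iconv -c command, with memory use bounded by chunk_size.
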
> 
> Another possible solution is to use the --inserts flag to pg_dump,
> which dumps data as one INSERT command per row instead of COPY.
> When you load the resulting dump into 8.1, the problem rows will fail
> individually and show up in your error log.
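
If you would rather find the offending rows before loading anything, a dump can also be scanned line by line. The following is a hedged Python sketch under the same caveats as any outside tool: find_invalid_utf8_lines is a hypothetical helper name, and Python's idea of valid UTF-8 may differ slightly from the server's. Reading by line is safe for the same reason split -l is: a newline byte never occurs inside a multi-byte UTF-8 sequence.

```python
def find_invalid_utf8_lines(path):
    """Yield (line_number, raw_line) for each line that is not valid UTF-8."""
    with open(path, "rb") as f:
        # Iterating a binary file splits on b"\n" and reads one line at a time.
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                yield lineno, raw
```

With a dump made using --inserts, each reported line should correspond to a single row, which makes the bad data easier to fix by hand.
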
> 
> -- 
> Paul Lindner        ||||| | | | |  |  |  |   |   |
> lindner@inuus.com
-- End of PGP section, PGP failed!

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

