Re: Differences in UTF8 between 8.0 and 8.1 - Mailing list pgsql-hackers

From Andrej Ricnik-Bay
Subject Re: Differences in UTF8 between 8.0 and 8.1
Msg-id b35603930510261840v17d6a50dwba1e8dd6012654f7@mail.gmail.com
In response to Re: Differences in UTF8 between 8.0 and 8.1  (Paul Lindner <lindner@inuus.com>)
Responses Re: Differences in UTF8 between 8.0 and 8.1
List pgsql-hackers
> does strip out the invalid characters.  However, iconv reads the
> entire file into memory before it writes out any data.  This is not so
> good for multi-gigabyte dump files and doesn't allow for it to be used
> in a pipe between pg_dump and psql.
>
> Anyone have any other recommendations?  GNU recode might do it, but
> I'm a bit stymied by the syntax.  A quick perl script using
> Text::Iconv didn't work either.  I'm off to look at some other perl
> modules and will try to create a script so I can strip out the invalid
> characters.
How about an ugly kludge  ...

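# -C (GNU split) ends each chunk at a line break, so no multi-byte character
# is cut in half; -a 6 leaves room for more than 1000 chunks on big dumps.
split -a 6 -d -C 1048576 ../path/to/dumpfile dumpfile
# iconv writes to stdout, so redirect the loop's output into the new dump.
for i in dumpfile*; do iconv -c -f UTF8 -t UTF8 "$i"; done > new_dump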
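
The same chunk-and-clean idea can probably be run as a stream, which would let it sit in a pipe between pg_dump and psql. This is only a sketch: the function name clean_utf8_stream is made up for illustration, and it assumes GNU split (for -C and --filter) and an iconv that accepts -c.

```shell
#!/bin/sh
# Hypothetical helper: read a dump on stdin, write a cleaned copy to stdout.
# Assumes GNU split (for -C and --filter) and an iconv that supports -c.
clean_utf8_stream() {
    chunk_bytes=${1:-1048576}   # optional chunk size in bytes, default 1 MiB
    # -C ends each chunk at a line break, so no multi-byte character is cut
    # in half unless a single line is longer than the chunk size.
    # --filter pipes each chunk through the command instead of creating
    # chunk files; the command's output goes to split's stdout.
    # "|| true": some iconv builds exit non-zero after dropping bytes with
    # -c, and split would otherwise stop at the first such chunk.
    split -C "$chunk_bytes" --filter='iconv -c -f UTF-8 -t UTF-8 || true' -
}
```

Something like `pg_dump olddb | clean_utf8_stream | psql newdb` would then clean the dump in flight (olddb and newdb are placeholder names). Note that the `|| true` also hides genuine iconv failures, so it is worth comparing the size of the input and output afterwards.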


Cheers,
Andrej

