Bruce Momjian wrote:
> One nice solution would be if iconv would report the lines with
> errors and you could correct them, but I see no way to do that. The
> only thing you could do is to diff the old and new files to see the
> problems. Is that helpful? Here is new text I have used:
I think this is nice. If users see a big mess, they will have to clean
it up by hand anyway.
How about this for better wording:
diff -u -3 -p -r1.400.2.4 release.sgml
--- doc/src/sgml/release.sgml 6 Dec 2005 20:26:02 -0000 1.400.2.4
+++ doc/src/sgml/release.sgml 6 Dec 2005 20:44:26 -0000
@@ -528,15 +528,16 @@ psql -t -f fixseq.sql db1 | psql -e db1
<listitem> <para>
- Some users are having problems loading <literal>UTF8</> data into
- 8.1.X. This is because previous versions allowed invalid <literal>UTF8</>
+ Some users are having problems loading UTF-8 data into
+ 8.1.X. This is because previous versions allowed invalid UTF-8 byte sequences to be entered into the
database, and this release
- properly accepts only valid <literal>UTF8</> sequences. One
- way to correct a dumpfile is to use <command>iconv -c -f UTF-8 -t UTF-8
+ properly accepts only valid UTF-8 sequences. One
+ way to correct a dumpfile is to run the command <command>iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql
dumpfile.sql</>. The <literal>-c</> option removes invalid character sequences. A diff of the two files will
show the sequences that are invalid. <command>iconv</> reads the entire input
- file into memory so it might be necessary to <command>split</> the dump
+ file into memory so it might be necessary to use <command>split</>
+ to break up the dump into multiple smaller files for processing. </para> </listitem>
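
On the quoted wish for a tool that reports the lines with errors: the following is only a rough sketch, not part of the patch, of how such a report could be produced outside of iconv. The function name find_invalid_utf8_lines is made up for illustration, and it assumes Python's strict UTF-8 decoder is an acceptable stand-in for the server's validity check, which may not hold in every corner case. It reads the dump one line at a time, so it does not need the whole file in memory.

```python
def find_invalid_utf8_lines(path):
    """Return (line_number, error_message) pairs for lines that fail UTF-8 decoding.

    Hypothetical helper for illustration only. The file is opened in binary
    mode and split on newline bytes; a newline byte (0x0A) never occurs inside
    a multi-byte UTF-8 sequence, so splitting this way does not cut valid
    characters in half.
    """
    bad_lines = []
    with open(path, "rb") as dump:
        # Iterating over the file object yields one line at a time,
        # so memory use stays bounded by the longest line.
        for lineno, raw in enumerate(dump, start=1):
            try:
                raw.decode("utf-8")  # strict decoding raises on invalid bytes
            except UnicodeDecodeError as exc:
                bad_lines.append((lineno, str(exc)))
    return bad_lines
```

Calling find_invalid_utf8_lines("dumpfile.sql") would return the line numbers to inspect by hand, which could be used alongside the diff of dumpfile.sql and cleanfile.sql described in the patch.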
--
Peter Eisentraut
http://developer.postgresql.org/~petere/