Problem with 7.0.3 dump -> 7.1b4 restore - Mailing list pgsql-bugs

From Rainer Mager
Subject Problem with 7.0.3 dump -> 7.1b4 restore
Date
Msg-id NEBBJBCAFMMNIHGDLFKGKEDPDCAA.rmager@vgkk.com
Responses Re: Problem with 7.0.3 dump -> 7.1b4 restore
List pgsql-bugs
We have a Unicode (UTF-8) database that we are trying to upgrade to 7.1b4.
We did a pg_dumpall (yes, using the old version) and then tried a restore.
We hit the following 3 problems:

1. Some of the text values are large, about 20k characters, and span
multiple lines. For almost all of the lines this was fine (postgres put a \
at the end of the previous line), but for some it was not. The lines I looked
at all had non-English characters (Japanese and/or Korean) at the end of the
line. When the restore encountered these lines it failed and, since the dump
uses COPY, the entire table was left empty.

2. Some two-byte dash/hyphen characters DID get correctly imported into the
database but could not be read out again via JDBC; that is, when read, the
record was truncated at that character. This _might_ be related to a
long-standing Java core bug regarding improper conversions between certain
language encodings and the internal Unicode representation for hyphens.

3. One other character, a two-byte apostrophe, could not be restored,
similar to the hyphen problem.


After fighting the above, I decided to try doing the dump with the -dn
flags. This fixed problem #1 but not #2 or #3. If needed, I can try to get
details about the problem characters.


Finally, this is not a bug, but we have written a small Perl script that
wraps every 500 INSERT lines of a PG dump in a transaction. This speeds up
large restores by about 100 times! Really! I think this might be a good
thing for the dump command to do automatically.
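
A rough sketch of that batching idea, in Python rather than the original Perl.
The script itself is not included in this message, so the function name
wrap_inserts, the batch_size parameter, and the assumption that every INSERT
statement sits on exactly one line are all illustrative, not the actual
implementation:

```python
def wrap_inserts(lines, batch_size=500):
    """Yield dump lines with BEGIN;/COMMIT; wrapped around runs of INSERTs.

    Assumption: each INSERT statement occupies a single line that starts
    with "INSERT ". Statements spanning several lines would need a real
    SQL tokenizer and are not handled by this sketch.
    """
    open_txn = False   # True while a BEGIN; has been emitted but not closed
    count = 0          # INSERT lines emitted in the current transaction
    for line in lines:
        if line.startswith("INSERT "):
            if not open_txn:
                yield "BEGIN;\n"
                open_txn = True
                count = 0
            yield line
            count += 1
            if count >= batch_size:
                # Batch is full: commit so the next INSERT opens a new one.
                yield "COMMIT;\n"
                open_txn = False
        else:
            # Any other statement ends the current batch first.
            if open_txn:
                yield "COMMIT;\n"
                open_txn = False
            yield line
    if open_txn:
        # Close a trailing, partially filled batch.
        yield "COMMIT;\n"
```

It could be used as a filter, for example with
sys.stdout.writelines(wrap_inserts(sys.stdin)), piping the dump in and the
wrapped dump out to psql. The gain presumably comes from paying the commit
cost once per batch rather than once per row, since each INSERT outside an
explicit transaction is committed on its own.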


Best regards,

--Rainer
