Thread: Re: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

Re: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
Tatsuo Ishii
Date:
> We have a Unicode (UTF-8) database that we are trying to upgrade to 7.1b4.
> We did a pg_dumpall (yes, using the old version) and then tried a restore.
> We hit the following 3 problems:
> 
> 1. Some of the text is large, about 20k characters, and is multiline. For
> almost all of the lines this was fine (postgres put a \ at the end of the
> previous line) but for some it was not. The lines I looked at all had
> non-English characters (Japanese and/or Korean) at the end of the line. When
> the restore encountered these lines it failed and, since the dump uses COPY,
> the entire table was left blank.
> 
> 2. Some two-byte dash/hyphen characters DID get correctly imported into the
> database but could not be read out again via JDBC, that is, when read the
> record was truncated at the character. This _might_ be related to a long
> standing Java core bug regarding improper conversions between certain
> languages and the internal Unicode representation for hyphens.
> 
> 3. One other character, a two-byte apostrophe, was not restorable,
> similarly to the hyphen problem.
> 
> 
> After fighting the above, I decided to try doing the dump with the -dn
> flags. This fixed problem #1 but not 2 or 3. If needed I can try to get
> details about the problem characters.

This might be related to a known bug with 7.0.x. Can you grab a patch
from ftp://ftp.sra.co.jp/pub/cmd/postgres/7.0.3/patches/copy.patch.gz
and try again?

Or even better, can you give me a minimum set of data that reproduces
your problem?
--
Tatsuo Ishii


RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
"Rainer Mager"
Date:
Well, I tried the patch and the newly produced dump was identical to the bad
dump from before, so the patch had no effect. I will try to trim it down to
a reasonably small file and email it to you.

--Rainer

> -----Original Message-----
> From: pgsql-bugs-owner@postgresql.org
> [mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tatsuo Ishii
> Sent: Friday, February 23, 2001 10:32 AM
> To: rmager@vgkk.com
> Cc: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
> Subject: Re: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore
> This might be related to a known bug with 7.0.x. Can you grab a patch
> from ftp://ftp.sra.co.jp/pub/cmd/postgres/7.0.3/patches/copy.patch.gz
> and try again?
>
> Or even better, can you give me a minimum set of data that reproduces
> your problem?
> --
> Tatsuo Ishii



RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
Tatsuo Ishii
Date:
> Attached is a single INSERT that shows the problem. The character after the
> word "Fiber" truncates the text when using JDBC. NOTE, the text IS in the
> database, that is, the dump/restore seems ok, the problem is when trying to
> read the text later. The database is UTF8 and I just tested with beta 5.
> 
> Oh, BTW, if I try to set (INSERT) this same character via JDBC and then
> retrieve it again then everything is fine.

Thanks. I'll dig into it.
--
Tatsuo Ishii


RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
Tatsuo Ishii
Date:
> Attached is a single INSERT that shows the problem. The character after the
> word "Fiber" truncates the text when using JDBC. NOTE, the text IS in the
> database, that is, the dump/restore seems ok, the problem is when trying to
> read the text later. The database is UTF8 and I just tested with beta 5.
> 
> Oh, BTW, if I try to set (INSERT) this same character via JDBC and then
> retrieve it again then everything is fine.

I have tested your data using psql:

unicode=# create table pr_prop_info(i1 int, i2 int, i3 int, t text);
CREATE
unicode=# \encoding LATIN1
unicode=# \i example.sql 
INSERT 2378114 1
unicode=# select * from pr_prop_info;

The character after the word "Fiber" looks like "­Optic Cable". So as
long as the server/client encoding is set correctly, it looks ok. I guess
we have some problems with the JDBC driver. Unfortunately I am not a Java
guru at all. Can anyone look into our JDBC driver regarding this
problem?
--
Tatsuo Ishii


RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
"Rainer Mager"
Date:
Attached is a single INSERT that shows the problem. The character after the
word "Fiber" truncates the text when using JDBC. NOTE, the text IS in the
database, that is, the dump/restore seems ok, the problem is when trying to
read the text later. The database is UTF8 and I just tested with beta 5.

Oh, BTW, if I try to set (INSERT) this same character via JDBC and then
retrieve it again then everything is fine.


--Rainer

> -----Original Message-----
> From: pgsql-bugs-owner@postgresql.org
> [mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tatsuo Ishii
> Sent: Friday, February 23, 2001 10:32 AM
>
> Or even better, can you give me a minimum set of data that reproduces
> your problem?
> --
> Tatsuo Ishii


Problems with Multibyte in 7.1 beta?

From
"Rainer Mager"
Date:
I'm trying to run the latest CVS code's regression tests and have a problem.
They fail at initdb with this:


Running with noclean mode on. Mistakes will not be cleaned up.
/opt/home/rmager/devel/External/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: error while loading shared libraries: /opt/home/rmager/devel/External/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: undefined symbol: pg_char_to_encoding
initdb: pg_encoding failed

Perhaps you did not configure PostgreSQL for multibyte support or
the program was not successfully installed.




I ran configure with this:

./configure --enable-multibyte --enable-syslog --with-java




Any ideas?

--Rainer



Dead locks

From
"Rainer Mager"
Date:
Hi all,
We're using PG 7.0 and 7.1beta and are having deadlock problems. The docs
say that Postgres detects deadlocks and automatically rolls back one
transaction to recover, but this is not our experience. Are the docs
incorrect or is this more serious?


Thanks,

--Rainer



Re: Dead locks

From
Tom Lane
Date:
"Rainer Mager" <rmager@vgkk.com> writes:
> We're using PG 7.0 and 7.1beta and are having deadlock problems. The docs
> say that Postgres detects deadlocks and automatically rolls back one
> transaction to recover, but this is not our experience. Are the docs
> incorrect or is this more serious?

Which beta release?

There are some known undetected-deadlock cases in 7.0, which were
repaired in late January --- that would have been beta4 or possibly
beta5, I forget now.  If you still see this behavior with 7.1RC1 then
I would like details.
        regards, tom lane
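
For readers unfamiliar with the mechanism the docs describe: deadlock
detection is commonly modeled as a search for a cycle in a "waits-for"
graph. The Python sketch below illustrates that general idea only. It is
not PostgreSQL's implementation, and the function name find_deadlock, the
transaction labels, and the simplification that each transaction waits on
at most one other are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return one cycle of mutually waiting transactions, or None.

    waits_for maps a blocked transaction to the transaction holding
    the lock it needs (hypothetical, simplified model).
    """
    for start in waits_for:
        seen: List[str] = []
        current = start
        # Follow the chain of waiters until it ends or repeats.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The chain looped back on itself: that loop is the deadlock.
            return seen[seen.index(current):]
    return None


# T1 holds a lock T2 wants and vice versa: a classic two-way deadlock.
print(find_deadlock({"T1": "T2", "T2": "T1"}))  # ['T1', 'T2']
# T1 merely waits for T2, which is not blocked: no deadlock.
print(find_deadlock({"T1": "T2"}))              # None
```

Once a cycle is found, rolling back any one transaction in it breaks the
cycle and lets the others proceed, which is the recovery behavior the
docs describe.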


RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
"Rainer Mager"
Date:
I just tested a bug I originally found in 7.1b4 with the new 7.1RC3 and it
still exists. I would consider this a major bug because I know of no
workaround.

Basically what happens is that a dump of an existing Unicode database (from
7.0.3) has a double-byte hyphen character that becomes \255 in the dump. When
the data is imported into the new 7.1 database it seems to correctly appear
(verified via psql) BUT when reading this record via JDBC the data is
truncated at this character.

I communicated briefly with Ishii-san regarding this a while back but I
never followed up. Considering RC3 is now out I thought I should revisit the
issue. It should be easy to test by editing any Postgres Unicode database
dump and putting \255 somewhere in a string. I'm not sure if it matters but
the dump was done with "-dn" flags.
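
To make the \255 notation concrete, here is a small Python sketch of the
byte arithmetic involved. It assumes the original character is U+00AD
(SOFT HYPHEN), which is what byte 0xAD means in Latin-1; the thread does
not confirm which character was actually stored. The helper name
octal_escape is hypothetical and only mimics the backslash-octal notation
used above.

```python
def octal_escape(data: bytes) -> str:
    """Show printable ASCII as-is and every other byte as \\ooo (octal)."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else f"\\{b:03o}" for b in data
    )


stray = bytes([0o255])           # the single byte written as \255
print(hex(stray[0]))             # 0xad

# Assumption: the intended character is U+00AD, i.e. 0xAD read as Latin-1.
soft_hyphen = stray.decode("latin-1")
encoded = soft_hyphen.encode("utf-8")

# In valid UTF-8 the character takes two bytes, not one.
print(octal_escape(encoded))     # \302\255
```

Under that assumption, a well-formed UTF-8 dump would contain \302\255
rather than a bare \255.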

Thanks,

--Rainer


> -----Original Message-----
> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
> Sent: Wednesday, February 28, 2001 11:02 AM
> To: rmager@vgkk.com
> Cc: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
> Subject: RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore
>
>
> > Attached is a single INSERT that shows the problem. The character after the
> > word "Fiber" truncates the text when using JDBC. NOTE, the text IS in the
> > database, that is, the dump/restore seems ok, the problem is when trying to
> > read the text later. The database is UTF8 and I just tested with beta 5.
> >
> > Oh, BTW, if I try to set (INSERT) this same character via JDBC and then
> > retrieve it again then everything is fine.
>
> I have tested your data using psql:
>
> unicode=# create table pr_prop_info(i1 int, i2 int, i3 int, t text);
> CREATE
> unicode=# \encoding LATIN1
> unicode=# \i example.sql
> INSERT 2378114 1
> unicode=# select * from pr_prop_info;
>
> The character after the word "Fiber" looks like "­Optic Cable". So as
> long as the server/client encoding is set correctly, it looks ok. I guess
> we have some problems with the JDBC driver. Unfortunately I am not a Java
> guru at all. Can anyone look into our JDBC driver regarding this
> problem?
> --
> Tatsuo Ishii



RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
"Rainer Mager"
Date:
I noticed that 7.1 has officially been released. Does anyone know the status
of the bug I reported regarding encoding problems when dumping a 7.0 db and
restoring on 7.1?


Thanks,

--Rainer



RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

From
"Rainer Mager"
Date:
Hi,
I'm trying to see if I can patch this bug myself because we are under some
time constraints. Can anyone give me a tip regarding where in the postgres
source the internal UTF-8 code is converted during a dump?
I believe that the character 0xAD is an extended ASCII (Latin-1) character
that looks like a dash. According to the UTF-8 spec, any byte over 0x7F must
be part of a multi-byte sequence (which, I think, means that you should never
see the 0xAD byte by itself in a postgres dump, but I am seeing this). So,
I'm guessing that some piece of the UTF-8 conversion routine is a bit off.
Any tips on where to start? I would try to hack a fix by searching for the
offending character in the dump and replacing it with a normal dash but
unfortunately 0xAD is a valid byte when paired with other bytes and these
also exist in our dump.
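
The distinction between a lone 0xAD and a 0xAD that belongs to a valid
multi-byte sequence can be checked mechanically. The Python sketch below
is one possible way to locate only the invalid bytes in a buffer; the
function name find_invalid_utf8 and the sample strings are hypothetical
and are not taken from the actual dump.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for each byte that is not valid UTF-8."""
    bad: List[Tuple[int, int]] = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # everything from pos onward decodes cleanly
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice we decoded.
            start, end = pos + exc.start, pos + exc.end
            bad.extend((i, data[i]) for i in range(start, end))
            pos = end  # resume scanning after the bad span
    return bad


# A lone 0xAD is flagged ...
print(find_invalid_utf8(b"Fiber\xadOptic Cable"))   # [(5, 173)]

# ... but 0xAD as the last byte of a valid sequence is left alone.
# U+4E2D encodes to the bytes E4 B8 AD in UTF-8.
print(find_invalid_utf8("中".encode("utf-8")))       # []
```

A fix-up script built on offsets like these could replace only the stray
bytes, rather than every 0xAD in the file.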


--Rainer

> -----Original Message-----
> From: pgsql-bugs-owner@postgresql.org
> [mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Rainer Mager
> Sent: Monday, April 16, 2001 12:15 PM
> To: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
> Subject: RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore
>
>
> I noticed that 7.1 has officially been released. Does anyone know the status
> of the bug I reported regarding encoding problems when dumping a 7.0 db and
> restoring on 7.1?
>
>
> Thanks,
>
> --Rainer