Thread: BUG #7590: Data corruption using pg_dump only with -Z parameter

BUG #7590: Data corruption using pg_dump only with -Z parameter

From: hrtlik@gmail.com
The following bug has been logged on the website:

Bug reference:      7590
Logged by:          Jan Vodička
Email address:      hrtlik@gmail.com
PostgreSQL version: 9.2.1
Operating system:   Windows 8
Description:


"pg_dump -Z1 my_db > backup" always makes a corrupted package.
When I try it on the postgres database that was created at installation,
"pg_dump postgres > backup", it is OK.
I can reproduce it every time.

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Tom Lane
hrtlik@gmail.com writes:
> The following bug has been logged on the website:
> Bug reference:      7590
> Logged by:          Jan Vodička
> Email address:      hrtlik@gmail.com
> PostgreSQL version: 9.2.1
> Operating system:   Windows 8
> Description:

> "pg_dump -Z1 my_db > backup" always make corrupted package.

On Windows, that doesn't seem terribly surprising: Windows will probably
do newline munging on the process's stdout, which will corrupt
compressed data since it's not plain text.  There's not a lot we can do
to prevent that.  Try it like this instead:

    pg_dump -Z1 -f backup.gz my_db

to keep the data away from Windows' interference.

            regards, tom lane
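
The failure mode described above can be modelled in a few lines. This is only a sketch under the assumption that the damage is exactly the text-mode translation of LF to CR LF; the helper names `text_mode_write` and `gunzips_cleanly` are made up for illustration and are not part of pg_dump.

```python
import gzip
import zlib


def text_mode_write(data: bytes) -> bytes:
    """Model a text-mode write: every LF (0x0A) goes out as CR LF (0x0D 0x0A)."""
    return data.replace(b"\n", b"\r\n")


def gunzips_cleanly(data: bytes) -> bool:
    """Return True if data is a complete, valid gzip stream."""
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        return False


if __name__ == "__main__":
    # Stand-in for a plain SQL dump; the table name is hypothetical.
    sql = b"".join(b"INSERT INTO t VALUES (%d);\n" % i for i in range(5000))
    packed = gzip.compress(sql, compresslevel=1)
    damaged = text_mode_write(packed)
    print("LF bytes in compressed stream:", packed.count(b"\n"))
    print("intact stream valid: ", gunzips_cleanly(packed))
    print("damaged stream valid:", gunzips_cleanly(damaged))
```

Plain SQL output survives the same translation because it only changes line endings in text, which is consistent with the uncompressed dump being reported as OK.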

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Ryan Kelly
On Tue, Oct 09, 2012 at 02:20:40PM +0000, hrtlik@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      7590
> Logged by:          Jan Vodička
> Email address:      hrtlik@gmail.com
> PostgreSQL version: 9.2.1
> Operating system:   Windows 8
> Description:
>
> "pg_dump -Z1 my_db > backup" always make corrupted package.
What does this mean? How did you verify that you got a "corrupted
package"?

> When I try it on postgres database which created from installation: "pg_dump
> postgres > backup" it is ok.
> I can reproduce it everytime.

-Ryan Kelly

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Jan Vodička
So is there any way to get plain SQL from this "corrupted" backup?
It would be nice to mention this behavior in the manual.


2012/10/9 Tom Lane <tgl@sss.pgh.pa.us>
> hrtlik@gmail.com writes:
>> The following bug has been logged on the website:
>> Bug reference:      7590
>> Logged by:          Jan Vodička
>> Email address:      hrtlik@gmail.com
>> PostgreSQL version: 9.2.1
>> Operating system:   Windows 8
>> Description:
>
>> "pg_dump -Z1 my_db > backup" always make corrupted package.
>
> On Windows, that doesn't seem terribly surprising: Windows will probably
> do newline munging on the process's stdout, which will corrupt
> compressed data since it's not plain text.  There's not a lot we can do
> to prevent that.  Try it like this instead:
>
>     pg_dump -Z1 -f backup.gz my_db
>
> to keep the data away from Windows' interference.
>
>             regards, tom lane



--
Jan Vodička

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Jan Vodička
Corrupted = not able to unpack, invalid.

Try this one, generated by "pg_dump -Z1 > backup.gz" on Windows: http://mstu.cz/~hrtlik/backup.gz (0.5kB)
The original "pg_dump > backup.gz" without compression: http://mstu.cz/~hrtlik/backup.sql

If you know of any way to recover the original, please tell me.



2012/10/9 Ryan Kelly <rpkelly22@gmail.com>
> On Tue, Oct 09, 2012 at 02:20:40PM +0000, hrtlik@gmail.com wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      7590
>> Logged by:          Jan Vodička
>> Email address:      hrtlik@gmail.com
>> PostgreSQL version: 9.2.1
>> Operating system:   Windows 8
>> Description:
>>
>> "pg_dump -Z1 my_db > backup" always make corrupted package.
> What does this mean? How did you verify that you got a "corrupted
> package"?
>
>> When I try it on postgres database which created from installation: "pg_dump
>> postgres > backup" it is ok.
>> I can reproduce it everytime.
>
> -Ryan Kelly



--
Jan Vodička

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Craig Ringer
On 10/10/2012 02:38 AM, Tom Lane wrote:
> hrtlik@gmail.com writes:
>> The following bug has been logged on the website:
>> Bug reference:      7590
>> Logged by:          Jan Vodička
>> Email address:      hrtlik@gmail.com
>> PostgreSQL version: 9.2.1
>> Operating system:   Windows 8
>> Description:
>
>> "pg_dump -Z1 my_db > backup" always make corrupted package.
>
> On Windows, that doesn't seem terribly surprising: Windows will probably
> do newline munging on the process's stdout, which will corrupt
> compressed data since it's not plain text.

pg_dump might want to refuse to write to stdout when in a non-plain-text
mode on Windows if that's the case.

--
Craig Ringer

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Craig Ringer
On 10/10/2012 03:07 AM, Jan Vodička wrote:
> = not able to unpack, invalid
>
> Try this one generated by "pg_dump -Z1 > backup.gz" in windows:
> http://mstu.cz/~hrtlik/backup.gz (0.5kB)
> original "pg_dump > backup.gz" without compression:
> http://mstu.cz/~hrtlik/backup.sql
>
> If you have any way how to get original, tell me.

If Tom is right and the issue is end-of-line transformation, in theory
you might be able to un-mungle newlines. The chances of \r\n occurring
naturally in a tiny backup like that are not huge, so any \r\n in the
data probably used to be a raw \n. Taking a copy of the DB and
performing that substitution might get you a usable backup file.

That's replacing all \x0d\x0a sequences with \x0a. Or I might be wrong
and it's \x0d.

This won't work on a larger backup where some \r\n sequences will occur
naturally in compressed binary data. In those you're likely to have a
much, much bigger job ahead of you.

--
Craig Ringer
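
The substitution suggested above might look like the following sketch. It assumes the only damage is that each LF was written as CR LF; `undo_crlf` and `try_gunzip` are hypothetical helper names. Under that assumption every LF in the damaged file has an inserted CR directly before it, so the replacement is exactly reversible even where the original stream already contained CR LF: such a pair becomes CR CR LF and collapses back to CR LF.

```python
import gzip
import zlib
from typing import Optional


def undo_crlf(damaged: bytes) -> bytes:
    """Replace every CR LF (0x0D 0x0A) pair with a bare LF (0x0A)."""
    return damaged.replace(b"\r\n", b"\n")


def try_gunzip(data: bytes) -> Optional[bytes]:
    """Return the decompressed bytes, or None if data is not a valid gzip stream."""
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return None


if __name__ == "__main__":
    original = gzip.compress(b"CREATE TABLE t (id int);\r\nSELECT 1;\n" * 1000)
    damaged = original.replace(b"\n", b"\r\n")  # model of the text-mode write
    repaired = undo_crlf(damaged)
    print("repaired equals original:", repaired == original)
    print("repaired stream gunzips:", try_gunzip(repaired) is not None)
```

Checking the result with a real decompression pass, as `try_gunzip` does, is the safest way to confirm the repair, because the gzip trailer carries a CRC of the uncompressed data.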

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Tom Lane
Craig Ringer <ringerc@ringerc.id.au> writes:
> On 10/10/2012 02:38 AM, Tom Lane wrote:
>> On Windows, that doesn't seem terribly surprising: Windows will probably
>> do newline munging on the process's stdout, which will corrupt
>> compressed data since it's not plain text.

> pg_dump might want to refuse to write to stdout when in a non-plain-text
> mode on Windows if that's the case.

Actually, a look at the pg_dump code says that it does

            setmode(fileno(stdout), O_BINARY);

so either my diagnosis is wrong or there's some reason why that setting
didn't take.

            regards, tom lane
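
For readers unfamiliar with the call quoted above, here is a rough Python analogue. It is a sketch only: `make_stdout_binary` is a hypothetical helper, `msvcrt.setmode` is the Windows-only standard-library counterpart of the C runtime's setmode, and nothing here says whether a given runtime already puts stdout in binary mode on its own.

```python
import os
import sys


def make_stdout_binary() -> bool:
    """Put stdout in binary mode on Windows; return True if the call was made."""
    if sys.platform != "win32":
        return False  # no text/binary file-mode distinction to change here
    import msvcrt  # Windows-only standard-library module

    # Same idea as the C line: stop the runtime from rewriting LF as CR LF.
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    return True


if __name__ == "__main__":
    make_stdout_binary()
    # Write bytes through the underlying binary buffer, not the text wrapper.
    sys.stdout.buffer.write(b"line 1\nline 2\n")
    sys.stdout.buffer.flush()
```

The mode is a property of the process's own file descriptor, so it has to be set before any compressed bytes are written to it.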

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Jan Vodička
Thanks. I've already looked. The problem was that Windows replaced '\n' with '\r\n'; replacing the bytes back, '\r\n' -> '\n', solved the problem. It worked on a 16GB gzip package.
It would be nice to mention this behavior somewhere.

Jan Vodicka
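
A file of that size would not normally be loaded into memory at once. The sketch below shows one way to apply the same '\r\n' -> '\n' replacement in chunks; it is not the procedure actually used here, and `unmunge_stream` is a hypothetical name. The only subtle point is a CR that falls on the last byte of a chunk, which has to be held back until the next chunk shows whether an LF follows.

```python
import io


def unmunge_stream(src, dst, chunk_size=1 << 20):
    """Copy src to dst, turning each CR LF pair back into a bare LF."""
    pending_cr = False  # True when the previous chunk ended with a CR
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # The matching LF, if any, is in the next chunk: hold the CR back.
            chunk = chunk[:-1]
            pending_cr = True
        dst.write(chunk.replace(b"\r\n", b"\n"))
    if pending_cr:
        dst.write(b"\r")  # a trailing CR with no LF after it is real data


if __name__ == "__main__":
    damaged = io.BytesIO(b"\x1f\x8b\r\n\x00\r\r\n")
    repaired = io.BytesIO()
    unmunge_stream(damaged, repaired, chunk_size=3)
    print(repaired.getvalue())
```

For a file on disk, the two arguments would be file objects opened in "rb" and "wb" mode; the small chunk size in the demo is only there to exercise the boundary case.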



2012/10/13 Tom Lane <tgl@sss.pgh.pa.us>
> Craig Ringer <ringerc@ringerc.id.au> writes:
>> On 10/10/2012 02:38 AM, Tom Lane wrote:
>>> On Windows, that doesn't seem terribly surprising: Windows will probably
>>> do newline munging on the process's stdout, which will corrupt
>>> compressed data since it's not plain text.
>
>> pg_dump might want to refuse to write to stdout when in a non-plain-text
>> mode on Windows if that's the case.
>
> Actually, a look at the pg_dump code says that it does
>
>             setmode(fileno(stdout), O_BINARY);
>
> so either my diagnosis is wrong or there's some reason why that setting
> didn't take.
>
>             regards, tom lane



--
Jan Vodička

Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

From: Jan Vodička
That would definitely be a much more comfortable solution.
The problem really was the newlines, \n vs. \r\n. Replacing \r\n -> \n solved the problem.

Thanks a lot.


2012/10/13 Craig Ringer <ringerc@ringerc.id.au>
> On 10/10/2012 03:07 AM, Jan Vodička wrote:
>> = not able to unpack, invalid
>>
>> Try this one generated by "pg_dump -Z1 > backup.gz" in windows:
>> http://mstu.cz/~hrtlik/backup.gz (0.5kB)
>> original "pg_dump > backup.gz" without compression:
>> http://mstu.cz/~hrtlik/backup.sql
>>
>> If you have any way how to get original, tell me.
>
> If Tom is right and the issue is end-of-line transformation, in theory
> you might be able to un-mungle newlines. The chances of \r\n occurring
> naturally in a tiny backup like that are not huge, so any \r\n in the
> data probably used to be a raw \n. Taking a copy of the DB and
> performing that substitution might get you a usable backup file.
>
> That's replacing all \x0d\x0a sequences with \x0a. Or I might be wrong
> and it's \x0d.
>
> This won't work on a larger backup where some \r\n sequences will occur
> naturally in compressed binary data. In those you're likely to have a
> much, much bigger job ahead of you.
>
> --
> Craig Ringer



--
Jan Vodička