Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows - Mailing list pgsql-hackers

From Kuntal Ghosh
Subject Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows
Date
Msg-id CAGz5QCJ_Vgn+mBE_ZW31kOa6oT2JME9K8634qhEzPJmU2jX=0A@mail.gmail.com
In response to Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows  (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
List pgsql-hackers
On Fri, Mar 24, 2017 at 12:35 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 24 March 2017 at 14:07, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
>> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>> <kuntalghosh.2007@gmail.com> wrote:
>>> Hello,
>>> In Windows, if one needs to take a dump in plain text format (this is
>>> the default option, or can be specified using -Fp) with some level of
>>> compression (-Z[0-9]), an output file has to
>>> be specified. Otherwise, if the output is redirected to stdout, it'll
>>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>>> carriage returns in the file).
>> To reproduce the issue, please use the following command in windows cmd:
>>
>> pg_dump -Z 9 test > E:\test_xu.backup
>> pg_dump -Fp -Z 9 test > E:\test_xu.backup
>
> This is a known problem. It is not specific to PostgreSQL, it affects
> any software that attempts to use stdin/stdout on Windows via cmd,
> where it is not 8-bit clean.
>
> We don't just refuse to run with stdout as a destination because it's
> perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
> as I know, tell whether it's being invoked by cmd or something else.
AFAICU, if we use binary mode, the output is written byte for byte. In
ASCII mode, cmd pokes its nose in and does CR / LF conversions on its
own. So, whenever we want compression on a plain-text dump, we can set
the stdout mode to O_BINARY. Is that a wrong approach?
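
A minimal sketch of the idea, assuming the CR / LF conversion is tied to
the text mode of the stdout stream and can be switched off with the
Windows C runtime call _setmode(). The helper name set_stdout_binary is
hypothetical and is not taken from the pg_dump sources:

```c
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>   /* _O_BINARY */
#include <io.h>      /* _setmode, _fileno */
#endif

/*
 * Hypothetical helper (not from pg_dump): put stdout into binary mode so
 * that bytes are written unchanged, with no LF -> CR LF translation.
 * Returns 0 on success, -1 on failure. On non-Windows platforms there is
 * no text/binary distinction, so this is a no-op that returns 0.
 */
int
set_stdout_binary(void)
{
#ifdef _WIN32
    /* Flush anything already buffered before changing the mode. */
    fflush(stdout);
    if (_setmode(_fileno(stdout), _O_BINARY) == -1)
        return -1;
#endif
    return 0;
}
```

Such a call would have to happen before any compressed data is written
to stdout, and only when compression of a plain-text dump to stdout was
requested.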

> If you have concrete ideas on how to improve this they'd be welcomed.
> Is there anywhere you expected to find info in the docs? Do you know
> of a way to detect in Windows if the output stream is not 8-bit clean
> from within the application program? ... other?
Actually, I'm not that familiar with the Windows environment. But I
couldn't find any note to users in the pg_dump documentation regarding
this issue. In cmd, if someone needs a plain-text dump in compressed
format, they should specify the output file; otherwise they may run
into the above problem. However, if a dump has been corrupted by the
above issue, a fix for that is provided in [1]. Should we include this
in the documentation?



[1] http://www.gzip.org/
Use fixgz.c to remove the extra CR (carriage return) bytes.
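
For illustration only, the kind of repair described above can be sketched
as follows. This is not the fixgz.c source; strip_cr_before_lf is a
hypothetical helper that drops every CR byte immediately followed by LF.
Such a repair is best-effort: a CR LF pair that was genuinely part of
the compressed data is removed as well, so recovery is not guaranteed.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the fixgz.c source): copy len bytes from src
 * to dst, skipping each CR that is directly followed by LF. A CR that is
 * not followed by LF is kept. dst must have room for at least len bytes.
 * Returns the number of bytes written to dst.
 */
size_t
strip_cr_before_lf(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++)
    {
        /* Skip a CR only when the next byte exists and is LF. */
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;
        dst[out++] = src[i];
    }
    return out;
}
```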

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


