Re: [ADMIN] pg_dump bug? - Mailing list pgsql-admin

From Andy Nercessian
Subject Re: [ADMIN] pg_dump bug?
Date
Msg-id CAJWczd-hr9aJkXO4pED8fbWu69b9aHVPGWJupi65gaRHfPHtMw@mail.gmail.com
Whole thread Raw
In response to Re: [ADMIN] pg_dump bug?  (Scott Mead <scottm@openscg.com>)
List pgsql-admin
Thank you, Scott. 
That was my first instinct too, so I explored it by carrying out a manually typed pg_dump into psql. The result was identical. The reason I eventually ruled it out is that the nested copy statements are all identically positioned in each pg_dump. It would require some sort of asynchronous writing on the file for this to happen, and in that event, it seems highly improbable that the locations of all these nested statements would be identical.
There is something even stranger. The restore works fine- i.e. there is no evidence that the additional copy statements are actually carried out. Given this fact, I wouldn't bother with this at all, were it not for the fact that the pg_dump files are many many times larger than they need to be.

On Tue, Apr 11, 2017 at 1:33 PM, Scott Mead <scottm@openscg.com> wrote:


> On Apr 11, 2017, at 05:20, Andy Nercessian <ahnercessian@gmail.com> wrote:
>
> I wonder if anyone has experienced the problem I have with bloated pg_dump files. This problem exists in both standard pg_dumps and --insert-columns type pg_dumps. I am using postgresql 9.5 on Windows 7.
>
> Each table appears to have a 'copy table_name .... from stdin..' in multiple places (hundreds in some cases). The first copy statement appears genuine, but the second and subsequent statements appear to be in the middle of another copy clause between records. The outer copy clause is sometimes from the same table, and sometimes from a different table. Furthermore, from the existence of the number of backslashes \t, \\t, \\\\t which are variable, it would appear as though nesting can be several levels deep.
>
> I'm not copying the actual data because apart from anything else, it's a giant text file. But this is roughly what it looks like:
>
> ------------------------------------
> copy bills from ....
> Record 1 Column 1 \t Record 1 column 2 \t ....
> Record 2 Column 2 \t Record 2 column 2 \t ...
> ......
> copy adverts from .... < this is a nested copy statement unrelated to bills, although occasionally it's another bills copy statement>
>
> Record 88 Column 1 \t Record 88 Column 2 \t ...
>
> -------------------------------------
>
> So in the middle of one copy statement - in between the records being copied, another one mysteriously appears and the records are duplicated. Each record can therefore be found in each of the copy statements, and the total number of such duplications is in the hundreds for some tables.
>
> What could be causing this? I've made only the following observations:
> 1. The nested loops always occur in the same spot within the outer copy statements (if records have not changed between dumps) and are preceded by the exact same column of the same record in each dump.
> 2. It is clear that these copy statements are NOT part of any record.
> 3. The same behaviour exists for insert style dumps (where there is no copy command).
> 4. Each copy statement includes the entire table, NOT a selection of records from it.
>
> In years of using pg_dump I have never come across this issue.

Is it possible that you have more than 1 pg_dump running simultaneously against the same output file?



>
>
> --
> Andy



--
Andy Nercessian
+302109427400

pgsql-admin by date:

Previous
From: Scott Mead
Date:
Subject: Re: [ADMIN] pg_dump bug?
Next
From: "Gone, Sajan"
Date:
Subject: [ADMIN] Postgres replication