Thread: pg_dump - lost synchronization with server: got message type "d", length 6036499

Can someone shed some light on what's happening here?

D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump: Dumping the contents of table "uploadeddatafiles" failed: PQgetCopyData() failed.
pg_dump: Error message from server: lost synchronization with server: got message type "d", length 6036499
pg_dump: The command was: COPY public.uploadeddatafiles (id, username, projectid, aspsession, filename, filetype, filesize, filedata, uploadedon, "timestamp") TO stdout;

PostgreSQL 8.3.1 server on Windows Server 2003 (w2k3).
pg_dump 8.3.1 and 8.3.3 on the same machine both do the same thing.

The data directory and backup directory are excluded from the virus scanner.

The following variations also failed.  The length seems to be either
6036499 or 8435588 (84... is most common).
pg_dump -Z 9 -Fp -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump -Fp -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2

uploadeddatafiles holds Excel spreadsheets in its filedata column (bytea).

sheepcrc2=# select count(*) from uploadeddatafiles;
 count
-------
   405
(1 row)

sheepcrc2=# select sum(length(filedata)) from uploadeddatafiles;
    sum
-----------
 271067619
(1 row)

This completes and returns all rows:
select id, md5(filedata) from uploadeddatafiles;

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au
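
A minimal standalone libpq client that drives the same COPY TO STDOUT /
PQgetCopyData path pg_dump uses would show whether the failure is specific
to pg_dump or reproducible from any libpq program on this machine.  The
following is only a sketch; the connection string is a placeholder and
would need adjusting:

/*
 * Minimal COPY reader: streams the failing table to the client and
 * reports how many bytes arrived, or where PQgetCopyData failed.
 * Build roughly as: cc copytest.c -I<pgsql include> -L<pgsql lib> -lpq
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn;
    PGresult   *res;
    char       *buf;
    long long   total = 0;
    int         len;

    conn = PQconnectdb("dbname=sheepcrc2 user=postgres");   /* adjust */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    res = PQexec(conn, "COPY public.uploadeddatafiles TO STDOUT");
    if (PQresultStatus(res) != PGRES_COPY_OUT)
    {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return 1;
    }
    PQclear(res);

    /* PQgetCopyData returns the row length, -1 at end of copy,
     * or -2 on error (the call that is failing inside pg_dump). */
    while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
    {
        total += len;
        PQfreemem(buf);
    }

    if (len == -2)
        fprintf(stderr, "PQgetCopyData failed: %s", PQerrorMessage(conn));
    else
        printf("copied %lld bytes\n", total);

    /* drain the final command result */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);
    PQfinish(conn);
    return 0;
}

If this fails the same way on the Windows box but succeeds when run from
another client, that points at the client environment rather than the
server.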


Klint Gore <kgore4@une.edu.au> writes:
> Can someone shed some light on what's happening here?
> D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
> pg_dump: Dumping the contents of table "uploadeddatafiles" failed: PQgetCopyData() failed.
> pg_dump: Error message from server: lost synchronization with server: got message type "d", length 6036499

Hmm ... I think what is actually happening here is that pg_dump doesn't
have enough memory available to buffer the message.  The only places
where libpq could report that error text with those parameters are
where pqCheckInBufferSpace has failed to enlarge the input buffer
sufficiently.  Per the code comment:

                /*
                 * XXX add some better recovery code... plan is to skip over
                 * the message using its length, then report an error. For the
                 * moment, just treat this like loss of sync (which indeed it
                 * might be!)
                 */

6 meg doesn't seem particularly enormous though.  Are you running
pg_dump under some especially restrictive user limits?  Maybe it's
dying here after having leaked a lot of memory for some other reason
--- try watching the pg_dump process size while it runs.

            regards, tom lane
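
To make the failure mode concrete, here is a much-simplified, self-contained
illustration of the behaviour described above: the client grows its input
buffer by doubling until the incoming message fits, and if that allocation
fails the message is reported as a loss of synchronization.  This is not the
real libpq source; the struct and function names are invented for the example.

#include <stdio.h>
#include <stdlib.h>

struct fake_conn
{
    char   *inBuffer;
    size_t  inBufSize;
};

/* Grow the input buffer by doubling until it can hold "needed" bytes.
 * Returns 0 on success, -1 if the allocation fails. */
static int
check_in_buffer_space(struct fake_conn *conn, size_t needed)
{
    size_t  newsize = conn->inBufSize ? conn->inBufSize : 8192;
    char   *newbuf;

    while (newsize < needed)
        newsize *= 2;

    newbuf = realloc(conn->inBuffer, newsize);
    if (newbuf == NULL)
        return -1;

    conn->inBuffer = newbuf;
    conn->inBufSize = newsize;
    return 0;
}

int
main(void)
{
    struct fake_conn conn = {NULL, 0};
    char    msg_type = 'd';           /* CopyData */
    size_t  msg_length = 6036499;     /* the length from the error above */

    if (check_in_buffer_space(&conn, msg_length) < 0)
    {
        /* Per the code comment quoted above, better recovery would skip
         * the message using its length; for now the failure is reported
         * as loss of sync, which is the text pg_dump prints. */
        fprintf(stderr,
                "lost synchronization with server: got message type \"%c\", length %zu\n",
                msg_type, msg_length);
        return 1;
    }

    printf("buffer grown to %zu bytes, message could be read\n",
           conn.inBufSize);
    free(conn.inBuffer);
    return 0;
}

In this simplified form the realloc() only asks for roughly 8 MB, so for it
to fail in the real pg_dump the process would normally have to be short on
memory or address space for some other reason, which is what the questions
about pg_dump's memory use are getting at.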

Tom Lane wrote:
> Klint Gore <kgore4@une.edu.au> writes:
> > Can someone shed some light on what's happening here?
> > D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
> > pg_dump: Dumping the contents of table "uploadeddatafiles" failed: PQgetCopyData() failed.
> > pg_dump: Error message from server: lost synchronization with server: got message type "d", length 6036499
>
> 6 meg doesn't seem particularly enormous though.  Are you running
> pg_dump under some especially restrictive user limits?  Maybe it's
> dying here after having leaked a lot of memory for some other reason
> --- try watching the pg_dump process size while it runs.
>
>
I'm running it under my own account which has administrator rights.

Peak memory usage was about 540m which brought the total usage for the
machine to about half the physical memory allocated (3g total).
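
One way to watch the pg_dump process size more precisely than Task Manager
is a small polling loop around the Win32 GetProcessMemoryInfo() call.  Only
a sketch: the PID has to be looked up separately (Task Manager or tasklist),
and it needs to be linked against psapi.lib.

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <psapi.h>

int
main(int argc, char **argv)
{
    DWORD   pid;
    HANDLE  proc;
    PROCESS_MEMORY_COUNTERS pmc;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid = (DWORD) atoi(argv[1]);

    proc = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
                       FALSE, pid);
    if (proc == NULL)
    {
        fprintf(stderr, "OpenProcess failed: %lu\n", GetLastError());
        return 1;
    }

    /* poll once a second until the target process goes away */
    while (GetProcessMemoryInfo(proc, &pmc, sizeof(pmc)))
    {
        printf("working set: %lu KB   peak: %lu KB\n",
               (unsigned long) (pmc.WorkingSetSize / 1024),
               (unsigned long) (pmc.PeakWorkingSetSize / 1024));
        Sleep(1000);
    }

    CloseHandle(proc);
    return 0;
}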

Is there a binary debug build for win32 somewhere?

klint.



Klint Gore <kgore4@une.edu.au> writes:
> Tom Lane wrote:
>> Maybe it's
>> dying here after having leaked a lot of memory for some other reason
>> --- try watching the pg_dump process size while it runs.

> Peak memory usage was about 540m which brought the total usage for the
> machine to about half the physical memory allocated (3g total).

Well, that might well explain the failure.  pg_dump does suck a lot of
schema information into memory at startup, but 540m seems excessive.
Maybe you've found a memory leak in pg_dump (it wouldn't be the first
one).  Does this database have a particularly large number of objects?

            regards, tom lane

Tom Lane wrote:
> Klint Gore <kgore4@une.edu.au> writes:
> > Tom Lane wrote:
> >> Maybe it's
> >> dying here after having leaked a lot of memory for some other reason
> >> --- try watching the pg_dump process size while it runs.
>
> > Peak memory usage was about 540m which brought the total usage for the
> > machine to about half the physical memory allocated (3g total).
>
> Well, that might well explain the failure.  pg_dump does suck a lot of
> schema information into memory at startup, but 540m seems excessive.
> Maybe you've found a memory leak in pg_dump (it wouldn't be the first
> one).  Does this database have a particularly large number of objects?
>
>             regards, tom lane
>
>
I wouldn't call it large - 27 tables, 111 functions,  21 custom types
(used for set returning function results).

The table with the biggest row count has about 200k records (its structure
is int, int, timestamp).

The biggest physical table is the one that's failing.  The table itself
is physically 81m and its toast table is 82m.

klint.



Klint Gore <kgore4@une.edu.au> writes:
> Tom Lane wrote:
>> Maybe you've found a memory leak in pg_dump (it wouldn't be the first
>> one).  Does this database have a particularly large number of objects?

> I wouldn't call it large - 27 tables, 111 functions,  21 custom types
> (used for set returning function results).

[ squint... ]  Hard to see how that could be eating half a gig of
pg_dump memory space.

Would you be willing to send me a "pg_dump -s" (ie, just schema no
data) dump of this DB?  Off-list of course.

            regards, tom lane

Tom Lane wrote:
> Klint Gore <kgore4@une.edu.au> writes:
> > Tom Lane wrote:
> >> Maybe you've found a memory leak in pg_dump (it wouldn't be the first
> >> one).  Does this database have a particularly large number of objects?
>
> > I wouldn't call it large - 27 tables, 111 functions,  21 custom types
> > (used for set returning function results).
>
> [ squint... ]  Hard to see how that could be eating half a gig of
> pg_dump memory space.
>
> Would you be willing to send me a "pg_dump -s" (ie, just schema no
> data) dump of this DB?  Off-list of course.
>
>
Upon further investigation, pg_dump 8.3.1 on Linux (Ubuntu Feisty), run
across the network, completes properly.

klint.



Klint Gore <kgore4@une.edu.au> writes:
> Tom Lane wrote:
>> Would you be willing to send me a "pg_dump -s" (ie, just schema no
>> data) dump of this DB?  Off-list of course.
>>
> attached.  created with pg_dump 8.3.3 win32 to the 8.3.1 win32 server.

Thanks.  When I reload this here and try to pg_dump it, I see no evidence
of any problem --- pg_dump completes pretty quickly and seems to eat
no more than a megabyte or so.  I'm not sure what to make of that.
Could there be a Windows-specific memory problem in pg_dump?

            regards, tom lane

On Thu, Jul 3, 2008 at 1:49 AM, Klint Gore <kgore4@une.edu.au> wrote:

> I'm running it under my own account which has administrator rights.
> Peak memory usage was about 540m which brought the total usage for the
> machine to about half the physical memory allocated (3g total).
> Is there a binary debug build for win32 somewhere?

If you're using 8.3.x, we include an archive of symbol files alongside
the installer on the download site.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

Tom Lane wrote:
> Klint Gore <kgore4@une.edu.au> writes:
> > Tom Lane wrote:
> >> Would you be willing to send me a "pg_dump -s" (ie, just schema no
> >> data) dump of this DB?  Off-list of course.
> >>
> > attached.  created with pg_dump 8.3.3 win32 to the 8.3.1 win32 server.
>
> Thanks.  When I reload this here and try to pg_dump it, I see no evidence
> of any problem --- pg_dump completes pretty quickly and seems to eat
> no more than a megabyte or so.  I'm not sure what to make of that.
> Could there be a Windows-specific memory problem in pg_dump?
>
Any more thoughts on this?

The file being produced when dumping the table in question is about 1 gig
when it crashes.  In the last check-in for fe-protocol3.c there are notes
about 1 gig strings, so I'm wondering whether my data goes over that limit
when received by pg_dump (or when output by the 8.3.1 server).  270m of
bytea converted to \xxx octal strings would make something > 1 gig.  But
the 270m is spread over 400 rows with nothing bigger than 3m.
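
As a rough check of that arithmetic (the escape expansion for bytea in COPY
text output is data dependent, so roughly 4 to 5 output characters per
binary input byte is only an estimate):

/*
 * Back-of-the-envelope estimate of the octal-escape expansion described
 * above, for the whole table and for the biggest single row.
 */
#include <stdio.h>

int
main(void)
{
    const double GB = 1024.0 * 1024.0 * 1024.0;
    long long   total_bytes = 271067619LL;          /* sum(length(filedata)) */
    long long   biggest_row = 3LL * 1024 * 1024;    /* "nothing bigger than 3m" */
    int         factor;

    for (factor = 4; factor <= 5; factor++)
        printf("x%d: whole table ~ %.2f GB, biggest single row ~ %.1f MB\n",
               factor,
               total_bytes * factor / GB,
               biggest_row * factor / (1024.0 * 1024.0));
    return 0;
}

So the dump file as a whole can easily pass 1 gig, but no single row, and
hence no single CopyData message, should come anywhere near it, which is
consistent with the 6-8 MB message lengths in the errors above.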

If there's no interest in researching this any further, I'll stop the
server, take a file system copy of the data directory, upgrade it to
8.3.3, and pray the problem goes away.

klint.
