Thread: Bad pg_dump error message

Bad pg_dump error message

From: Mike Christensen
Is there something that can be done smarter with this error message?


pg_dump: dumping contents of table pages
pg_dump: [tar archiver] archive member too large for tar format
pg_dump: *** aborted because of error


If there are any hard limits (like memory or maximum file size) that can
be checked before it spends two hours downloading the data, that might
be a better experience..  I'd be happy to file a bug..  Thanks!!

Mike


Re: Bad pg_dump error message

From: Jeff Janes
On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> Is there something that can be done smarter with this error message?
>
>
> pg_dump: dumping contents of table pages
> pg_dump: [tar archiver] archive member too large for tar format
> pg_dump: *** aborted because of error

Maybe it could tell you what the maximum allowed length is, for future
reference.

>
> If there are any hard limits (like memory or maximum file size) that can
> be checked before it spends two hours downloading the data,

There is no efficient way for it to know for certain in advance how
much space the data will take, until it has seen the data.  Perhaps it
could make an estimate, but that could suffer from both false
positives and false negatives.

The docs for pg_dump do mention it has an 8GB limit for individual
tables.  I don't see what more than that warning can reasonably be
done.  It looks like it dumps an entire table to a temp file first, so
I guess it could throw the error at the point the temp file exceeds
that size, rather than waiting for the table to be completely dumped
and then attempting to add it to the archive.  But that would break
modularity some, and you could still have dumped 300 7.5GB tables
before getting to the 8.5GB one which causes the error.
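
In the meantime you can do a rough check yourself before dumping.
On-disk size is only a loose proxy for dumped size, but something like
this (an untested sketch; "mydb" and the 7GB threshold are made up)
would flag the tables most likely to hit the tar limit:

psql -d mydb -c "SELECT relname, pg_size_pretty(pg_table_size(oid)) AS size
  FROM pg_class WHERE relkind = 'r' AND pg_table_size(oid) > 7 * 1024^3
  ORDER BY pg_table_size(oid) DESC"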

Cheers,

Jeff


Re: Bad pg_dump error message

From: Tom Lane
Jeff Janes <jeff.janes@gmail.com> writes:
> On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>> Is there something that can be done smarter with this error message?
>>
>> pg_dump: dumping contents of table pages
>> pg_dump: [tar archiver] archive member too large for tar format
>> pg_dump: *** aborted because of error

> There is no efficient way for it to know for certain in advance how
> much space the data will take, until it has seen the data.  Perhaps it
> could make an estimate, but that could suffer from both false
> positives and false negatives.

Maybe the docs should warn people away from tar format more vigorously.
Unless you actually have a reason to disassemble the archive with tar,
that format has no redeeming social value that I can see, and it
definitely has gotchas.  (This isn't the only one, IIRC.)

            regards, tom lane


Re: Bad pg_dump error message

From: Mike Christensen
On Mon, Sep 10, 2012 at 9:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>>> Is there something that can be done smarter with this error message?
>>>
>>> pg_dump: dumping contents of table pages
>>> pg_dump: [tar archiver] archive member too large for tar format
>>> pg_dump: *** aborted because of error
>
>> There is no efficient way for it to know for certain in advance how
>> much space the data will take, until it has seen the data.  Perhaps it
>> could make an estimate, but that could suffer from both false
>> positives and false negatives.
>
> Maybe the docs should warn people away from tar format more vigorously.
> Unless you actually have a reason to disassemble the archive with tar,
> that format has no redeeming social value that I can see, and it
> definitely has gotchas.  (This isn't the only one, IIRC.)

Gotcha.  I ended up just using "plain" format which worked well, even
though the file was about 60 gigs and I had to clear out some hard
disk space first.
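
In hindsight, piping the plain dump through a compressor would have
avoided clearing out disk space first; a rough sketch, with made-up
database names:

pg_dump mydb | gzip > mydb.sql.gz
gunzip -c mydb.sql.gz | psql mydb_restored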

Is the TAR format just the raw SQL commands, tar'ed and then sent
over the wire?  It'd be cool if there were some compressed "binary"
backup of a database that could be easily downloaded, or even better,
a way to just move an entire database between server instances in one
go..  Maybe there is a tool that does that and I just don't know about
it :)

Anyway, I'm all upgraded to 9.2.  Decided I might as well since I'm
launching my site in 3 weeks, and won't get another chance to upgrade
for a while..

Mike


Re: Bad pg_dump error message

From: Mike Christensen
On Mon, Sep 10, 2012 at 10:06 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> On Mon, Sep 10, 2012 at 9:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jeff Janes <jeff.janes@gmail.com> writes:
>>> On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>>>> Is there something that can be done smarter with this error message?
>>>>
>>>> pg_dump: dumping contents of table pages
>>>> pg_dump: [tar archiver] archive member too large for tar format
>>>> pg_dump: *** aborted because of error
>>
>>> There is no efficient way for it to know for certain in advance how
>>> much space the data will take, until it has seen the data.  Perhaps it
>>> could make an estimate, but that could suffer from both false
>>> positives and false negatives.
>>
>> Maybe the docs should warn people away from tar format more vigorously.
>> Unless you actually have a reason to disassemble the archive with tar,
>> that format has no redeeming social value that I can see, and it
>> definitely has gotchas.  (This isn't the only one, IIRC.)
>
> Gotcha.  I ended up just using "plain" format which worked well, even
> though the file was about 60 gigs and I had to clear out some hard
> disk space first.
>
> Is the TAR format just the raw SQL commands, tar'ed and then sent
> over the wire?  It'd be cool if there were some compressed "binary"
> backup of a database that could be easily downloaded, or even better,
> a way to just move an entire database between server instances in one
> go..  Maybe there is a tool that does that and I just don't know about
> it :)
>
> Anyway, I'm all upgraded to 9.2.  Decided I might as well since I'm
> launching my site in 3 weeks, and won't get another chance to upgrade
> for a while..

Oh reading the online docs, it looks like what I may have wanted was:

--format=custom

"Output a custom-format archive suitable for input into pg_restore.
Together with the directory output format, this is the most flexible
output format in that it allows manual selection and reordering of
archived items during restore. This format is also compressed by
default."

I take it this is one big file (with no size limits) that's compressed
over the wire.  Next time I'll try this!
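
A minimal sketch of that, with made-up database and file names:

pg_dump --format=custom --file=mydb.dump mydb   # compressed by default
pg_restore --dbname=mydb_copy mydb.dump         # target database must already exist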

Mike


Re: Bad pg_dump error message

From: Tom Lane
Mike Christensen <mike@kitchenpc.com> writes:
> Oh reading the online docs, it looks like what I may have wanted was:
> --format=custom

Right.  That does everything tar format does, only better --- the only
thing tar format beats it at is you can disassemble it with tar.  Back
in the day that seemed like a nice thing, open standards and all; but
the 8GB per file limit is starting to get annoying.
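
The "manual selection and reordering" the docs mention looks roughly
like this (file names made up):

pg_restore -l mydb.dump > items.list    # write out the table of contents
# edit items.list to delete or reorder entries, then:
pg_restore -L items.list -d mydb_copy mydb.dump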

            regards, tom lane


Re: Bad pg_dump error message

From: Mike Christensen
On Mon, Sep 10, 2012 at 10:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Mike Christensen <mike@kitchenpc.com> writes:
>> Oh reading the online docs, it looks like what I may have wanted was:
>> --format=custom
>
> Right.  That does everything tar format does, only better --- the only
> thing tar format beats it at is you can disassemble it with tar.  Back
> in the day that seemed like a nice thing, open standards and all; but
> the 8GB per file limit is starting to get annoying.

Thanks for your help, Tom!

Next time I'll read the docs.  "custom" is kind of a weird name for it;
I think I just glazed over it, thinking it was some way to define my
own format or something..

Mike


Re: Bad pg_dump error message

From: Tom Lane
Mike Christensen <mike@kitchenpc.com> writes:
>> Is the TAR format just the raw SQL commands, tar'ed and then sent
>> over the wire?

Sorta.  If you pull it apart with tar, you'll find out there's a SQL
script that creates the database schema, and then a separate tar-member
file containing the data for each table.

Custom format does this a little better by splitting the information up
into a separate archive item for each database object.  But there's no
tool other than pg_restore that knows how to deconstruct custom-format
archives.
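
Roughly, with hypothetical archive names:

tar tf mydb.tar          # toc.dat, one NNNN.dat member per table, restore.sql
pg_restore -l mydb.dump  # one TOC entry per database object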

            regards, tom lane


Re: Bad pg_dump error message

From: Peter Eisentraut
On Tue, 2012-09-11 at 01:21 -0400, Tom Lane wrote:
> Mike Christensen <mike@kitchenpc.com> writes:
> > Oh reading the online docs, it looks like what I may have wanted was:
> > --format=custom
>
> Right.  That does everything tar format does, only better --- the only
> thing tar format beats it at is you can disassemble it with tar.  Back
> in the day that seemed like a nice thing, open standards and all; but
> the 8GB per file limit is starting to get annoying.

We could change the tar code to produce POSIX 2001 format archives,
which don't have that limitation.  But if someone wanted to do some work
in this area, it might be more useful to look into a zip-based format.


Re: Bad pg_dump error message

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> We could change the tar code to produce POSIX 2001 format archives,
> which don't have that limitation.  But if someone wanted to do some work
> in this area, it might be more useful to look into a zip-based format.

I find it doubtful that it's worth spending effort on either.  How often
have you heard of someone actually disassembling an archive with tar?
What reason is there to think that zip would be any more useful?  Really
pg_restore is where the action is.

            regards, tom lane


Re: Bad pg_dump error message

From: Jasen Betts
On 2012-09-11, Mike Christensen <mike@kitchenpc.com> wrote:

> Is the TAR format just the raw SQL commands, tar'ed and then sent
> over the wire?  It'd be cool if there were some compressed "binary"
> backup of a database that could be easily downloaded, or even better,
> a way to just move an entire database between server instances in one
> go..  Maybe there is a tool that does that and I just don't know about it

You can stream pg_dump output any way you like:

directly,

pg_dump "connection-string" | psql "connection-string"

over ssh,

pg_dump "connection-string" | ssh newserver 'psql "connection-string"'

with streaming compression,

pg_dump "connection-string" | bzip2 | ssh newserver 'bunzip2 | psql "connection-string"'

or, if I don't need an encrypted channel, I use netcat for transport,
roughly like this (port made up; nc flags vary between implementations):
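
# on the receiving server:
nc -l 9999 | psql "connection-string"

# then on the sending server:
pg_dump "connection-string" | nc newserver 9999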

--
⚂⚃ 100% natural