Thread: Bad pg_dump error message
Is there something that can be done smarter with this error message?

pg_dump: dumping contents of table pages
pg_dump: [tar archiver] archive member too large for tar format
pg_dump: *** aborted because of error

If there are any hard limits (like memory, or RAM) that can be checked before it spends two hours downloading the data, that might be a better experience. I'd be happy to file a bug. Thanks!

Mike
On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> Is there something that can be done smarter with this error message?
>
> pg_dump: dumping contents of table pages
> pg_dump: [tar archiver] archive member too large for tar format
> pg_dump: *** aborted because of error

Maybe it could tell you what the maximum allowed length is, for future reference.

> If there's any hard limits (like memory, or RAM) that can be checked
> before it spends two hours downloading the data,

There is no efficient way for it to know for certain in advance how much space the data will take, until it has seen the data. Perhaps it could make an estimate, but that could suffer from both false positives and false negatives.

The docs for pg_dump do mention an 8GB limit for individual tables, and I don't see how much more than that warning can reasonably be done. It looks like it dumps an entire table to a temp file first, so I guess it could throw the error at the point the temp file exceeds that size, rather than waiting for the table to be completely dumped and then attempting to add it to the archive. But that would break modularity some, and you could still have dumped 300 7.5GB tables before getting to the 8.5GB one that causes the error.

Cheers,

Jeff
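(A rough pre-flight check, not something pg_dump does itself: compare on-disk table sizes against the 8GB tar member limit before choosing the tar format. The query below is only a sketch; on-disk size is just a proxy for the size of the dumped COPY text, so treat the threshold as approximate, and it only looks at ordinary tables.)

  psql "connection-string" -At -c "SELECT relname, pg_size_pretty(pg_table_size(oid)) FROM pg_class WHERE relkind = 'r' AND pg_table_size(oid) > 8589934592"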
Jeff Janes <jeff.janes@gmail.com> writes:
> On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>> Is there something that can be done smarter with this error message?
>>
>> pg_dump: dumping contents of table pages
>> pg_dump: [tar archiver] archive member too large for tar format
>> pg_dump: *** aborted because of error

> There is no efficient way for it to know for certain in advance how
> much space the data will take, until it has seen the data. Perhaps it
> could make an estimate, but that could suffer from both false
> positives and false negatives.

Maybe the docs should warn people away from tar format more vigorously. Unless you actually have a reason to disassemble the archive with tar, that format has no redeeming social value that I can see, and it definitely has gotchas. (This isn't the only one, IIRC.)

regards, tom lane
On Mon, Sep 10, 2012 at 9:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Mon, Sep 10, 2012 at 5:27 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>>> Is there something that can be done smarter with this error message?
>>>
>>> pg_dump: dumping contents of table pages
>>> pg_dump: [tar archiver] archive member too large for tar format
>>> pg_dump: *** aborted because of error
>
>> There is no efficient way for it to know for certain in advance how
>> much space the data will take, until it has seen the data. Perhaps it
>> could make an estimate, but that could suffer from both false
>> positives and false negatives.
>
> Maybe the docs should warn people away from tar format more vigorously.
> Unless you actually have a reason to disassemble the archive with tar,
> that format has no redeeming social value that I can see, and it
> definitely has gotchas. (This isn't the only one, IIRC.)

Gotcha. I ended up just using "plain" format, which worked well, even though the file was about 60 gigs and I had to clear out some hard disk space first.

Is the TAR format just the raw SQL commands, tar'ed and then sent over the wire? It'd be cool if there were some compressed "binary" backup of a database that could be easily downloaded, or even better, a way to just move an entire database between server instances in one go. Maybe there is a tool that does that, I just don't know about it :)

Anyway, I'm all upgraded to 9.2. Decided I might as well, since I'm launching my site in 3 weeks and won't get another chance to upgrade for a while.

Mike
On Mon, Sep 10, 2012 at 10:06 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> On Mon, Sep 10, 2012 at 9:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Maybe the docs should warn people away from tar format more vigorously.
>> Unless you actually have a reason to disassemble the archive with tar,
>> that format has no redeeming social value that I can see, and it
>> definitely has gotchas. (This isn't the only one, IIRC.)
>
> Gotcha. I ended up just using "plain" format which worked well, even
> though the file was about 60 gigs and I had to clear out some hard
> disk space first.
>
> Is the TAR format just the raw SQL commands, just tar'ed and then sent
> over the wire? It'd be cool if there was some compressed "binary"
> backup of a database that could be easily downloaded, or even better,
> a way to just move an entire database between server instances in one
> go. Maybe there is a tool that does that, I just don't know about it :)
>
> Anyway, I'm all upgraded to 9.2. Decided I might as well since I'm
> launching my site in 3 weeks, and won't get another chance to upgrade
> for a while.

Oh, reading the online docs, it looks like what I may have wanted was:

--format=custom

"Output a custom-format archive suitable for input into pg_restore. Together with the directory output format, this is the most flexible output format in that it allows manual selection and reordering of archived items during restore. This format is also compressed by default."

I take it this is one big file (with no size limits) that's compressed over the wire. Next time I'll try this!

Mike
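(For reference, a custom-format round trip looks roughly like this; the file and database names here are invented for the example. The --list output can be edited and fed back with --use-list to do the "manual selection and reordering" the docs mention.)

  pg_dump --format=custom --file=kitchenpc.dump "connection-string"
  pg_restore --list kitchenpc.dump                  # show the archive's table of contents
  pg_restore --dbname=newdb kitchenpc.dump          # restore everything into newdb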
Mike Christensen <mike@kitchenpc.com> writes:
> Oh, reading the online docs, it looks like what I may have wanted was:
> --format=custom

Right. That does everything tar format does, only better --- the only thing tar format beats it at is you can disassemble it with tar. Back in the day that seemed like a nice thing, open standards and all; but the 8GB per file limit is starting to get annoying.

regards, tom lane
On Mon, Sep 10, 2012 at 10:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Mike Christensen <mike@kitchenpc.com> writes:
>> Oh, reading the online docs, it looks like what I may have wanted was:
>> --format=custom
>
> Right. That does everything tar format does, only better --- the only
> thing tar format beats it at is you can disassemble it with tar. Back
> in the day that seemed like a nice thing, open standards and all; but
> the 8GB per file limit is starting to get annoying.

Thanks for your help, Tom! Next time I'll read the docs. "custom" is kind of a weird name for it; I think I just glazed over it, thinking it was some way to define my own format or something.

Mike
Mike Christensen <mike@kitchenpc.com> writes:
>> Is the TAR format just the raw SQL commands, just tar'ed and then sent
>> over the wire?

Sorta. If you pull it apart with tar, you'll find there's a SQL script that creates the database schema, and then a separate tar-member file containing the data for each table. Custom format does this a little better by splitting the information up into a separate archive item for each database object. But there's no tool other than pg_restore that knows how to deconstruct custom-format archives.

regards, tom lane
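(Concretely, you can look at the two side by side; archive names are hypothetical, and the tar member names shown are what pg_dump's tar format typically emits.)

  tar -tf backup.tar          # typically toc.dat, restore.sql, and one NNNN.dat data member per table
  pg_restore -l backup.dump   # the equivalent table of contents for a custom-format archive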
On Tue, 2012-09-11 at 01:21 -0400, Tom Lane wrote:
> Mike Christensen <mike@kitchenpc.com> writes:
>> Oh, reading the online docs, it looks like what I may have wanted was:
>> --format=custom
>
> Right. That does everything tar format does, only better --- the only
> thing tar format beats it at is you can disassemble it with tar. Back
> in the day that seemed like a nice thing, open standards and all; but
> the 8GB per file limit is starting to get annoying.

We could change the tar code to produce POSIX 2001 format archives, which don't have that limitation. But if someone wanted to do some work in this area, it might be more useful to look into a zip-based format.
Peter Eisentraut <peter_e@gmx.net> writes:
> We could change the tar code to produce POSIX 2001 format archives,
> which don't have that limitation. But if someone wanted to do some work
> in this area, it might be more useful to look into a zip-based format.

I find it doubtful that it's worth spending effort on either. How often have you heard of someone actually disassembling an archive with tar? What reason is there to think that zip would be any more useful? Really, pg_restore is where the action is.

regards, tom lane
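(A couple of things pg_restore can do with a custom-format archive that disassembling with tar wouldn't help with; the table, database, and file names here are invented for the example.)

  pg_restore -d newdb -t pages backup.dump    # restore only the pages table
  pg_restore -d newdb -j 4 backup.dump        # restore in parallel with 4 jobs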
On 2012-09-11, Mike Christensen <mike@kitchenpc.com> wrote:
> Is the TAR format just the raw SQL commands, just tar'ed and then sent
> over the wire? It'd be cool if there was some compressed "binary"
> backup of a database that could be easily downloaded, or even better,
> a way to just move an entire database between server instances in one
> go. Maybe there is a tool that does that, I just don't know about it

You can stream pg_dump any way you like:

directly,

  pg_dump "connection-string" | psql "connection-string"

over ssh,

  pg_dump "connection-string" | ssh newserver 'psql "connection-string"'

with streaming compression,

  pg_dump "connection-string" | bzip2 | ssh newserver 'bunzip2 | psql "connection-string"'

or, if I don't need an encrypted channel, I use netcat for transport (not shown).

-- 
⚂⚃ 100% natural
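(The netcat transport isn't spelled out above; one possible shape for it, assuming a port such as 9000 is reachable between the hosts, and noting that the exact -l syntax differs between netcat flavors:)

on the receiving host,

  nc -l 9000 | bunzip2 | psql "connection-string"

on the sending host,

  pg_dump "connection-string" | bzip2 | nc newserver 9000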