Thread: Problem w/ dumping huge table and no disk space
Help if you would please :)

I have a 10-million+ row table and I've only got a couple hundred megs left. I can't delete any rows: pg runs out of disk space and crashes. I can't pg_dump with compression either; the output file is started and has the schema and a bit of other info comprising about 650 bytes, then it runs for 30 minutes and pg runs out of disk space and crashes. My pg_dump command is:

    pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog

I want to dump this database (the entire pgsql dir is just over two gigs) and put it on another, larger machine.

I can't afford to lose this information. Are there any helpful hints? I'll be happy to provide more information if desired.

David
On Fri, 7 Sep 2001, David Ford wrote:

> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left. I can't delete any rows, pg runs out of disk space and crashes.
> I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes. My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".

Try piping the output into ssh or something similar; you don't have to keep it on the local machine. From the bigger machine, something like

    ssh server-with-data "pg_dump <options>" > syslog-dump

or, from the smaller machine,

    pg_dump <options> | ssh big-machine "cat > syslog-dump"

should do the trick. Maybe you can even pipe the output directly into psql or pg_restore. Make sure pg_dump writes its output to stdout.

HTH.

--
Alvaro Herrera (<alvherre[@]atentus.com>)
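For illustration, a rough sketch of the "pipe it straight into the other server" idea; the hostname and database name below are placeholders, and the target database is assumed to already exist (createdb on the big machine first):

    # plain-format dump of the one table, streamed straight into psql on the larger machine
    pg_dump -t syslog syslog | ssh big-machine "psql syslog"

A custom-format dump could in principle be fed to pg_restore the same way, but pg_restore is happier with a seekable file, so landing the dump in a file on the big machine first is the safer bet.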
David Ford <david@blue-labs.org> writes:
> I have a 10million+ row table and I've only got a couple hundred megs
> left. I can't delete any rows, pg runs out of disk space and crashes.

What is running out of disk space, exactly?

If the problem is WAL log growth, an update to 7.1.3 might help (... you didn't say which version you're using).

If the problem is lack of space for the pg_dump output file, I think you have little choice except to arrange for the dump to go to another device (maybe dump it across NFS, or to a tape, or something).

			regards, tom lane
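As a concrete sketch of the "another device" route, assuming the larger machine exports a directory over NFS (the hostname, export path, and mount point here are made up for illustration):

    # mount a share exported from the larger machine, then dump onto it
    mount -t nfs big-machine:/export/dumps /mnt/dumps
    pg_dump -F c -Z 9 -t syslog syslog > /mnt/dumps/syslog.dump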
Have you tried dumping individual tables separately until it's all done?

I've never used the -Z option, so I can't compare its compression to piping a pg_dump through gzip. However, this is how I've been doing it:

    pg_dump db_name | gzip -c > db_name.gz

I have a 2.2 GB database that gets dumped/compressed to a 235 MB file.

Andrew

--- David Ford <david@blue-labs.org> wrote:
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left. I can't delete any rows, pg runs out of disk space and crashes.
> I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes. My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?
>
> I'll be happy to provide more information if desired.
>
> David
> Have you tried dumping individual tables separately until it's all done?
>
> I've never used the -Z option, so I can't compare its compression to
> piping a pg_dump through gzip. However, this is how I've been doing it:
>
>     pg_dump db_name | gzip -c > db_name.gz
>
> I have a 2.2 GB database that gets dumped/compressed to a 235 MB file.
>
> Andrew

Another idea which you might try is to run pg_dumpall from a different host (with ample space) using the -h and -U options.

HTH,

Joe

Usage:
  pg_dumpall [ options... ]

Options:
  -c, --clean              Clean (drop) schema prior to create
  -g, --globals-only       Only dump global objects, no databases
  -h, --host=HOSTNAME      Server host name
  -p, --port=PORT          Server port number
  -U, --username=NAME      Connect as specified database user
  -W, --password           Force password prompts (should happen automatically)

Any extra options will be passed to pg_dump. The dump will be written to the standard output.
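A rough sketch of that approach, run from the machine that has the free space (hostname and user name below are placeholders):

    # dump everything over the network from the box with room to spare
    pg_dumpall -h old-server -U postgres > everything.sql

    # or just the one table, still connecting remotely
    pg_dump -h old-server -U postgres -t syslog syslog | gzip -c > syslog.sql.gz

Either way the data crosses the wire as query results, so nothing extra is written on the full disk.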
There is no way to add a temporary hard drive, or mount another drive on another machine, and then dump to that?

David Ford wrote:
>
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left. I can't delete any rows, pg runs out of disk space and crashes.
> I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes. My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?
>
> I'll be happy to provide more information if desired.
>
> David
$ postgres --version
postgres (PostgreSQL) 7.1beta5

1) If I run pg_dump, it runs for about 20 minutes then aborts abruptly with an out-of-memory error; pg_dump is killed by the kernel and postgres spews pipe errors until it reaches the end of the table or I kill it. It starts with ~100 megs of regular RAM free and has 300 megs of swap.

2) If I try to do a 'delete from ...' query, it runs for about 20 minutes and all of a sudden has 4 megs of disk space free and pg dies. It starts with ~500 megs of disk space free.

So in either situation I'm kind of screwed. The new machine is running 7.2devel, so I doubt I could copy the data directory. My WAL logs setting is 8, and 8*16 is 128 megs, no?

Tom Lane wrote:

> David Ford <david@blue-labs.org> writes:
>
>> I have a 10million+ row table and I've only got a couple hundred megs
>> left. I can't delete any rows, pg runs out of disk space and crashes.
>
> What is running out of disk space, exactly?
>
> If the problem is WAL log growth, an update to 7.1.3 might help
> (... you didn't say which version you're using).
>
> If the problem is lack of space for the pg_dump output file, I think you
> have little choice except to arrange for the dump to go to another
> device (maybe dump it across NFS, or to a tape, or something).
>
>			regards, tom lane
David Ford <david@blue-labs.org> writes:
> $ postgres --version
> postgres (PostgreSQL) 7.1beta5

You're still running beta5? (bites tongue ...)

Update to 7.1.3. AFAIR you should be able to do that without initdb. That will fix your WAL growth problems and allow you to do the large DELETE you wanted.

> 1) If I run pg_dump, it runs for about 20 minutes then aborts abruptly
> with an out-of-memory error; pg_dump is killed by the kernel and postgres
> spews pipe errors until it reaches the end of the table or I kill it.

Are you trying to run pg_dump with the -d or -D switch? If so, try it without.

			regards, tom lane
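For context, -d and -D make pg_dump emit the data as individual INSERT statements rather than COPY, which is both bulkier on disk and, as noted later in this thread, memory-hungry in the pg_dump of that era. A sketch of the same dump without those switches, streamed off the box (hostname is a placeholder):

    # COPY-based custom-format dump, sent straight to the machine with free space
    pg_dump -F c -Z 9 -t syslog syslog | ssh big-machine "cat > syslog.dump"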
That's what my original intent was :( I had b5 on another machine and it didn't want to upgrade cleanly, thus my attempt to dump/restore.

I'm going to try the pg_dump from another machine where I added a gig of swap. It has a base of 256M, so I'm hoping for success there.

David

Tom Lane wrote:

> David Ford <david@blue-labs.org> writes:
>
>> $ postgres --version
>> postgres (PostgreSQL) 7.1beta5
>
> You're still running beta5? (bites tongue ...)
>
> Update to 7.1.3. AFAIR you should be able to do that without initdb.
> That will fix your WAL growth problems and allow you to do the large
> DELETE you wanted.
>
>> 1) If I run pg_dump, it runs for about 20 minutes then aborts abruptly
>> with an out-of-memory error; pg_dump is killed by the kernel and postgres
>> spews pipe errors until it reaches the end of the table or I kill it.
>
> Are you trying to run pg_dump with -d or -D switch? If so, try it without.
>
>			regards, tom lane
David Ford <david@blue-labs.org> writes:
> I had b5 on another machine and it didn't want to upgrade cleanly, thus
> my attempt to dump/restore.

Now that I think about it, I believe you need to run the contrib/pg_resetxlog utility to update from 7.1beta5 to final. If you do that (read its README first!) you should be able to do the update.

			regards, tom lane
I think I've been successful here :)

I ran pg_dump from the new machine and it doesn't suffer the memory blowup the old one does. I gleaned 1.7G of data from it and am doing the inserts now.

_Thank you_ to everyone for their suggestions; this data is really important. I should have tried the new pg_dump in the first place.

David

Tom Lane wrote:

> David Ford <david@blue-labs.org> writes:
>
>> I had b5 on another machine and it didn't want to upgrade cleanly, thus
>> my attempt to dump/restore.
>
> Now that I think about it, I believe you need to run the
> contrib/pg_resetxlog utility to update from 7.1beta5 to final. If you
> do that (read its README first!) you should be able to do the update.
>
>			regards, tom lane
david@blue-labs.org (David Ford) wrote in message news:<3B993392.1000809@blue-labs.org>...
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left. I can't delete any rows, pg runs out of disk space and crashes.
> I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes. My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?

Do you have ssh available on your computer? Is an sshd daemon running on the other computer? Then try this:

    pg_dump mydatabase | ssh othersystem.com dd of=/home/me/database.dump

The output of pg_dump on your computer will end up on the other computer in /home/me/database.dump.

You could even do:

    pg_dump mydatabase | gzip -c | ssh othersystem.com 'gunzip -c | psql mydatabase'

This runs the database dump through gzip and pipes it to ssh, which pipes it through gunzip, then psql. Obviously, you'll need to "createdb mydatabase" on "othersystem.com" before running the above line.

I tried this just now, and it works beautifully.

If you're doing it across a LAN, you can dispense with the gzip/gunzip bit - you'll lose more bandwidth to CPU usage than you'll gain from the compression (use compression when bandwidth is really limited).

Calvin

p.s. this can also be done with rsh (remote shell) and the corresponding rsh server if you don't have ssh - but you really _should_ be using ssh.
> Do you have ssh available on your computer? Is an sshd daemon running
> on the other computer?
>
> Then try this:
>
>     pg_dump mydatabase | ssh othersystem.com dd of=/home/me/database.dump
>
> The output of pg_dump on your computer will end up on the other
> computer in /home/me/database.dump.

The problem with that was that 7.1b had some broken stuff: psql and pg_dump ate huge amounts of memory while storing the data and were eventually killed by the OOM handler. They never got to the point of dumping the data.

The solution was to start pg_dump from the new box and connect to the old server; pg_dump was fixed in that one. That worked just fine. Thank you for the suggestion.

On a side note (Tom, Bruce, etc), is there some way to mitigate psql's storage of all returned rows in memory? Perhaps a 'swap' file? If you connect to a 1.7G database and issue a query on it that returns a lot of rows, the entire thing is held in memory, which with such a query is likely to cause an OOM and get killed.

David
On Sat, Sep 08, 2001 at 01:59:54AM -0400, David Ford wrote:
> On a side note (Tom, Bruce, etc), is there some way to mitigate psql's
> storage of all returned rows in memory? Perhaps a 'swap' file? If you
> connect to a 1.7G database and issue a query on it that returns a lot of
> rows, the entire thing is held in memory, which with such a query is
> likely to cause an OOM and get killed.

psql does it because that's what the library does. I'm not sure if you can get the library to return before all the rows have been transferred, and if you can, someone needs to fix psql to use it.

pg_dump -d used to have this problem also, but I recently submitted a patch so that it doesn't happen anymore.

I guess you could arrange for psql to rewrite your SELECT statements to use a cursor, but that may be unexpected...

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Magnetism, electricity and motion are like a three-for-two special offer:
> if you have two of them, the third one comes free.
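For anyone hitting the same wall, a sketch of doing the cursor by hand inside psql, which fetches the result in batches instead of holding the whole thing in client memory (the table name and batch size are just examples):

    BEGIN;
    DECLARE bigscan CURSOR FOR SELECT * FROM syslog;
    FETCH 1000 FROM bigscan;   -- repeat until no rows come back
    FETCH 1000 FROM bigscan;
    CLOSE bigscan;
    COMMIT;

Only one batch at a time is held by psql, so memory stays flat no matter how big the table is.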
On 7 Sep 2001, Calvin Dodge wrote:

> You could even do:
>
>     pg_dump mydatabase | gzip -c | ssh othersystem.com 'gunzip -c | psql mydatabase'

This is getting off topic, but I just thought I'd mention that

    pg_dump mydatabase | ssh -C othersystem.com 'psql mydatabase'

does effectively the same thing as the above and saves you process creation overhead on both systems.

--
Tod McQuillin