Thread: Problem w/ dumping huge table and no disk space

Problem w/ dumping huge table and no disk space

From: David Ford

Help if you would please :)

I have a 10-million-plus row table and I've only got a couple hundred megs
of disk left.  I can't delete any rows; pg runs out of disk space and crashes.
I can't pg_dump with compression either: the output file gets started, holds the
schema and a bit of other info (about 650 bytes), then runs for 30
minutes and pg runs out of disk space and crashes.  My pg_dump command is:
"pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".

I want to dump this database (entire pgsql dir is just over two gigs)
and put it on another larger machine.

I can't afford to lose this information, are there any helpful hints?

I'll be happy to provide more information if desired.

David



Re: Problem w/ dumping huge table and no disk space

From: Alvaro Herrera

On Fri, 7 Sep 2001, David Ford wrote:

> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left.  I can't delete any rows, pg runs out of disk space and crashes.
>  I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes.  My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".

Try piping the output through ssh or something similar. You don't have to
keep it on the local machine.

From the bigger machine, something like

ssh server-with-data "pg_dump <options>" > syslog-dump

or from the smaller machine,
pg_dump <options> | ssh big-machine "cat > syslog-dump"

should do the trick. Maybe you can even pipe the output directly into
psql or pg_restore. Make sure pg_dump writes its output to stdout.
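
For example, something along these lines (adjust the host name to taste;
"big-machine" is just a placeholder) sends a compressed custom-format dump
straight over the wire without touching the local disk:

pg_dump -F c -Z 9 -t syslog syslog | ssh big-machine "cat > syslog.dump"

and it can later be restored on the big machine with
"pg_restore -d syslog syslog.dump".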

HTH.

--
Alvaro Herrera (<alvherre@atentus.com>)


Re: Problem w/ dumping huge table and no disk space

From: Tom Lane

David Ford <david@blue-labs.org> writes:
> I have a 10million+ row table and I've only got a couple hundred megs
> left.  I can't delete any rows, pg runs out of disk space and crashes.

What is running out of disk space, exactly?

If the problem is WAL log growth, an update to 7.1.3 might help
(... you didn't say which version you're using).

If the problem is lack of space for the pg_dump output file, I think you
have little choice except to arrange for the dump to go to another
device (maybe dump it across NFS, or to a tape, or something).
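
For instance (the paths here are only placeholders for whatever NFS mount or
tape device you actually have), either of

pg_dump -F c -t syslog syslog > /mnt/bigbox/syslog.dump
pg_dump -F c -t syslog syslog | dd of=/dev/st0 bs=32k

keeps the dump output off the local filesystem.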

            regards, tom lane

Re: Problem w/ dumping huge table and no disk space

From: Andrew Gould

Have you tried dumping individual tables separately
until it's all done?

I've never used the -Z option, so I can't compare its
compression to piping a pg_dump through gzip.
However, this is how I've been doing it:

pg_dump db_name | gzip -c > db_name.gz

I have a 2.2 GB database that gets dumped/compressed
to a 235 MB file.
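
To load it somewhere else later, something like

gunzip -c db_name.gz | psql db_name

(with the database already created on the target machine) should work.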

Andrew

--- David Ford <david@blue-labs.org> wrote:
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left.  I can't delete any rows, pg runs out of disk space and crashes.
> I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes.  My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?
>
> I'll be happy to provide more information if desired.
>
> David



Re: Problem w/ dumping huge table and no disk space

From: "Joe Conway"

> Have you tried dumping individual tables separately
> until it's all done?
>
> I've never used to -Z option, so I can't compare its
> compression to piping a pg_dump through gzip.
> However, this is how I've been doing it:
>
> pg_dump db_name | gzip -c > db_name.gz
>
> I have a 2.2 Gb database that gets dumped/compressed
> to a 235 Mb file.
>
> Andrew

Another idea you might try is running pg_dumpall from a different host
(with ample space), using the -h and -U options.
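
For example, run from the machine that has the space (host and user names
here are just placeholders):

pg_dumpall -h old-server -U postgres > everything.sql

so the dump lands on the big machine instead of the crowded one.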

HTH,

Joe

Usage:
  pg_dumpall [ options... ]

Options:
  -c, --clean            Clean (drop) schema prior to create
  -g, --globals-only     Only dump global objects, no databases
  -h, --host=HOSTNAME    Server host name
  -p, --port=PORT        Server port number
  -U, --username=NAME    Connect as specified database user
  -W, --password         Force password prompts (should happen automatically)
Any extra options will be passed to pg_dump.  The dump will be written
to the standard output.



Re: Problem w/ dumping huge table and no disk space

From: Brett Schwarz

Is there no way to add a temporary hard drive, or mount a drive from
another machine, and then dump to that?

David Ford wrote:
>
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left.  I can't delete any rows, pg runs out of disk space and crashes.
>  I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes.  My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?
>
> I'll be happy to provide more information if desired.
>
> David
>



Re: Problem w/ dumping huge table and no disk space

From: David Ford

$ postgres --version
postgres (PostgreSQL) 7.1beta5

1) If I run pg_dump, it runs for about 20 minutes, then aborts abruptly with
an out-of-memory error; pg_dump is killed by the kernel and postgres spews
pipe errors until it reaches the end of the table or I kill it.  It
starts with ~100 megs of regular RAM free and has 300 megs of swap.

2) If I try to do a 'delete from ...' query, it runs for about 20
minutes, then all of a sudden there are only 4 megs of disk space free and
pg dies.  It starts with ~500 megs of disk space free.

So in either situation I'm kind of screwed.  The new machine is running
7.2devel, so I doubt I could just copy the data directory over.

My WAL files setting is 8; 8*16 MB is 128 megs, no?

Tom Lane wrote:

>David Ford <david@blue-labs.org> writes:
>
>>I have a 10million+ row table and I've only got a couple hundred megs
>>left.  I can't delete any rows, pg runs out of disk space and crashes.
>>
>
>What is running out of disk space, exactly?
>
>If the problem is WAL log growth, an update to 7.1.3 might help
>(... you didn't say which version you're using).
>
>If the problem is lack of space for the pg_dump output file, I think you
>have little choice except to arrange for the dump to go to another
>device (maybe dump it across NFS, or to a tape, or something).
>
>            regards, tom lane
>



Re: Problem w/ dumping huge table and no disk space

From: Tom Lane

David Ford <david@blue-labs.org> writes:
> $ postgres --version
> postgres (PostgreSQL) 7.1beta5

You're still running beta5?  (bites tongue ...)

Update to 7.1.3.  AFAIR you should be able to do that without initdb.
That will fix your WAL growth problems and allow you to do the large
DELETE you wanted.

> 1) If I run pg_dump, it runs for about 20 minutes the aborts abruptly w/
> out of memory err, pg_dump is killed by the kernel and postgres spews
> pipe errors until it reaches the end of the table or I kill it.

Are you trying to run pg_dump with the -d or -D switch?  If so, try it without.
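
Without -d the table data goes out in COPY format instead of one INSERT per
row, i.e. something like

pg_dump -F c -Z 9 -t syslog syslog > syslog.dump

(or pipe it over ssh as suggested elsewhere in the thread so it never hits
the local disk).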

            regards, tom lane

Re: Problem w/ dumping huge table and no disk space

From: David Ford

That was my original intent :(

I had b5 on another machine and it didn't want to upgrade cleanly, thus
my attempt to dump/restore.

I'm going to try running pg_dump from another machine where I added a gig of
swap.  It has a base of 256M of RAM, so I'm hoping for success there.

David

Tom Lane wrote:

>David Ford <david@blue-labs.org> writes:
>
>>$ postgres --version
>>postgres (PostgreSQL) 7.1beta5
>>
>
>You're still running beta5?  (bites tongue ...)
>
>Update to 7.1.3.  AFAIR you should be able to do that without initdb.
>That will fix your WAL growth problems and allow you to do the large
>DELETE you wanted.
>
>>1) If I run pg_dump, it runs for about 20 minutes the aborts abruptly w/
>>out of memory err, pg_dump is killed by the kernel and postgres spews
>>pipe errors until it reaches the end of the table or I kill it.
>>
>
>Are you trying to run pg_dump with -d or -D switch?  If so, try it without.
>
>            regards, tom lane
>



Re: Problem w/ dumping huge table and no disk space

From: Tom Lane

David Ford <david@blue-labs.org> writes:
> I had b5 on another machine and it didn't want to upgrade cleanly, thus
> my attempt to dump/restore.

Now that I think about it, I believe you need to run the
contrib/pg_resetxlog utility to update from 7.1beta5 to final.  If you
do that (read its README first!) you should be able to do the update.
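
Roughly (check the README for the exact invocation in your source tree), the
sequence would be to stop the postmaster, build the contrib utility, and
point it at your data directory:

pg_ctl stop
cd contrib/pg_resetxlog && make
./pg_resetxlog /path/to/pgsql/data

then install the 7.1.3 binaries and start the postmaster again.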

            regards, tom lane

Re: Problem w/ dumping huge table and no disk space

From: David Ford

I think I've been successful here :)

I ran pg_dump from the new machine and it doesn't suffer the memory
blowup the old one does.  I gleaned 1.7G of data from it and am doing the
inserts now.

_Thank you_ to everyone for their suggestions; this data is really
important.  I should have tried the new pg_dump in the first place.

David

Tom Lane wrote:

>David Ford <david@blue-labs.org> writes:
>
>>I had b5 on another machine and it didn't want to upgrade cleanly, thus
>>my attempt to dump/restore.
>>
>
>Now that I think about it, I believe you need to run the
>contrib/pg_resetxlog utility to update from 7.1beta5 to final.  If you
>do that (read its README first!) you should be able to do the update.
>
>            regards, tom lane
>



Re: Problem w/ dumping huge table and no disk space

From: caldodge@fpcc.net (Calvin Dodge)

david@blue-labs.org (David Ford) wrote in message news:<3B993392.1000809@blue-labs.org>...
> Help if you would please :)
>
> I have a 10million+ row table and I've only got a couple hundred megs
> left.  I can't delete any rows, pg runs out of disk space and crashes.
>  I can't pg_dump w/ compressed, the output file is started, has the
> schema and a bit other info comprising about 650 bytes, runs for 30
> minutes and pg runs out of disk space and crashes.  My pg_dump cmd is:
> "pg_dump -d -f syslog.tar.gz -F c -t syslog -Z 9 syslog".
>
> I want to dump this database (entire pgsql dir is just over two gigs)
> and put it on another larger machine.
>
> I can't afford to lose this information, are there any helpful hints?

Do you have ssh available on your computer?  Is an sshd daemon running
on the other computer?

Then try this:

pg_dump mydatabase|ssh othersystem.com dd of=/home/me/database.dump

The output of pg_dump on your computer will end up on the other
computer in /home/me/database.dump.

You could even do:

pg_dump mydatabase|gzip -c|ssh othersystem.com 'gunzip -c |psql mydatabase'

This runs the database dump through gzip, pipes it to ssh - which
pipes it through gunzip, then psql. Obviously, you'll need to
"createdb mydatabase" on "othersystem.com" before running the above
line.

I tried this just now, and it works beautifully.

If you're doing it across a LAN, you can dispense with the gzip/gunzip
bit - you'll lose more throughput to CPU overhead than you'll gain from
the compression (use compression when bandwidth is really limited).

Calvin
p.s. this can also be done with rsh (remote shell) and the
corresponding rsh server if you don't have ssh - but you really
_should_ be using ssh.

Re: Problem w/ dumping huge table and no disk space

From: David Ford

>
>
>Do you have ssh available on your computer?  Is an sshd daemon running
>on the other computer?
>
>Then try this:
>
>pg_dump mydatabase|ssh othersystem.com dd of=/home/me/database.dump
>
>The output of pg_dump on your computer will end up on the other
>computer in /home/me/database.dump.
>

The problem with that was that 7.1 beta had some broken stuff.  psql
and pg_dump ate huge amounts of memory while storing the data and were
eventually killed by the OOM handler.  They never got to the point of
dumping the data.  The solution was to start pg_dump from the new box
and connect to the old server; pg_dump is fixed in that version.  That
worked just fine.
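
In other words, run from the new box something like

pg_dump -h old-server -t syslog syslog > syslog.dump

("old-server" being the crowded machine), so the newer, fixed pg_dump does
the work and the old server only has to feed it rows.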

Thank you for the suggestion.

On a side note (Tom, Bruce, etc.), is there some way to mitigate psql's
habit of storing all returned rows in memory?  Perhaps a 'swap' file?  If
you connect to a 1.7G database and issue a query that returns a lot of
rows, the entire result set is held in memory, which for such a query is
likely to cause an OOM kill.

David



Re: Problem w/ dumping huge table and no disk space

From: Martijn van Oosterhout

On Sat, Sep 08, 2001 at 01:59:54AM -0400, David Ford wrote:
> On a side note (Tom, Bruce, etc), is there some way to mitigate psql's
> storage of all rows returned in memory?  Perhaps a 'swap' file?  If you
> connect to a 1.7G database and issue a query on it that returns a lot of
> rows, the entire thing is held in memory which with such a query is
> likely to cause an OOM and get killed.

psql does it because that's what the library does. I'm not sure whether you
can get the library to return before all the rows have been transferred, and
if you can, someone needs to fix psql to use it.

pg_dump -d used to have this problem too, but I recently submitted a patch so
that it doesn't happen anymore. I guess you could arrange for psql to rewrite
your SELECT statements to use a cursor, but that may be unexpected...
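
If you want to work around it by hand today, a cursor inside a transaction
already gives you chunked fetching, something like:

BEGIN;
DECLARE big CURSOR FOR SELECT * FROM syslog;
FETCH 1000 FROM big;   -- repeat until no rows come back
CLOSE big;
COMMIT;

so only one batch at a time is held on the client side.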

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> Magnetism, electricity and motion are like a three-for-two special offer:
> if you have two of them, the third one comes free.

Re: Problem w/ dumping huge table and no disk space

From: Tod McQuillin

On 7 Sep 2001, Calvin Dodge wrote:

> You could even do:
>
> pg_dump mydatabase|gzip -c|ssh othersystem.com 'gunzip -c |psql mydatabase'

This is getting off topic, but I just thought I'd mention that

pg_dump mydatabase | ssh -C othersystem.com 'psql mydatabase'

does effectively the same thing as the above and saves you process
creation overhead on both systems.
--
Tod McQuillin