Thread: Backup is too slow
Hi all,

I'm a bit unhappy with the time it takes to back up my PG 7.4.6 database. I have 13GB under the pg/data dir and the backup takes 30 minutes. Using top and iostat I've figured out that the backup job is CPU bound in the postmaster process: it eats up 95% CPU while the disk is at 10% load. In fact, I'm able to compress the backup file (using gzip, at 35% CPU load) faster than the backend can deliver it.

The operating requirement is 24/7, so I can't just take the database offline and do a file copy (I can do a backup that way in 5-6 minutes, BTW). Would it speed up the process if I did a binary backup instead? Are there any other fun tricks to speed things up? I run on a four-way Linux box, and it's not in production yet, so there is no CPU shortage.

The backup script is:

    #! /bin/sh
    if test $# -lt 2; then
        echo "Usage: dbbackup <basename> <filename>"
    else
        /home/postgres/postgresql/bin/pg_dump -h <hostname> $1 | gzip -f - | split --bytes 500m - $2.
    fi

And the restore script:

    #! /bin/sh
    if test $# -lt 2; then
        echo "Usage: dbrestore <basename> <filename>"
    else
        cat $2.* | gzip -d -f - | /home/postgres/postgresql/bin/psql -h <hostname> -f - $1
    fi

Cheers,
John
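[A quick way to confirm where the cycles go is to time the dump with the compression and split stages removed entirely. This is only a sketch: it assumes the same paths as the scripts above, and "mydb" is a placeholder database name.]

    #! /bin/sh
    # Dump straight to /dev/null so gzip and split are out of the picture.
    # If the postmaster still pegs one CPU, the bottleneck is the backend
    # itself, not the pipeline behind it.
    time /home/postgres/postgresql/bin/pg_dump -h <hostname> mydb > /dev/null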
The CPU may be throttled because it's performing the backup, gzip and split all at once. May I suggest this:

    /home/postgres/postgresql/bin/pg_dump -h <hostname> --compress=9 -f dumpfile.gz $1
    split --bytes 500m dumpfile.gz dumpfile.gz.

If that takes too long or clobbers the system:

    /home/postgres/postgresql/bin/pg_dump -h <hostname> -f dumpfile $1
    gzip -9 dumpfile
    split --bytes 500m dumpfile.gz dumpfile.gz.

Another variation would be the same as above, except scp/rcp/ftp the uncompressed dump to another idle server that performs the compress and split for you.

One last way is to take a filesystem snapshot, if your filesystem permits it. Since postgres stops/starts so nicely, we offline ours when it's idle, just long enough to execute the filesystem snapshot, then bring it back online immediately after. I suppose you could, in theory, wait till idle, request a lock on all necessary tables, perform a checkpoint, take the filesystem snapshot, then release the locks. I'm sure Tom, Josh or someone more in the know would have input on this option.

Greg
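[The remote-compression variation Greg describes, sketched under assumptions: "idle-server", the /backup path, and the use of scp/ssh are all placeholders; any transfer mechanism would do.]

    #! /bin/sh
    # Hypothetical: dump locally without compression, then let an idle
    # server burn the CPU on compress and split.
    /home/postgres/postgresql/bin/pg_dump -h <hostname> -f /tmp/dumpfile $1
    scp /tmp/dumpfile idle-server:/backup/dumpfile
    ssh idle-server "gzip -9 /backup/dumpfile && split --bytes 500m /backup/dumpfile.gz /backup/dumpfile.gz."
    rm /tmp/dumpfile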
Hi Greg & others.

I run this on a 4-CPU SMP box (Dell PE6650 + EMC AX100), so I already offload pg_dump, gzip and split to other CPUs. Top confirms this: postmaster = 95% CPU, i.e. it uses one CPU completely. Unless I can get the postmaster to do less work (that's what I'm looking for) or run multiple threads (not likely), that's about the best I can get.

The job is clearly CPU bound in the postmaster process.

I'm a bit reluctant to go into the snapshot option you outline. It looks a bit tricky, but if no other options are at hand then I'll have to bite the bullet.

/John

>>> "Spiegelberg, Greg" <gspiegelberg@cranel.com> 07-12-2004 14:33:38 >>>
The CPU may be throttled because it's performing the backup, gzip and split all at once.
< stuff deleted >
John Jensen wrote:
> I run this on a 4-CPU SMP box (Dell PE6650 + EMC AX100), so I already
> offload pg_dump, gzip and split to other CPUs. Top confirms this:
> postmaster = 95% CPU, i.e. it uses one CPU completely. Unless I can get
> the postmaster to do less work (that's what I'm looking for) or run
> multiple threads (not likely), that's about the best I can get.
>
> The job is clearly CPU bound in the postmaster process.

Hmmm, when I upgraded my Opteron box to 64-bit Linux, my dump->gzip ran twice as fast, which told me gzip was a big part of the CPU usage. Dunno what else you can do to make it run faster. My backups -- even on 64-bit -- still take 20 minutes on a 30GB DB.

> I'm a bit reluctant to go into the snapshot option you outline. It
> looks a bit tricky, but if no other options are at hand then I'll have
> to bite the bullet.

The snapshot is much easier if you use LVM. No need to do any postgres trickery. Just freeze the volume at the kernel level.
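[A sketch of the LVM route. The volume group "vg0", logical volume "pgdata", snapshot size, mount point, and backup path are all placeholders, and this assumes the entire cluster -- data and WAL -- lives on that one volume.]

    #! /bin/sh
    # Hypothetical LVM snapshot backup. The snapshot is a crash-consistent,
    # point-in-time image: a copy restored from it starts up as if the
    # server had crashed, and postgres recovers from WAL on its own.
    lvcreate --snapshot --size 2G --name pgsnap /dev/vg0/pgdata
    mkdir -p /mnt/pgsnap
    mount -o ro /dev/vg0/pgsnap /mnt/pgsnap
    tar czf /backup/pgdata-snapshot.tar.gz -C /mnt/pgsnap .
    umount /mnt/pgsnap
    lvremove -f /dev/vg0/pgsnap

The actual copy then runs from the snapshot at whatever pace the hardware allows, while the live volume keeps serving queries.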
"John Jensen" <JRJ@ft.fo> writes: > The job is clearly cpu bound in the postmaster process. Which part of the dump process is CPU bound --- dumping schema, or data? (Try enabling log_statement for the pg_dump run and correlating the appearance of queries in the postmaster log with the CPU usage.) If it's schema-bound, maybe you need to vacuum/analyze your system catalogs a bit more aggressively. If it's data-bound, I'm not sure what you can do other than switch to datatypes that are cheaper to convert to text form. It would be interesting to find out where the problem is, though, in case there's something we can fix for future releases. regards, tom lane