Thread: WAL archiving and backup TAR
Hello,
I'm implementing WAL archiving and PITR on my production DB.
I've set up my TAR, WAL archives and pg_xlog all to be stored on a separate disk from my DB.
I'm at the point where I'm running 'SELECT pg_start_backup('xxx');'.
Here's the command I've run for my tar:
time tar -czf /pbo/podbackuprecovery/tars/pod-backup-${CURRDATE}.tar.gz /pbo/pod > /pbo/podbackuprecovery/pitr_logs/backup-tar-log-${CURRDATE}.log 2>&1
The problem is that this tar took just over 25 hours to complete. I expected this to be a long process, since my DB is about 100 GB.
But 25 hours seems a bit too long. Does anyone have any ideas how to cut down on this time?
Are there limitations in tar or gzip related to the size I'm working with? Or perhaps, as a colleague suggested, tar/gzip is a single-threaded process and is bottlenecking on one CPU (we run multiple cores). When I run top, gzip is at about 12% CPU and tar at around 0.4%, which adds up to roughly 1/8 of the total CPU, i.e. numerically one full CPU on our server, since we have 8.
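As a rough sanity check on the figures above, the small sketch below works out the average throughput of the backup and the share of total CPU that one saturated core represents. The function names are made up for illustration, and it assumes 1 GB = 1024 MB; it is only arithmetic on the numbers in this post, not a measurement.

```shell
#!/bin/sh
# Average throughput in MB/s for SIZE_GB gigabytes copied in HOURS hours.
# Assumption: 1 GB = 1024 MB.
throughput_mb_s() {
    awk -v gb="$1" -v h="$2" 'BEGIN { printf "%.2f\n", gb * 1024 / (h * 3600) }'
}

# Percentage of total CPU that one fully busy core represents on N cores.
one_core_percent() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 / n }'
}

throughput_mb_s 100 25   # 100 GB in 25 hours -> prints 1.14
one_core_percent 8       # 8 cores            -> prints 12.5
```

The 12% + 0.4% reported by top is close to the 12.5% that a single fully busy core would show on an 8-core machine, which is consistent with the single-thread suspicion.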
After making the changes to the .conf file, I restarted my DB and allowed normal transactions while the tar/gzip runs.
Your help is very much appreciated.
--Dom Torrez
torrez wrote:
> The problem is that this tar took just over 25 hours to complete. I
> expected this to be a long process because since my DB is about 100
> gigs.
> But 25hrs seems a bit too long. Does anyone have any ideas how to cut
> down on this time?

Don't gzip it online?

--
Alvaro Herrera http://www.CommandPrompt.com/
On Friday, 19 June 2009, torrez wrote:
> time tar -czf /pbo/podbackuprecovery/tars/pod-backup-${CURRDATE}.tar.gz /pbo/pod > /pbo/podbackuprecovery/pitr_logs/backup-tar-log-${CURRDATE}.log 2>&1

If you have a multi-core/multi-CPU machine, try using pbzip2 (parallel bzip2), which can use all CPU cores at the same time for compression. The simplest approach might be:

tar cf backup.tar .....
(first the tar without compression, so that it finishes quickly)
pbzip2 backup.tar

Regards,
zmi
--
Michael Monnerie, Ing.BSc
http://it-management.at
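The two-step approach suggested in this reply (a plain tar first, then a separate compression pass) might be wrapped up roughly as follows. This is only a sketch: the function name archive_then_compress and its optional third argument are hypothetical, pbzip2 has to be installed separately, and any compressor that takes a file name and replaces the file in place (gzip, bzip2) can be substituted.

```shell
#!/bin/sh
# Archive SRC_DIR into TAR_FILE without compression, then compress the
# finished archive in a separate pass.
# The third argument is a hypothetical knob for this sketch; it defaults
# to pbzip2, which is not part of a standard install.
archive_then_compress() {
    src_dir=$1
    tar_file=$2
    compressor=${3:-pbzip2}

    # Step 1: plain tar (no -z), so the copy is not throttled by compression.
    tar -cf "$tar_file" "$src_dir" || return 1

    # Step 2: compress afterwards. pbzip2, like gzip and bzip2, replaces
    # the input file with a compressed one (backup.tar -> backup.tar.bz2).
    "$compressor" "$tar_file"
}
```

With the paths from the original post, a call could look like: archive_then_compress /pbo/pod /pbo/podbackuprecovery/tars/pod-backup-${CURRDATE}.tar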
On Fri, 19 Jun 2009 09:43:28 -0600 torrez <torrez@unavco.org> wrote:
> The problem is that this tar took just over 25 hours to complete. I
> expected this to be a long process because since my DB is about 100
> gigs.
> But 25hrs seems a bit too long. Does anyone have any ideas how to
> cut down on this time?

Transfer it first and compress later. We have a production DB of around 170 GB, and the backup takes around 2 hours to a Tivoli Storage Manager server via Ethernet (to an IBM tape library).

I would not prefer bzip over gzip, because it is less tested, and generally you don't want your backup archive to carry even the slightest hint of doubt. In a production environment maybe, but for backups never.

--
Jakov Sosic
On Tue, Jun 23, 2009 at 10:18:30PM +0200, Jakov Sosic wrote:
> Transfer it first and compress later. We have production db of around
> 170GB's and backup is around 2h to Tivoli Storage Manager server via
> ethernet (to IBM tape library).
>
> I would not prefer bzip over gzip, because it is less tested, and
> generaly you don't want your backup archive to have even minor sight of
> a possible doubt.... Production environment maybe, but backup never...

+1

The gzip step is holding up the copy the most. Another thing that might be worth trying is the "star" program. It can use a shared memory buffer to allow very rapid archiving.

Cheers,
Ken