Thread: Backup and Restore mechanism in Postgres
Hi,

I have a 4-processor AMD Opteron machine with 32 GB RAM, ~400 GB of HDD, and a 40 GB database. I need to take a backup of this database and restore it to some other location (say, a test environment). I am currently using the pg_dump and pg_restore utilities, which take 4-5 hrs for a dump and 8-9 hrs for a restore respectively. I am using the custom format for dumps. I want this time reduced as much as possible, since 8-9 hrs for a restore is definitely not desirable and I might have to do this 3-4 times a day. I have already tuned Postgres to set maintenance_work_mem = 1000000 (I guess other settings do not affect db backup and restore).

One thing I tried was to take a backup of the complete PGDATA directory and untar it to a separate location. The problem comes with tablespaces, which will still point to the same location as the database the dump was taken from. I tried modifying a tablespace of the restored db to point it somewhere else, but that seems to change it for the original db as well, so this won't work.

Note: I need to take into account tablespaces that can be distributed across different partitions. I cannot modify tablespaces in the database from which a dump is generated to set them to the default tablespace, copy the PGDATA dir, and work from there (as a requirement, the original db must not be modified in any way).

Is there some other way to get fast db backups and restores? The major bottleneck here is CPU usage: it hits 100% on one of the processors. Is there a way to distribute the dump and restore across all the processors?

I also looked at incremental backups but could not see anything on how to restore them to some other location. Everywhere, backup is combined with recovery rather than restore.

Regards,
Vinita Bansal
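A rough sketch of one way to attack the single-CPU bottleneck, assuming the handful of largest tables are known: each pg_dump/pg_restore process runs on a single CPU, so the big tables can be dumped and restored concurrently in separate processes, and the custom format's default compression (a large CPU cost) can be turned off with -Z 0. Table and database names here (bigtable1, bigtable2, proddb, testdb) are only placeholders:

    # Dump the largest tables concurrently, one process per table;
    # -Z 0 disables the custom format's default compression.
    pg_dump -Fc -Z 0 -t bigtable1 -f bigtable1.dump proddb &
    pg_dump -Fc -Z 0 -t bigtable2 -f bigtable2.dump proddb &
    wait

    # Schema plus all remaining tables in one pass
    # (-T excludes a table; check that your pg_dump version has it).
    pg_dump -Fc -T bigtable1 -T bigtable2 -f rest.dump proddb

    # Restore the bulk first, then the big tables concurrently.
    # Foreign keys between tables may force a particular order here.
    pg_restore -d testdb rest.dump
    pg_restore -d testdb bigtable1.dump &
    pg_restore -d testdb bigtable2.dump &
    wait

This does nothing for the tablespace question, and whether it actually beats a single dump depends on where the time is really going, but it does split the work across processors without touching the original database.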
On Sep 14, 2005, at 9:45 AM, vinita bansal wrote:

> I have a 4-processor AMD Opteron machine with 32 GB RAM, ~400 GB of
> HDD, and a 40 GB database. I need to take a backup of this database
> and restore it to some other location (say, a test environment). I am
> currently using the pg_dump and pg_restore utilities, which take 4-5
> hrs for a dump and 8-9 hrs for a restore respectively. I am using the
> custom format for dumps.

I'll bet you've saturated your disk I/O bandwidth, since for me dumping a db a bit larger than that takes roughly 1 hour, and restore about 4.

You could also investigate making a copy using a replication system like Slony (http://slony.info), then turning off the replication once the copy is made.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
At 10:00 AM 9/20/2005 -0400, Vivek Khera wrote:

> On Sep 14, 2005, at 9:45 AM, vinita bansal wrote:
>
>> I have a 4-processor AMD Opteron machine with 32 GB RAM, ~400 GB of
>> HDD, and a 40 GB database. I need to take a backup of this database
>> and restore it to some other location (say, a test environment). I am
>> currently using the pg_dump and pg_restore utilities, which take 4-5
>> hrs for a dump and 8-9 hrs for a restore respectively. I am using the
>> custom format for dumps.
>
> I'll bet you've saturated your disk I/O bandwidth, since for me
> dumping a db a bit larger than that takes roughly 1 hour, and restore
> about 4.

I don't think disk I/O is saturated, unless the database is very fragmented on disk. Most modern drives can manage at least 40 MB/sec sequential throughput, and even random seeks of 64 KB or 128 KB blocks should get you about 6-12 MB/sec, so 4 hours is quite slow. And 8-9 hours for a restore of 40 GB probably won't be very pleasant if you have a boss or customer breathing down your neck...

Any reason why PostgreSQL would only get 2.8 MB/sec or slower for dumps?

Regards,
Link.
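For what it's worth, 40 GB in 4-5 hours works out to roughly 2.3-2.8 MB/sec. A couple of quick checks, sketched below with placeholder device and path names, would show whether that time is going to the disks or to a single CPU:

    # Baseline sequential read speed of the data disk (placeholder device).
    dd if=/dev/sda of=/dev/null bs=1M count=4096

    # While pg_dump is running, watch per-device utilisation and CPU use.
    iostat -x 5
    vmstat 5

If the disks sit mostly idle while one CPU is pegged at 100% (as the original post suggests), the custom format's compression is a more likely culprit than raw I/O.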
On Tue, 20 Sep 2005, Lincoln Yeoh wrote:

> At 10:00 AM 9/20/2005 -0400, Vivek Khera wrote:
>
>> On Sep 14, 2005, at 9:45 AM, vinita bansal wrote:
>>
>>> I have a 4-processor AMD Opteron machine with 32 GB RAM and ~400 GB HDD

Just curious, whatever came of this?

Also, were you reading the DB and writing the dump file from the same file system? Different partitions on the same disk? The 400 GB is presumably RAID 1+0 or RAID 5? If that's the case, then I would highly recommend having a separate physical drive/file system to write backups to (and that is where your actual backup software should be pointed). Maybe even put it on a different channel on the same controller, or on a different controller altogether.

Obviously, the biggest thing missing from a lot of PostgreSQL documentation is practical information for large deployments, including strategy and system design requirements. All in due time, I suppose.

~BAS

l8* -lava x.25 - minix - bitnet - plan9 - 110 bps - ASR 33 - base8
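To illustrate the separate-drive suggestion, with a placeholder mount point and host name: the dump can be written to a filesystem on different spindles, or streamed off the machine entirely so the database disks only see read traffic.

    # Write the dump to a filesystem on a separate physical drive.
    pg_dump -Fc -f /mnt/backupdisk/proddb.dump proddb

    # Or stream it straight to the test box over the network.
    pg_dump -Fc proddb | ssh testhost 'cat > /data/dumps/proddb.dump'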