Re: How to do fast, reliable backups? - Mailing list pgsql-admin
From | Silvana Di Martino |
---|---|
Subject | Re: How to do fast, reliable backups? |
Date | |
Msg-id | 200403061329.17799.silvanadimartino@tin.it Whole thread Raw |
In response to | How to do fast, reliable backups? (Chris Ruprecht <chris@ruprecht.org>) |
Responses |
Re: How to do fast, reliable backups?
Re: How to do fast, reliable backups? |
List | pgsql-admin |
Alle 21:37, venerdì 5 marzo 2004, Chris Ruprecht ha scritto: > I am wondering how you guys back up your databases. Say, I have a 20 GB > database, data and indexes. If I run pg_dump on this, it backs up the > schema and the data. When I have to restore this, I whould have to run this > through psql which would then re-build the indexes after it has inserted > the records. That's not really a feasable option as it takes too long. When you have large databases and you need consistent backups, an interesting solution is master-slave replication. Just use a second server of your LAN as a slave replica of your main database server. Should you have any problem with the main server, just switch the IP addresses of the master and slave servers and you are up and running again. The switching takes a few milliseconds and can even be performed automatically by a few specific programs (look for "fault-tolerant linux systems" with google). While your backup server serves your users, you have all the time to install a new hard disk into the main server, restore your data (from the running slave server or from a tape) and come back to service. When you need to put your data into a tape or a CD-ROM, just stop the master-slave mechanism, make a "static" backup (a pg_dump or a file system copy) of your _slave_ server and restart the replication program. The master-slave replication mechanism will take care to re-align the two servers in a few minutes. This way, you never need to stop the main server. > If I back up the pgdata directory, I will get inconsistent data, as > somebody might update data in 2 tables while my backup is running, one > change in a table I have already backed up, one in a table I have not. The > solution would be to shut the database server down, do the backup and then > start everything back up. But that's not really an option in a 24/7 > environment. Making a copy of your data directory while your server is running is definitely a bad idea. Forget it. Backing up the file system used by a RDBMS usually requires to put the server off-line and flush all the unwritten data to the disk. This cannot be done in a 24/7 environment. > What I'd like to see is a transaction aware backup which backs up indexes > as well as data. Have a look at one of the master-slave replication systems available on the net. For this task you do not need anything as sophisticated as a multi-master replication system (Like the Bettina Kemme "PG-Replicator" project). You could even be able to write down your own client program with Perl o Python. The program just have to connect to your running master server, read the relevant data and insert them into the database on the slave server. Using a SERIAL primary key it is quite easy to spot which data of a table have not been copied to the slave server yet. Using a timer, you can define the time granularity of the replication mechanism that best fit your need. Using transactions you can be sure to get just consistent data. Reading and saving system tables, you can backup even the most hidden and subtle things of your DB. Beware that restoring your data is up to you, if you use a custom-made system. Test the restore system before using your program in a production environment. Good luck. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
pgsql-admin by date: