Thread: Backup - filesystem snapshots
Hi,

I'm trying to implement a backup strategy for a research database, to protect against users accidentally dropping their data.

My preferred method would be to create regular snapshots of the data directory and then send these to the backup server using rsync, with hard-linking backup rotation. The backup data directories could then be examined using a postmaster running on the backup server, to extract any accidentally deleted tables.

My problem is how to take these snapshots: is it enough to create a hard link to the directory, or is there a risk that a currently running transaction might introduce inconsistencies? I guess I could use "pg_ctl stop -m smart" to stop the database after all clients have disconnected, but I sometimes have users who leave their clients connected all night. Is there any other way to suspend the postmaster so that it finishes its current transaction and queues any other transactions while the snapshot takes place? Any other ideas for how to create such snapshots?

Thanks,

Chris

--
Chris Jewell
Dept of Maths and Statistics
Fylde College
Lancaster University
Lancaster
Lancs
LA1 4YF
Chris,
I am having the same problem here, and I am considering using SVN for DB backup: every snapshot (however you take it; pg_dump seems fine) gets checked in on a regular schedule, so if a table is accidentally deleted it can still be checked out and restored. In case you don't know it, SVN (Subversion, http://subversion.tigris.org) is a version control system similar to CVS (the Concurrent Versions System).
Of course the SVN repository itself needs to be backed up somewhere.
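[A rough sketch of that idea as a cron-able script. The database name and working-copy path below are invented, and RUN defaults to echo so the script only prints the commands it would run; set RUN to empty to execute them for real.]

```shell
#!/bin/sh
# Dump the database as plain SQL into a Subversion working copy and
# commit it. DB and WC are invented example names; RUN=echo makes this
# a dry run that prints each command instead of executing it.
RUN=${RUN:-echo}

DB=researchdb
WC=/backups/svn-wc    # an existing checked-out Subversion working copy

$RUN pg_dump --format=plain --file="$WC/$DB.sql" "$DB"
$RUN svn add --quiet --force "$WC/$DB.sql"   # --force: no error if already versioned
$RUN svn commit "$WC" -m "scheduled dump of $DB"
```

[Plain-text dumps also keep the repository deltas small, since Subversion stores differences between revisions.]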
Best regards,
Dmitri
On Wed, 2006-01-18 at 20:11 +0000, Chris Jewell wrote:
> Hi, I'm trying to implement a backup strategy for a research database
> in order to prevent again users accidentally dropping their data.
>
> My preferred method would be to create regular snapshots of the data
> directory, and then send this to the backup server using rsync, with
> hard-linking backup rotation. The backup data directories could then
> be examined using a postmaster running on the backup server to extract
> any accidentally deleted tables.
>
> My problem is how to do these snapshots: is it enough to create a hard
> link to the directory, or is there still a risk that a currently
> running transaction might introduce inconsistencies? I guess I could
> use the pg_ctl -m 'Smart' command to stop the database after all
> clients have disconnected, but I sometimes have users leaving their
> clients connected all night. Is there any other way to suspend the
> postmaster such that it finishes its current transaction and queues
> any other transactions while the snapshot is taking place? Any other
> ideas of how I can create such snapshots?
>
> Thanks,
>
> Chris
c.jewell@lancaster.ac.uk (Chris Jewell) writes:
> I'm trying to implement a backup strategy for a research database in
> order to prevent again users accidentally dropping their data.
>
> My preferred method would be to create regular snapshots of the data
> directory, and then send this to the backup server using rsync, with
> hard-linking backup rotation. The backup data directories could then
> be examined using a postmaster running on the backup server to extract
> any accidentally deleted tables.
>
> My problem is how to do these snapshots: is it enough to create a hard
> link to the directory, or is there still a risk that a currently
> running transaction might introduce inconsistencies? I guess I could
> use the pg_ctl -m 'Smart' command to stop the database after all
> clients have disconnected, but I sometimes have users leaving their
> clients connected all night. Is there any other way to suspend the
> postmaster such that it finishes its current transaction and queues
> any other transactions while the snapshot is taking place? Any other
> ideas of how I can create such snapshots?

Short answer to "is there a risk?" is "You betcha!"

There's a fairly new feature called Point in Time Recovery that is specifically designed to address those risks:

<http://www.postgresql.org/docs/8.1/static/backup-online.html>

--
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://cbbrowne.com/info/advocacy.html
The English exam was a piece of cake---which was a bit of a surprise,
actually, because I was expecting some questions on a sheet of paper.
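[For the archives: the online-backup procedure described on that page boils down to archiving completed WAL segments plus taking a base backup while the server runs. A rough 8.1-era outline; the archive and data directory paths here are made up:]

```
# postgresql.conf -- archive each completed WAL segment
# (the cp example is taken from the 8.1 docs; the target path is hypothetical):
archive_command = 'cp -i %p /backups/wal/%f </dev/null'

-- then, to take a base backup without stopping the postmaster:
SELECT pg_start_backup('nightly');
-- ... tar up the data directory, e.g. tar -cf /backups/base.tar /var/lib/pgsql/data ...
SELECT pg_stop_backup();
```

[Recovery then replays the archived WAL over the base backup up to any chosen point in time, which is what protects against the mid-transaction inconsistency Chris asked about.]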
Hi,

Thanks for all the suggestions. One thing I had done, before we got the backup server, was to write a Perl script that parsed the log file looking for SQL commands that altered data (e.g. CREATE, ALTER, UPDATE, etc.); I had, of course, set client_min_messages = notice. Since I had confined my users to their own schemas, the script simply queries the database for the user list, checks whether each user has altered data, and runs pg_dump on their schema if they have. This actually worked reasonably well, but required a good deal of monitoring. I will go back to using this system if the WAL logging seems clunky.

However, the disadvantage is that you need to confine users to their own schemas. Also, the script was written pre-roles, so I'd probably have to modify it nowadays. If anyone is interested, I can post the Perl script.

Chris

PS: The backups are really only done to protect users from themselves, since we're running a RAID mirroring system - hence the PITR for a given user (as opposed to for the whole database).

Chris Jewell wrote:
> Hi,
>
> I'm trying to implement a backup strategy for a research database in
> order to prevent again users accidentally dropping their data.
>
> My preferred method would be to create regular snapshots of the data
> directory, and then send this to the backup server using rsync, with
> hard-linking backup rotation. The backup data directories could then
> be examined using a postmaster running on the backup server to extract
> any accidentally deleted tables.
>
> My problem is how to do these snapshots: is it enough to create a hard
> link to the directory, or is there still a risk that a currently
> running transaction might introduce inconsistencies? I guess I could
> use the pg_ctl -m 'Smart' command to stop the database after all
> clients have disconnected, but I sometimes have users leaving their
> clients connected all night. Is there any other way to suspend the
> postmaster such that it finishes its current transaction and queues
> any other transactions while the snapshot is taking place? Any other
> ideas of how I can create such snapshots?
>
> Thanks,
>
> Chris

--
Chris Jewell, BSc(Hons), BVSc, MRCVS
Dept of Maths and Statistics
Fylde College
Lancaster University
Lancaster
Lancs
LA1 4YF
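[If it's useful in the meantime, the per-schema dump step of Chris's approach (minus the log-parsing that decides which schemas changed) might be sketched like this. Everything here is invented for illustration - the database name, output directory, and schema names - and RUN defaults to echo so the script just prints the pg_dump commands it would run.]

```shell
#!/bin/sh
# Dump each user schema to its own file. RUN=echo makes this a dry run
# that prints commands instead of executing them.
RUN=${RUN:-echo}

# In a real run the schema list would come from the catalog, e.g.:
#   psql -At -d researchdb -c "SELECT nspname FROM pg_namespace
#       WHERE nspname NOT LIKE 'pg\\_%' AND nspname <> 'information_schema'"
dump_schemas() {
    db=$1; out=$2; shift 2
    for schema in "$@"; do
        $RUN pg_dump --schema="$schema" --file="$out/$schema.sql" "$db"
    done
}

# Invented example: two user schemas in a database called researchdb.
dump_schemas researchdb /backups/schemas alice bob
```

[Restoring one user's table then means picking out the right schema file, which matches the "PITR for a given user" goal without touching the whole database.]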