Re: backup-strategies for large databases - Mailing list pgsql-general

From Greg Smith
Subject Re: backup-strategies for large databases
Msg-id 4E4AE468.9040204@2ndQuadrant.com
In response to backup-strategies for large databases  (MirrorX <mirrorx@gmail.com>)
List pgsql-general
On 08/13/2011 05:44 PM, MirrorX wrote:
> at the moment, the copy of the PGDATA folder (excluding pg_xlog folder), the
> compression of it and the storing of it in a local storage disk takes about
> 60 hours while the file size is about 550 GB. the archives are kept in a
> different location so that not a problem. so, i dont want even to imagine
> how much time the uncompress and copy will take in 'disaster' scenario.
>

If you haven't actually run this test--confirmed that you can uncompress
the whole thing and get a working copy out of it again--I'd be concerned
that you haven't tested your backup procedure fully.  You can't really
tell whether a backup is good unless you restore it.  And that process
will give you a read on just how bad the recovery situation will look
if it comes to that one day.
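
A minimal sketch of such a restore test, assuming the backup is a single
compressed tar archive and that a scratch directory with enough free
space is available.  The function names and paths are hypothetical, and
the PG_VERSION check is only a first sanity test; a real test would also
start the server on the restored copy and let it replay the archived WAL.

```python
import tarfile
import time
from pathlib import Path


def timed_restore(archive: Path, scratch_dir: Path) -> float:
    """Unpack a compressed base backup into scratch_dir; return elapsed seconds."""
    scratch_dir.mkdir(parents=True, exist_ok=True)
    started = time.monotonic()
    # "r:*" lets tarfile detect the compression (gzip, bzip2, xz) itself.
    with tarfile.open(archive, mode="r:*") as tar:
        tar.extractall(path=scratch_dir)
    return time.monotonic() - started


def looks_like_pgdata(directory: Path) -> bool:
    """Cheap sanity check: a data directory has PG_VERSION at its top level."""
    return (directory / "PG_VERSION").is_file()
```

Timing this on hardware comparable to the production server gives a
rough lower bound on recovery time, before any WAL replay is counted.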

One technique I've used to speed up the situation you're in is to
always keep a real filesystem copy of the last backup somewhere.  Then,
rather than archiving the main database directly for the base backup, you
run rsync to make that secondary copy identical to the one on the
master.  That should be quite a bit faster than making a whole new
backup, so long as you use the --inplace option.  Once the secondary copy
is done, if you want a compressed archive you can make it from the
copy--with no extra load on the master.  You can also copy it
again to another place and have that copy consume WAL files so
that it eventually turns into a warm standby.  If you want true
fail-over here, you're going to have to make one that is replaying WAL
files as they arrive.
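
A minimal sketch of the rsync step, assuming rsync is installed on the
master and the secondary copy is reachable as a local path or an rsync
destination.  The function names are hypothetical, and the flags other
than --inplace (-a, --delete, and the pg_xlog exclusion mentioned in the
original question) are assumptions about a reasonable setup.  It also
leaves out the pg_start_backup()/pg_stop_backup() calls that would
normally bracket a base backup taken from a running server.

```python
import subprocess
from pathlib import Path


def build_rsync_command(pgdata: Path, secondary: str) -> list[str]:
    """Build the rsync argv that makes `secondary` match the live PGDATA."""
    # The trailing slash makes rsync copy the directory's contents, not the
    # directory itself, so the exclude pattern is anchored at the PGDATA root.
    source = str(pgdata).rstrip("/") + "/"
    return [
        "rsync",
        "-a",                    # preserve permissions, owners, timestamps
        "--inplace",             # update changed files in place, no temp copy
        "--delete",              # drop files that no longer exist on the master
        "--exclude=/pg_xlog/*",  # WAL is archived separately
        source,
        secondary,
    ]


def refresh_secondary_copy(pgdata: Path, secondary: str) -> None:
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_rsync_command(pgdata, secondary), check=True)
```

Keeping the command construction separate from the call that runs it
makes it easy to print and review the exact rsync invocation before
pointing it at a production data directory.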

> any (file-system) solutions that keep the disks at sync like DRDB
> are suitable?so that the disk of the 2nd server would be at sync with the
> 1st. even if that works, i would still like to have a 3rd backup in the
> storage disks so my question remains.
>

I doubt you'll be able to get DRBD to keep up reliably with the volume
you've got.  The only filesystem-level solution I've seen scale nicely to
handle the exact problem you have is ZFS, where snapshots make some of
this easier.  For some people it's worth buying a Solaris license to
have that technology available.

I had been hoping that some of the new things in FreeBSD 9.0 would finally
make it a lot more practical to consider for this sort of thing once
it ships.  But it looks like the lack of support for Intel's
latest graphics drivers on recent "Sandy Bridge" servers may delay
my adopting it further.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

