On Friday 09 February 2007 08:16, Doug Knight wrote:
> I would also be interested in any "creative" ways to reduce the size and
> time to back up databases/clusters. We were just having a conversation
> about this yesterday. We were mulling over things like using rsync to
> back up only the files in the database directory tree that actually
> changed, or maybe doing a selective backup of files based on modified
> times, etc., but were unsure if this would be a safe, reliable way to
> back up a reduced set of data.
>
The biggest factor in backing up terabyte-scale databases is the amount of
write activity you have going on. If you have a fairly static data set, you
can try using rsync, or custom pg_dump scripts that pipe specific tables
through something like split as needed. Both of these methods will require
some finagling, so make sure to test them before relying on them. You could
probably also use something like Slony, though the initial data copy will be
painful; again, on a fairly static data set it could work.
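
For a rough idea of what those two approaches look like (the database,
table, host, and path names below are just placeholders, and none of this
is tested against your setup):

    # dump one big table, compress it, and chop it into 1 GB pieces
    pg_dump -t bigtable mydb | gzip | split -b 1024m - bigtable.dump.gz.

    # restore later by reassembling the pieces in order
    cat bigtable.dump.gz.* | gunzip | psql mydb

    # incremental copy of the data directory; only safe on a cluster
    # that is shut down, or as part of a PITR base backup (see below)
    rsync -a /var/lib/pgsql/data/ backuphost:/backups/pgdata/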
If you're doing large amounts of transactional activity, these methods are
going to become unworkable. We push about 2 GB of WAL an hour on one of our
systems, and the only method that seems workable is PITR, with weekly
filesystem snapshots as the base backup and the xlogs copied offline for
replay. It's still tricky to get right, but it seems to work.
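
In case it helps, the basic pieces of that setup look roughly like this
(the archive path and backup label are placeholders; the continuous
archiving chapter in the docs has the real details):

    # postgresql.conf: ship each completed WAL segment to the archive
    archive_command = 'cp %p /mnt/archive/%f'

    -- weekly base backup: bracket the snapshot/copy of $PGDATA
    SELECT pg_start_backup('weekly');
    -- ... take the filesystem snapshot here ...
    SELECT pg_stop_backup();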
--
Robert Treat
Database Architect
OmniTI Computer Consulting, Inc.