Backing up large databases - Mailing list pgsql-admin

From Steve Burrows
Subject Backing up large databases
Msg-id 44523B85.4020605@jla.com
List pgsql-admin
I am struggling to find an acceptable way of backing up a PostgreSQL 7.4 database.

The database is quite large: it currently occupies about 180GB and is divided into two parts, a set of active tables and a set of archive tables that are used only for insertions.

I recently ran pg_dump -Fc; it took 23.5 hours and produced a single 126GB file. As the database continues to grow, it will soon be too large to pg_dump within a day. Using rsync to make a complete fresh copy of the pgsql file structure took 4 hours, but a second rsync pass later that day (which should have copied only changed files) still took 3 hours, and I cannot afford to have the db down that long.
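To put rough numbers on the pg_dump problem, here is a back-of-envelope sketch in Python. It uses only the figures quoted above (180GB database, 126GB dump file, 23.5 hours). It assumes that dump time scales linearly with database size, which is an assumption for illustration and not something that was measured; the function and variable names are likewise illustrative.

```python
# Back-of-envelope estimate of how much headroom pg_dump has left.
# ASSUMPTION: dump time grows linearly with on-disk database size.

db_size_gb = 180.0    # on-disk size of the database (from the post)
dump_size_gb = 126.0  # size of the pg_dump -Fc output file (from the post)
dump_hours = 23.5     # wall-clock time of the pg_dump run (from the post)


def dump_rate_gb_per_hour(size_gb, hours):
    """Database size processed per hour of dump time."""
    return size_gb / hours


def max_size_within(window_hours, rate_gb_per_hour):
    """Largest database that could be dumped inside the window at this rate."""
    return rate_gb_per_hour * window_hours


rate = dump_rate_gb_per_hour(db_size_gb, dump_hours)
limit_gb = max_size_within(24.0, rate)

# Average rate at which the dump file itself was written, in MB/s.
output_mb_per_s = dump_size_gb * 1024 / (dump_hours * 3600)

print(f"database processed per hour: {rate:.2f} GB/h")
print(f"largest database dumpable in 24 h: {limit_gb:.1f} GB")
print(f"average dump output rate: {output_mb_per_s:.2f} MB/s")
```

Under that linear assumption, the 24-hour limit works out to roughly 184GB, only about 4GB more than the database holds today.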

Does anybody have any ideas? The database is the backend for a mail server, so it handles transactions 24 hours a day but is quieter at night. I want to back it up or replicate it daily with minimum downtime so that the mail backlog doesn't get too large. Ideally the first generation of backup/replica would go onto the same machine as the original, because the volume of data is such that any attempt at network or tape backup of the live files would require too much downtime. Once I've got a backup, I can copy it out to other NAS or tape at leisure.

If anyone has experience of safeguarding a similarly large PostgreSQL database with minimal downtime, I'd be delighted to hear about it. The machine has 2 Xeons, 4GB of RAM and a half-terabyte RAID10 array on a Dell PERC SCSI subsystem, and runs at a load average of around 0.5 - 0.6, so it's not exactly overstretched.

Thanks,

Steve
