Thread: Backups / replication

Backups / replication

From
Oliver Kohll - Mailing Lists
Date:
Hello,

I'm interested in using WAL shipping / replication for backup purposes but have no interest in failover. Currently my
situationis: 

I have two servers, live and backup, which are in different cities. The backup server is also a test/development
machine.

Backups of my most important database are made hourly with pg_dump, excluding some larger tables with non-critical
loggingdata. Even so, as the database grows, backups are taking longer and it looks as though they may start to impact
performance.A full backup is made nightly and transferred to the backup machine, along with all of the day's hourly
backups.

I'm looking into using replication by WAL shipping - after all, there's no use to backing up data which hasn't changed
sincelast time - only a small percentage of records are created/updated. However, I need 

a) to be able to restore to a point in time easily, which I can do to within an hour at the moment by restoring the
correctdump. Sometimes users ask for a restore having accidentally updated/deleted records. 
b) to carry on running a test server database, that means one that's read and writeable.

I obviously can't use a replication slave as a read/write test server at the same time. At the moment I've thought of a
coupleof options, I don't know if either are possible - I have a bit of a hazy idea of WAL replication. 

1) Continuously ship the WAL records to somewhere on the test server unknown to Postgres but run the test machine as a
normaldatabase completely separately. If a backup is needed, delete the test database, restore to the last full backup
(afilesystem backup?) and copy all WAL records into Postgres' directory so it can see them. Start it up configured to
replaythem, up to a certain time. 

2) Run two instances of Postgres on the test/backup server on different ports, one configured as a replication slave,
onenormal. I'm not sure if this is possible with the RPM builds I'm using. 

Are either of those two likely? Any other suggestions? Another question is will the replication coming in v9.0 change
thingsand would it be worth holding off until then? In particular Command Prompt's PITR tools look useful for restoring
toa particular point in time, will these still work or will there be equivalents? 

Regards
Oliver Kohll

oliver@agilebase.co.uk / +44(0)7814 828608 / skype:okohll




Re: Backups / replication

From
Alan Hodgson
Date:
On Tuesday, June 15, 2010, "Oliver Kohll - Mailing Lists"
<oliver.lists@gtwm.co.uk> wrote:
> Are either of those two likely? Any other suggestions? Another question
> is will the replication coming in v9.0 change things and would it be
> worth holding off until then? In particular Command Prompt's PITR tools
> look useful for restoring to a particular point in time, will these
> still work or will there be equivalents?

PITR in recent versions allows restoration to any point in time after the
base backup was created, assuming you have the WAL logs from that point
forward.

--
"No animals were harmed in the recording of this episode. We tried but that
damn monkey was just too fast."

Re: Backups / replication

From
Adrian von Bidder
Date:
[continuous backup]

On Tuesday 15 June 2010 21.42:52 Oliver Kohll - Mailing Lists wrote:
> 1) Continuously ship the WAL records to somewhere on the test server
> unknown to Postgres but run the test machine as a normal database
> completely separately. If a backup is needed, delete the test database,
> restore to the last full backup (a filesystem backup?) and copy all WAL
> records into Postgres' directory so it can see them. Start it up
> configured to replay them, up to a certain time.
>
> 2) Run two instances of Postgres on the test/backup server on different
> ports, one configured as a replication slave, one normal. I'm not sure
> if this is possible with the RPM builds I'm using.

Both scenarious are possible.  I don't know the rpm builds you're using; the
Debian packages allow configuring two instances on two different ports
AFAIK.  Possibly the rpm installation do, too.  Even if not: hacking up a
2nd start script which runs postgres against a different data directory /
config file should be quite trivial.

Keeping the base backup plus all the WAL files for the case you need to
restore will need quite a bit of diskspace if your database is reasonably
big (on some database I administrated, I scheduled weekly base backups and
kept a week of WAL - since we sometimes had quite a lot changes in the db,
WAL was quickly 10 times as big as the base backup.  So depending on your DB
load, keeping a 2nd installation of postgres running and continuously
reading the WAL files might be cheaper in terms of disk space.

(and with 9.0, you even have a near real-time read-only copy of the db for
free gratis...)

cheers
-- vbi

--
90% of the people do not understand copyright,
the other 10% simply ignore it.
        -- Aigars Mahinovs

Attachment

Re: Backups / replication

From
Greg Smith
Date:
Adrian von Bidder wrote:
> I don't know the rpm builds you're using; the
> Debian packages allow configuring two instances on two different ports
> AFAIK.  Possibly the rpm installation do, too.  Even if not: hacking up a
> 2nd start script which runs postgres against a different data directory /
> config file should be quite trivial.
>

The situation is midway between here:  you do have to hack up the
startup scripts a bit to get more than one server running with an RPM
install, but the changes are not too terrible.  There's a sample and
article about it at
http://blog.2ndquadrant.com/en/2010/05/install-multiple-postgresql-servers-redhat-linux.html

You are correct that this is much easier on Debian.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us