Thread: Backups / replication
Hello, I'm interested in using WAL shipping / replication for backup purposes but have no interest in failover. Currently my situationis: I have two servers, live and backup, which are in different cities. The backup server is also a test/development machine. Backups of my most important database are made hourly with pg_dump, excluding some larger tables with non-critical loggingdata. Even so, as the database grows, backups are taking longer and it looks as though they may start to impact performance.A full backup is made nightly and transferred to the backup machine, along with all of the day's hourly backups. I'm looking into using replication by WAL shipping - after all, there's no use to backing up data which hasn't changed sincelast time - only a small percentage of records are created/updated. However, I need a) to be able to restore to a point in time easily, which I can do to within an hour at the moment by restoring the correctdump. Sometimes users ask for a restore having accidentally updated/deleted records. b) to carry on running a test server database, that means one that's read and writeable. I obviously can't use a replication slave as a read/write test server at the same time. At the moment I've thought of a coupleof options, I don't know if either are possible - I have a bit of a hazy idea of WAL replication. 1) Continuously ship the WAL records to somewhere on the test server unknown to Postgres but run the test machine as a normaldatabase completely separately. If a backup is needed, delete the test database, restore to the last full backup (afilesystem backup?) and copy all WAL records into Postgres' directory so it can see them. Start it up configured to replaythem, up to a certain time. 2) Run two instances of Postgres on the test/backup server on different ports, one configured as a replication slave, onenormal. I'm not sure if this is possible with the RPM builds I'm using. Are either of those two likely? Any other suggestions? Another question is will the replication coming in v9.0 change thingsand would it be worth holding off until then? In particular Command Prompt's PITR tools look useful for restoring toa particular point in time, will these still work or will there be equivalents? Regards Oliver Kohll oliver@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
On Tuesday, June 15, 2010, "Oliver Kohll - Mailing Lists" <oliver.lists@gtwm.co.uk> wrote: > Are either of those two likely? Any other suggestions? Another question > is will the replication coming in v9.0 change things and would it be > worth holding off until then? In particular Command Prompt's PITR tools > look useful for restoring to a particular point in time, will these > still work or will there be equivalents? PITR in recent versions allows restoration to any point in time after the base backup was created, assuming you have the WAL logs from that point forward. -- "No animals were harmed in the recording of this episode. We tried but that damn monkey was just too fast."
[continuous backup] On Tuesday 15 June 2010 21.42:52 Oliver Kohll - Mailing Lists wrote: > 1) Continuously ship the WAL records to somewhere on the test server > unknown to Postgres but run the test machine as a normal database > completely separately. If a backup is needed, delete the test database, > restore to the last full backup (a filesystem backup?) and copy all WAL > records into Postgres' directory so it can see them. Start it up > configured to replay them, up to a certain time. > > 2) Run two instances of Postgres on the test/backup server on different > ports, one configured as a replication slave, one normal. I'm not sure > if this is possible with the RPM builds I'm using. Both scenarious are possible. I don't know the rpm builds you're using; the Debian packages allow configuring two instances on two different ports AFAIK. Possibly the rpm installation do, too. Even if not: hacking up a 2nd start script which runs postgres against a different data directory / config file should be quite trivial. Keeping the base backup plus all the WAL files for the case you need to restore will need quite a bit of diskspace if your database is reasonably big (on some database I administrated, I scheduled weekly base backups and kept a week of WAL - since we sometimes had quite a lot changes in the db, WAL was quickly 10 times as big as the base backup. So depending on your DB load, keeping a 2nd installation of postgres running and continuously reading the WAL files might be cheaper in terms of disk space. (and with 9.0, you even have a near real-time read-only copy of the db for free gratis...) cheers -- vbi -- 90% of the people do not understand copyright, the other 10% simply ignore it. -- Aigars Mahinovs
Attachment
Adrian von Bidder wrote: > I don't know the rpm builds you're using; the > Debian packages allow configuring two instances on two different ports > AFAIK. Possibly the rpm installation do, too. Even if not: hacking up a > 2nd start script which runs postgres against a different data directory / > config file should be quite trivial. > The situation is midway between here: you do have to hack up the startup scripts a bit to get more than one server running with an RPM install, but the changes are not too terrible. There's a sample and article about it at http://blog.2ndquadrant.com/en/2010/05/install-multiple-postgresql-servers-redhat-linux.html You are correct that this is much easier on Debian. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us