Thread: online tape backup
Hi: I'm attempting to set up online backup as described on http://www.postgresql.org/docs/8.1/static/backup-online.html For this, i am following these steps: 1) pg_start_backup 2) tar to my tape device /data/postgresql excluding /data/postgresql/pg_xlog 3) pg_stop_backup 4) delete WAL files under /data/pg_trans_log/ (which is the directory to which archive_command copies my archived WAL files) with name "smaller" than the name of the last 000*.backup 5) append to previously created tar on the tape the WAL directory /data/pg_trans_log/ So, you see, i run tar 2 times, first for pg data dir, second for wal archive dir. Everything worked OK for about a week. Now the second tar (the one archiving the WAL files) exits with status 2, no errors on stderr. If i check the contents of the tar on the tape, they are wrong. Is there anything wrong with this procedure? Any other approaches to tape backup that do *not* involve using pg_dump (because of database size, dump time and performance issues)? Any help is greatly appreciated. -- Eduardo J. Ortega - Linux user #222873 "No fake - I'm a big fan of konqueror, and I use it for everything." -- Linus Torvalds
On Samstag, 10. Februar 2007 01:51 Eduardo J. Ortega wrote: > 1) pg_start_backup > 2) tar to my tape device /data/postgresql excluding > /data/postgresql/pg_xlog 3) pg_stop_backup > 4) delete WAL files under /data/pg_trans_log/ (which is the directory > to which archive_command copies my archived WAL files) with name > "smaller" than the name of the last 000*.backup > 5) append to previously created tar on the tape the WAL > directory /data/pg_trans_log/ Dear all, I'm replying to this message because it's a good start, and I'm thinking of switching from currently pg_dump on all databases to using the concept of base+WAL backup. What we did until now was (not using WAL) - pg_dump for each DB - vacuum full analyze for each DB after backup From what I understand, if I run a vacuum, the WAL logs will be enormous, as they do not simply store the vacuum command itself, but every single operation done on the db. If that's true, running vacuum before the base backup could be better, as there are less WAL logs to store, making the backup smaller, right? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi4.asc | gpg --import" // Fingerprint: EA39 8918 EDFF 0A68 ACFB 11B7 BA2D 060F 1C6F E6B0 // Keyserver: www.keyserver.net Key-ID: 1C6FE6B0
Attachment
On 13/02/07, Michael Monnerie <michael.monnerie@it-management.at> wrote: > On Samstag, 10. Februar 2007 01:51 Eduardo J. Ortega wrote: > > 1) pg_start_backup > > 2) tar to my tape device /data/postgresql excluding > > /data/postgresql/pg_xlog 3) pg_stop_backup > > 4) delete WAL files under /data/pg_trans_log/ (which is the directory > > to which archive_command copies my archived WAL files) with name > > "smaller" than the name of the last 000*.backup > > 5) append to previously created tar on the tape the WAL > > directory /data/pg_trans_log/ > > Dear all, I'm replying to this message because it's a good start, and > I'm thinking of switching from currently pg_dump on all databases to > using the concept of base+WAL backup. > > What we did until now was (not using WAL) > - pg_dump for each DB > - vacuum full analyze for each DB after backup > > From what I understand, if I run a vacuum, the WAL logs will be > enormous, as they do not simply store the vacuum command itself, but > every single operation done on the db. > > If that's true, running vacuum before the base backup could be better, > as there are less WAL logs to store, making the backup smaller, right? > No Not really, Since the base backup is just a dump of the database files only a Vacuum Full / Cluster / Reindex will actually shrink it plane Vacuum will just mark it all for reuse. A Vacuum of any sort will bloat the WAL so doing a vacuum pri to your base backups should mean that your WAL is smaller just after backup which may have advantages. Since you really ought to be using Autovacuum I would not worry about it too much. If you really want a small base backup in which case run a Vacuum Full before you backup but this may not shrink the actuall size of backup as the WAL for the privious backup will be larger. Peter.
Michael Monnerie <michael.monnerie@it-management.at> writes: > From what I understand, if I run a vacuum, the WAL logs will be > enormous, as they do not simply store the vacuum command itself, but > every single operation done on the db. > If that's true, running vacuum before the base backup could be better, > as there are less WAL logs to store, making the backup smaller, right? You're suffering from a fundamental misconception about the nature of WAL. Vacuum doesn't "shrink WAL", and neither does anything else; WAL is a history of every action ever taken in the database, and so a vacuum will just add a bunch more to that history. regards, tom lane
On Dienstag, 13. Februar 2007 16:34 Tom Lane wrote: > > From what I understand, if I run a vacuum, the WAL logs will be > > enormous, as they do not simply store the vacuum command itself, > > but every single operation done on the db. > > If that's true, running vacuum before the base backup could be > > better, as there are less WAL logs to store, making the backup > > smaller, right? > > You're suffering from a fundamental misconception about the nature of > WAL. Vacuum doesn't "shrink WAL", and neither does anything else; WAL > is a history of every action ever taken in the database, and so a > vacuum will just add a bunch more to that history. Seems you didn't understand me: When I make a vacuum, and then a base backup, I do not need to include the WAL records anymore. But when I do a base backup and afterwards vacuum, the WAL will be huge already, also making restore much longer. Is there a simple way to turn on/off WAL? I'd like to use it, but then switch it off, make vacuum, turn WAL on again and make a base backup. That should save a reasonable amount of time and backup space. Databases are about 40GB now (test phase) and will be considerable more once in production (we're using dbmail.org as mailserver then). mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0676/846 914 666 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi4.asc | gpg --import" // Fingerprint: EA39 8918 EDFF 0A68 ACFB 11B7 BA2D 060F 1C6F E6B0 // Keyserver: www.keyserver.net Key-ID: 1C6FE6B0
Attachment
Michael Monnerie <michael.monnerie@it-management.at> writes: > On Dienstag, 13. Februar 2007 16:34 Tom Lane wrote: >> You're suffering from a fundamental misconception about the nature of >> WAL. Vacuum doesn't "shrink WAL", and neither does anything else; > Seems you didn't understand me: When I make a vacuum, and then a base > backup, I do not need to include the WAL records anymore. But when I do > a base backup and afterwards vacuum, the WAL will be huge already, also > making restore much longer. This is irrelevant, at least in a steady-state environment. If you vacuum beforehand, the WAL history for that has to be included in what you need to recover from your previous base backup; and you can't discard that data until after you take the new backup. So AFAICS it's a wash; the average time-to-recover is the same either way. Or at least, VACUUM is not any different from any other burst of activity that you might want to schedule around. regards, tom lane