Re: a few questions on backup - Mailing list pgsql-general
From | Hannes Dorbath |
---|---|
Subject | Re: a few questions on backup |
Date | |
Msg-id | 4648820F.9030209@theendofthetunnel.de Whole thread Raw |
In response to | a few questions on backup (Marco Colombo <pgsql@esiway.net>) |
List | pgsql-general |
On 14.05.2007 16:54, Marco Colombo wrote: > I have a few questions on backuping a PostgreSQL server (lets say > anything 8.x.x). I've read "Continuous Archiving and Point-In-Time > Recovery (PITR)" in the manual I'm still missing something...well > actually I think I don't but I've been debating on this with a friend > for a while, and there's something we don't seem to agree on, so I need > some bits of clarification. :) > > Ok, let's say what I really need is "poor man's" PITR. That is, I > perform only one daily full backup, and I need to be able to restore the > database contents as they were at backup time. I don't even need to know > the exact time (yes, that means I don't really care about which > transactions turn out to be committed and which don't). Mmm, ok that > can't be rightfully called PITR at all, maybe. > > Am I right in assuming that the following procedure is ok? > > 1) issue pg_start_backup(); > 2) copy (or tar or cpio) the data dir, w/o pg_xlog/ > 3) issue pg_stop_backup(); > 4) copy (or tar or cpio) pg_xlog/ contents. > > That's all. Please note that I'm doing NO WAL archiving. > > Whether this is going to work or not is based on what exactly > pg_start_backup() does. According to him (my friend), it may simply > prevent PostgreSQL from writing to data files until pg_stop_backup() is > issued, thus kind of enforcing a snapshot on the data files themselves. > Changes go to WAL only, and they are "played" on data files only after > the pg_stop_backup(). Hmm, I don't really like that. But if so, I don't > even need step 4), since the tar backup would be consistent. I'm able to > restore from that only, and data are those at the time of pg_start_backup(). > > Or, (that's me) pg_start_backup() only ensures that full_page_writes is > enabled during the backup phase, so that any change to data files can be > undone/redone based on the contents of WAL segments. The tar archive > contains potentially inconsistant data, that's why I need WAL segments > too. At restore time, PG eventually performs a crash recovery, and data > are those at the time of step 4) (which is an interval really, some time > between the start and the end of the copy, but I don't need to be able > to tell the exact time anyway). > > BTW, I see a vulnerability in the above procedure... if there's enough > write activity so that PostgreSQL recycles WAL segments between 1) and > 4), some changes may be lost. At step 4) I need to save _all_ WAL > segments that have been produced during the backup time. I assume that > activity at backup time is low enough that the event is "extremely > unlikely". Another doubt I have is about WAL checkpointing... is it > possible that a checkpoint happens during step 2), and the tar archive > gets data files both from before and from after the checkpoint (which I > think is bad) or does pg_start_backup() prevent WAL checkpointing, too? > > Finally, if I'm missing something and the above is wrong, I think that > the only way to perform a full backup on a live database at filesystem > level, is to enable WAL archiving as the first step of the backup > procedure (assuming it's not usually on, of course), and later save all > the WAL segments that were _archived_ during that time, including the > one made at pg_stop_backup(), on the same backup medium. > > If I understand the documentation right, nothing breaks if the > archive_command just returns OK w/o saving anything during regular > operation, and starts saving segments only during the backup time... I > mean, that prevents me from doing arbitrary PITR after the backup, but > all I want to do is performing a full backup, w/o being able to do any > partial backup after that. If so, I may write a script that does nothing > most of the time, and archives WAL segments only to be included in the > full backup. Are file system snapshots not the best approach for your requirements? lvcreate -s -L5G -nbackup /dev/foo/postgresql mount /dev/foo/backup /mnt/backup-snap tar jcpvf pg-backup-<time_stamp>.bz2 /mnt/backup-snap You can't do much wrong with that, it's fast and easy to use. Just my 2cent.. -- Regards, Hannes Dorbath
pgsql-general by date: