Thread: WAL -> Replication
Caveat: I'm not a pg hacker, I apologize in advance if this is a dumb question, but it has been nagging at me, and I don't know who else to ask. If the WAL is a record of all transactions, and if the checkpoint process can be managed tightly, is it possible to copy the WAL files from a master DB and use them to keep a slave DB in sync? This seems like it would be an easy way to slave a backup system without additional load on the primary....
Mike Biamonte wrote: > > > Caveat: I'm not a pg hacker, I apologize > in advance if this is a dumb question, but > it has been nagging at me, and I don't > know who else to ask. > > If the WAL is a record of all transactions, > and if the checkpoint process can be managed > tightly, is it possible to copy the WAL > files from a master DB and use them to keep > a slave DB in sync? This seems like it > would be an easy way to slave a backup system > without additional load on the primary.... WAL files are kept only until an fsync(), checkpoint, then reused. Also, the info is tied to direct locations in the file. You could do this for hot backup, but it would require quite bit of coding to make it work. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Thu, 25 Apr 2002, Bruce Momjian wrote: > WAL files are kept only until an fsync(), checkpoint, then reused. One could keep them longer though, if one really wanted to. > Also, the info is tied to direct locations in the file. You could do > this for hot backup, but it would require quite bit of coding to make it > work. That's kind of too bad, since log shipping is a very popular method of backup and replication. Not that I'm volunteering to fix this. :-) cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC
On Fri, 2002-04-26 at 07:38, Curt Sampson wrote: > On Thu, 25 Apr 2002, Bruce Momjian wrote: > > > WAL files are kept only until an fsync(), checkpoint, then reused. > > One could keep them longer though, if one really wanted to. > > > Also, the info is tied to direct locations in the file. You could do > > this for hot backup, but it would require quite bit of coding to make it > > work. > > That's kind of too bad, since log shipping is a very popular method of > backup and replication. Now again from my just aquired DB2 knowledge: DB2 can run in two modes 1) similar to ours, where logs are reused after checkpoints/commits allow it. 2) with log archiving: logs are never reused, but when system determines it no longer needs them, it will hand said log over to archiving process that will archive it (usually do a backup to some other place and then delete it). This mode is used when online backup and restore functionality is desired. This is something that could be interesting for 24x7 reliability. ----------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > DB2 can run in two modes > 1) similar to ours, where logs are reused after checkpoints/commits > allow it. > 2) with log archiving: logs are never reused, but when system determines > it no longer needs them, it will hand said log over to archiving process > that will archive it (usually do a backup to some other place and then > delete it). There is in fact the skeleton of support in xlog.c for passing unwanted log segments over to an archiver, rather than recycling them. So far no one's done anything with the facility. I think the main problem is the one Bruce cited: because the WAL representation is tied to physical tuple locations and so forth, it's only useful to a slave that has an *exact* duplicate of the master's entire database cluster. That's not useless, but it's pretty restrictive. It could be useful for incremental backup, though I'm not sure how efficient it is for the purpose. WAL logs tend to be pretty voluminous. At the very least you'd probably want enough smarts in the archiver to strip out the page-image records. regards, tom lane
Tom Lane wrote: > Hannu Krosing <hannu@tm.ee> writes: > > DB2 can run in two modes > > 1) similar to ours, where logs are reused after checkpoints/commits > > allow it. > > 2) with log archiving: logs are never reused, but when system determines > > it no longer needs them, it will hand said log over to archiving process > > that will archive it (usually do a backup to some other place and then > > delete it). > > There is in fact the skeleton of support in xlog.c for passing unwanted > log segments over to an archiver, rather than recycling them. So far > no one's done anything with the facility. I think the main problem is > the one Bruce cited: because the WAL representation is tied to physical > tuple locations and so forth, it's only useful to a slave that has an > *exact* duplicate of the master's entire database cluster. That's not > useless, but it's pretty restrictive. > > It could be useful for incremental backup, though I'm not sure how > efficient it is for the purpose. WAL logs tend to be pretty voluminous. > At the very least you'd probably want enough smarts in the archiver to > strip out the page-image records. Yes, I think the bottom line is that we would need to add some things to the WAL file to make archiving the logs work, for either point-in-time recovery, or replication, both of which we need. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Fri, 2002-04-26 at 19:41, Tom Lane wrote: > Hannu Krosing <hannu@tm.ee> writes: > > DB2 can run in two modes > > 1) similar to ours, where logs are reused after checkpoints/commits > > allow it. > > 2) with log archiving: logs are never reused, but when system determines > > it no longer needs them, it will hand said log over to archiving process > > that will archive it (usually do a backup to some other place and then > > delete it). > > There is in fact the skeleton of support in xlog.c for passing unwanted > log segments over to an archiver, rather than recycling them. So far > no one's done anything with the facility. I think the main problem is > the one Bruce cited: because the WAL representation is tied to physical > tuple locations and so forth, it's only useful to a slave that has an > *exact* duplicate of the master's entire database cluster. That's not > useless, but it's pretty restrictive. It is probably the fastest way to creating functionality for a hot spare database. If we could ship the log changes even earlier than whole logs are complete, we can get near-realtime backup server. > It could be useful for incremental backup, though I'm not sure how > efficient it is for the purpose. WAL logs tend to be pretty voluminous. But if they contain enough repeated data they should compress quite well. > At the very least you'd probably want enough smarts in the archiver to > strip out the page-image records. If we aim for ability to restore the last known good state and not any point of time in between, the archiving can be just playing back the logs over sparse files + keeping record (bitmap or list) of pages that have been updated and thus are really present in the file. Then doing full restore would be just restoring some point of time online backup plus copying over changed pages. ---------------- Hannu