Thread: Incremental backup
How's this issue going on the 7.4 development tree? I saw it on the TODO list, but didn't find much on the archives of this mailing list. -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? ----------------------------------------------------------------- Martín Marqués | mmarques@unl.edu.ar Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral -----------------------------------------------------------------
Someone at Red Hat is working on point-in-time recovery, also known as incremental backups. It will be in 7.4. --------------------------------------------------------------------------- Martin Marques wrote: > How's this issue going on the 7.4 development tree? > I saw it on the TODO list, but didn't find much on the archives of this > mailing list. > > -- > Porqu? usar una base de datos relacional cualquiera, > si pod?s usar PostgreSQL? > ----------------------------------------------------------------- > Mart?n Marqu?s | mmarques@unl.edu.ar > Programador, Administrador, DBA | Centro de Telematica > Universidad Nacional > del Litoral > ----------------------------------------------------------------- > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> Someone at Red Hat is working on point-in-time recovery, also known as > incremental backups. It will be in 7.4. Does that mean that the poor guy/gal is implementing redo for all the index types? Chris
Christopher Kings-Lynne wrote: > > Someone at Red Hat is working on point-in-time recovery, also known as > > incremental backups. It will be in 7.4. > > Does that mean that the poor guy/gal is implementing redo for all the index > types? No idea. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > > Someone at Red Hat is working on point-in-time recovery, also known as > incremental backups. PITR and incremental backup are different beasts. PITR deals with a backup + logs. Incremental backup deals with a full backup + X smaller/incremental backups. So... it doesn't look like anyone is working on incremental backup at the moment. Cheers, Patrick > It will be in 7.4. > > --------------------------------------------------------------------------- > > Martin Marques wrote: > > How's this issue going on the 7.4 development tree? > > I saw it on the TODO list, but didn't find much on the archives of this > > mailing list. > > > > -- > > Porqu? usar una base de datos relacional cualquiera, > > si pod?s usar PostgreSQL? > > ----------------------------------------------------------------- > > Mart?n Marqu?s | mmarques@unl.edu.ar > > Programador, Administrador, DBA | Centro de Telematica > > Universidad Nacional > > del Litoral > > ----------------------------------------------------------------- > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Don't 'kill -9' the postmaster > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
Patrick Macdonald wrote: > Bruce Momjian wrote: > > > > Someone at Red Hat is working on point-in-time recovery, also known as > > incremental backups. > > PITR and incremental backup are different beasts. PITR deals with a backup > + logs. Incremental backup deals with a full backup + X smaller/incremental > backups. > > So... it doesn't look like anyone is working on incremental backup at the > moment. But why would someone want incremental backups compared to PITR? The backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems pretty weird. :-) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > > Patrick Macdonald wrote: > > Bruce Momjian wrote: > > > > > > Someone at Red Hat is working on point-in-time recovery, also known as > > > incremental backups. > > > > PITR and incremental backup are different beasts. PITR deals with a backup > > + logs. Incremental backup deals with a full backup + X smaller/incremental > > backups. > > > > So... it doesn't look like anyone is working on incremental backup at the > > moment. > > But why would someone want incremental backups compared to PITR? The > backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems > pretty weird. :-) Yeah, it's a different method of producing a similar outcome. However, many companies do not want to be concerned with the management (and space) of archived logs. Incremental backup allows them the option of performing a full backup and then only backing up the modifications on a regular basis. When it's time to restore, they'll restore the full backup and then the proper sequence of incremental backups. Cheers, Patrick
Patrick Macdonald wrote: > > But why would someone want incremental backups compared to PITR? The > > backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems > > pretty weird. :-) > > Yeah, it's a different method of producing a similar outcome. However, many > companies do not want to be concerned with the management (and space) > of archived logs. Incremental backup allows them the option of performing > a full backup and then only backing up the modifications on a regular basis. > When it's time to restore, they'll restore the full backup and then the > proper sequence of incremental backups. Wow, I never even thought that was possible. Do other db's support that feature? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> Wow, I never even thought that was possible. Do other db's support that > feature? Isn't that basically what the current replication kits for Postgresql do -- via triggers and log tables? -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Bruce Momjian wrote: > > Patrick Macdonald wrote: > > > Yeah, it's a different method of producing a similar outcome. However, many > > companies do not want to be concerned with the management (and space) > > of archived logs. Incremental backup allows them the option of performing > > a full backup and then only backing up the modifications on a regular basis. > > When it's time to restore, they'll restore the full backup and then the > > proper sequence of incremental backups. > > Wow, I never even thought that was possible. Do other db's support that > feature? I know Oracle and DB2 have incremental backup in their arsenal (and iirc, SQL Server has something called "differential backup"). Whatever the name, it's a win at the enterprise level. Cheers, Patrick
On Thu, 13 Feb 2003 19:24:13 -0500, Patrick Macdonald <patrickm@redhat.com> wrote: >I know Oracle and DB2 have incremental backup in their arsenal (and iirc, >SQL Server has something called "differential backup"). Whatever the name, >it's a win at the enterprise level. "A differential backup copies only the database pages that have been modified after the last full database backup." This could be doable using XLogRecPtr pd_lsn in the page headers, but I don't see an easy way to do it on a live database. ServusManfred
On Jue 13 Feb 2003 16:38, Bruce Momjian wrote: > Patrick Macdonald wrote: > > Bruce Momjian wrote: > > > Someone at Red Hat is working on point-in-time recovery, also known as > > > incremental backups. > > > > PITR and incremental backup are different beasts. PITR deals with a > > backup + logs. Incremental backup deals with a full backup + X > > smaller/incremental backups. > > > > So... it doesn't look like anyone is working on incremental backup at the > > moment. > > But why would someone want incremental backups compared to PITR? The > backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems > pretty weird. :-) Good backup systems, such as Informix (it's the one I used) doesn't do a query backup, but a pages backup. What I mean is that it looks for pages in the system that has changed from the las full backup and backs them up. That's how an incremental backup works. PITR is another thing, which is even more important. :-) -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? ----------------------------------------------------------------- Martín Marqués | mmarques@unl.edu.ar Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral -----------------------------------------------------------------
OK, once we have PITR, will anyone want incremental backups? --------------------------------------------------------------------------- Martin Marques wrote: > On Jue 13 Feb 2003 16:38, Bruce Momjian wrote: > > Patrick Macdonald wrote: > > > Bruce Momjian wrote: > > > > Someone at Red Hat is working on point-in-time recovery, also known as > > > > incremental backups. > > > > > > PITR and incremental backup are different beasts. PITR deals with a > > > backup + logs. Incremental backup deals with a full backup + X > > > smaller/incremental backups. > > > > > > So... it doesn't look like anyone is working on incremental backup at the > > > moment. > > > > But why would someone want incremental backups compared to PITR? The > > backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems > > pretty weird. :-) > > Good backup systems, such as Informix (it's the one I used) doesn't do a query > backup, but a pages backup. What I mean is that it looks for pages in the > system that has changed from the las full backup and backs them up. > > That's how an incremental backup works. PITR is another thing, which is even > more important. :-) > > -- > Porqu? usar una base de datos relacional cualquiera, > si pod?s usar PostgreSQL? > ----------------------------------------------------------------- > Mart?n Marqu?s | mmarques@unl.edu.ar > Programador, Administrador, DBA | Centro de Telematica > Universidad Nacional > del Litoral > ----------------------------------------------------------------- > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Vie 14 Feb 2003 09:52, Bruce Momjian wrote: > OK, once we have PITR, will anyone want incremental backups? I will probably not need it, but I know of people how have databases which build dumps of more then 20GB. They are interested in live incremental backups. -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? ----------------------------------------------------------------- Martín Marqués | mmarques@unl.edu.ar Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral -----------------------------------------------------------------
On Fri, 2003-02-14 at 06:52, Bruce Momjian wrote: > OK, once we have PITR, will anyone want incremental backups? > > --------------------------------------------------------------------------- > > Martin Marques wrote: > > On Jue 13 Feb 2003 16:38, Bruce Momjian wrote: > > > Patrick Macdonald wrote: > > > > Bruce Momjian wrote: > > > > > Someone at Red Hat is working on point-in-time recovery, also known as > > > > > incremental backups. > > > > > > > > PITR and incremental backup are different beasts. PITR deals with a > > > > backup + logs. Incremental backup deals with a full backup + X > > > > smaller/incremental backups. > > > > > > > > So... it doesn't look like anyone is working on incremental backup at the > > > > moment. > > > > > > But why would someone want incremental backups compared to PITR? The > > > backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems > > > pretty weird. :-) > > > > Good backup systems, such as Informix (it's the one I used) doesn't do a query > > backup, but a pages backup. What I mean is that it looks for pages in the > > system that has changed from the las full backup and backs them up. > > > > That's how an incremental backup works. PITR is another thing, which is even > > more important. :-) I do imagine for some people it will register high on their list. -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
Bruce Momjian wrote: > > OK, once we have PITR, will anyone want incremental backups? None of my database references (Date's "Introduction to Database Systems" and Garcia-Molina's "Database Systems - The Complete Book", in particular) seem to talk about PITR at all. At least, there's no index entry for it. And a google search for "point in time recovery" yields mostly marketing fluff. Is there a good reference for this that someone can point me to? I'm interested in exactly how it'll work, especially in terms of how logs are stored versus the main data store, effects on performance, etc. Thanks, and sorry for the newbie question. :-( -- Kevin Brown kevin@sysexperts.com
A long time ago, in a galaxy far, far away, kevin@sysexperts.com (Kevin Brown) wrote: > Bruce Momjian wrote: >> >> OK, once we have PITR, will anyone want incremental backups? > > None of my database references (Date's "Introduction to Database > Systems" and Garcia-Molina's "Database Systems - The Complete Book", > in particular) seem to talk about PITR at all. At least, there's no > index entry for it. And a google search for "point in time recovery" > yields mostly marketing fluff. Well, from an "academic DBMS" standpoint, it isn't terribly interesting, since it involves assumptions of messy imperfection that academics prefer to avoid. And that's not intended to insult the academics; it is often reasonable to leave that "out of scope" much as an academic OS researcher might prefer to try to avoid putting attention on things like binary linkers, text file editors, and SCM systems like CVS, which, while terribly important from a practical standpoint, don't make for interesting OS research. > Is there a good reference for this that someone can point me to? > I'm interested in exactly how it'll work, especially in terms of how > logs are stored versus the main data store, effects on performance, > etc. > > Thanks, and sorry for the newbie question. :-( Unfortunately, the best sources I can think of are in the "O-Word" literature, and the /practical/ answers require digging into really messy bits of the documentation. What it amounts to is that anyone that isn't a "near-O*****-guru" would be strongly advised not to engage in PITR activity. It doesn't surprise me overly that the documentation is poor: those that can't figure it out despite the challenges almost surely shouldn't be using the functionality... What PITR generally consists of is the notion that you want to recover to the state at a particular moment in time. In O*****-nomenclature, this means that you recover as at some earlier moment for which you have a good backup, and then re-apply changes, which in their terms, are kept in "archive logs," which are somewhat analagous to WAL files. -- If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me http://www3.sympatico.ca/cbbrowne/x.html "We blew it -- too big, too slow..." - Bill Gates talking about NT, as noted by Steven McGeady of Intel during a meeting with Gates
Christopher Browne wrote: > What PITR generally consists of is the notion that you want to recover > to the state at a particular moment in time. > > In O*****-nomenclature, this means that you recover as at some earlier > moment for which you have a good backup, and then re-apply changes, > which in their terms, are kept in "archive logs," which are somewhat > analagous to WAL files. Yeah, that's pretty much what I figured. Oracle has something they call "rollback segments" which I assume are separate bits of data that have enough information to reverse changes that were made to the database during a transaction, and I figured PITR would (or could) apply particular saved rollback segments to the current state in order to "roll back" a table, tablespace, or database to the state it was in at a particular point in time. As it is, it sounds like PITR is a bit less refined than I expected. So the relevant question is: how is *our* PITR going to work? In particular, how is it going to interact with our WAL files and the table store? If I'm not mistaken, right now (well, as of 7.2 anyway) we round robin through a fixed set of WAL files. For PITR, I assume we'd need an archivelog function that would copy the WAL files as they're checkpointed to some other location (with destination names that reflect their order in time), just for starters. It'd be *awfully* nice if you could issue a command to roll a table (or, perhaps, a tablespace, if you've got a bunch of foreign keys and such) back to a particular point in time, from the command line, with no significant advance preparation (so long as the required files are still around, and if they're not then abort the operation with the appropriate error message). But it doesn't sound like that's what we're talking about when we talk about PITR... I wouldn't expect the O***** docs to be particularly revealing about how the database manages PITR at the file level, but if it does, would you happen to know where so I can look at it? What I've seen so far is very basic and not very revealing at all... -- Kevin Brown kevin@sysexperts.com
On Fri, 14 Feb 2003, Bruce Momjian wrote: > OK, once we have PITR, will anyone want incremental backups? Well, I'm not entirely clear on how PITR will work, so I may be off-base here, but it seems to me that offering incremental backups that back up only changed pages might not be all that big a win, given how postgres writes its pages. On DBMSs that don't use MVCC, if you change a particular item in a row ten times, one page is changed. If you do it in postgres, you could well be changing ten pages, as the system writes the two copies of the entire row wherever it can find space. So in databases where a lot of rows are changed, where an incremental backup would normally be a win because it would be much smaller than the logs over a given period, it isn't going to be with postgres. But you know, if we could get rid of redundant changes in the logs we're using for backup, that could save a lot of space in a situation like the one I described above. If a particular row and column is changed fifty times over the course of a month, it's going to be recorded fifty times in the log. But there's really no need for all fifty of those, if you don't mind not being able to restore to any time before the current time. You can reduce the size of the logs you need to store for backup by throwing away the first forty-nine of those changes, and keeping only the most recent version. There shouldn't be any worries about referential integrity, because when you do a restore, you start with a full backup that is ok, and once you've successfully applied all the transactions in the log, you know it will be ok again, so any intermediate states during the restore where integrity is not maintained are not a problem. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC
Curt Sampson <cjs@cynic.net> writes: > ... But there's really no need for all fifty of those, > if you don't mind not being able to restore to any time before the > current time. Which, of course, is exactly the point of PITR designs. When you know that your assistant trainee DBA deleted most of your database with a mistyped command last Tuesday evening around 8pm, it is cold comfort to know that your database has faithfully preserved his committed changes. You want to get back to where you were Tuesday afternoon, or preferably Tuesday evening 7:59pm. This is what PITR setups can do for you. If you don't feel you need PITR capability, fine ... but don't tell the people who want it that they have no need for it. regards, tom lane
On Fri, 14 Feb 2003, Kevin Brown wrote: > Oracle has something they call "rollback segments" which I assume are > separate bits of data that have enough information to reverse changes > that were made to the database during a transaction, and I figured > PITR would (or could) apply particular saved rollback segments to the > current state in order to "roll back" a table, tablespace, or database > to the state it was in at a particular point in time. You're right about what rollback segments are; they hold the information about previous versions of a row, so that if a transaction is rolled back, the previous data can be restored. (Postgres doesn't need this, since it doesn't update the rows in place: the old copy of the row stays as and where it is, and a new copy is made with the new data. [The old copy can later be removed, after no transactions reference it any more, with a VACUUM.]) Oracle9i has a new feature called a "flashback query," which uses the information in the rollback segements to let you query the database in a previous state. (I.e., "select such and such from this table as of two hours ago.") Postgres could do this using the older copies of rows as well, though the performance often wouldn't be pretty, since your indexes become useless, I believe. (Don't they point to only the latest copy of a row?) Still, it would be cool and life-saving in some situations. But yeah, PITR takes a snapshot and goes in a forward direction, not a backwards one. This is just what Oracle does, too, using the redo logs. > For PITR, I assume we'd need an archivelog function that would > copy the WAL files as they're checkpointed to some other location > (with destination names that reflect their order in time), just for > starters. Well, I'm not sure that you'd really need to have any special archiving facilities: you just need to let the old files sit there, and keep creating new ones. You do lose a bit of performance in that you can't recycle log segments, but you could always just fork off a (well niced) process to create a new, empty log segment at the time you start in on the last pre-created one, so that you will have another pre-created one ready when you finish the current one. BTW, why exactly do we pre-create log segments, anyway? I see this comment in backend/access/transam/xlog.c: /* * Zero-fill the file. We have to do this the hard way to ensure that * all the file space has really been allocated--- on platforms that * allow "holes" in files, just seeking to the end doesn't allocate * intermediate space. This way, we know that we have all the space * and (after the fsync below) that all the indirect blocks are down * on disk. Therefore, fdatasync(2) or O_DSYNC will be sufficient to * sync future writes to the log file. */ This seems to imply to me that fdatasync will, when synchronizing the data blocks of a file, not necessarially synchronize the indirect blocks, which seems a little...odd. It's not like there's much point in writing the data to the disk if you can't get to it. I'd understood fdatasync as just a way to avoid updaing the inode's last changed date. Are there OSes that implement fdatasync in a way that you could still lose data? > It'd be *awfully* nice if you could issue a command to roll a table > (or, perhaps, a tablespace, if you've got a bunch of foreign keys and > such) back to a particular point in time, from the command line, with > no significant advance preparation (so long as the required files are > still around, and if they're not then abort the operation with the > appropriate error message). But it doesn't sound like that's what > we're talking about when we talk about PITR... I don't think most people are thinking of that when they think of PITR; I think they're thinking of applying changes from a log to a previous version of a database. And you can't do such a rollback at all, except on an entire database, because of the potential integrity violations. The best you could do would be to generate SQL for the changes you'd need to get back to the previous point, and see if you can execute these changes. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC
On Sat, 15 Feb 2003, Tom Lane wrote: > Curt Sampson <cjs@cynic.net> writes: > > ... But there's really no need for all fifty of those, > > if you don't mind not being able to restore to any time before the > > current time. > > Which, of course, is exactly the point of PITR designs. > > When you know that your assistant trainee DBA deleted most of your > database with a mistyped command last Tuesday evening around 8pm, > it is cold comfort to know that your database has faithfully preserved > his committed changes. You want to get back to where you were Tuesday > afternoon, or preferably Tuesday evening 7:59pm. This is what PITR > setups can do for you. > > If you don't feel you need PITR capability, fine ... but don't tell > the people who want it that they have no need for it. Hey, I never said you *have* to do this compression! I envisioned it as an option. I'd like, for example, to be able to aim a program that the last eight weeks worth of log files and say, "compress the first seven weeks of this, but leave the last week fully intact." Then I can save some space (quite a lot, if my updates have certain characteristics), and yet still get back to Tuesday evening at 7:59 p.m. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC
Curt Sampson <cjs@cynic.net> writes: > Oracle9i has a new feature called a "flashback query," which uses the > information in the rollback segements to let you query the database > in a previous state. (I.e., "select such and such from this table as > of two hours ago.") Postgres could do this using the older copies of > rows as well, Yeah, good ol' time-travel. This was built into Postgres in Berkeley days, and later ripped out for performance reasons. > though the performance often wouldn't be pretty, since > your indexes become useless, I believe. (Don't they point to only the > latest copy of a row?) No. If they did, they'd not work under MVCC. > BTW, why exactly do we pre-create log segments, anyway? Partly because the logic is designed to work in the PITR case, but mostly because we don't want to suffer an out-of-disk-space condition while we are in the midst of using a log segment. > This seems to imply to me that fdatasync will, when synchronizing the > data blocks of a file, not necessarially synchronize the indirect > blocks, which seems a little...odd. The comment is being paranoid about whether fdatasync is correctly implemented everywhere. Whether you consider this worry justified or not is academic, since the possible out-of-disk-space failure mode is reason enough to do it anyway. regards, tom lane
Kevin, Hi. I was looking into PITR for PostgreSQL myself about a year back but life intervened. I am an Oracle DBA so may be able to help you with an understanding of how Oracle does this. You wrote: > Oracle has something they call "rollback segments" which I assume are > separate bits of data that have enough information to reverse changes > that were made to the database during a transaction, and I figured > PITR would (or could) apply particular saved rollback segments to the > current state in order to "roll back" a table, tablespace, or database > to the state it was in at a particular point in time. > > As it is, it sounds like PITR is a bit less refined than I expected. Actually Oracle uses its redo logs (like PostgreSQL' WAL) and archivelogs (copies of old redo logs) rather than rollback segments. The way it works, more or less, is that you start from a hot backup, and roll-forward using archivelogs and redo logs to the point in time to which you wish to recover. The whole point of this is not so much that we can restore our database to some point in the past, as that we can recover to just before some disaster struck, from a starting point of a previous hot backup. In fact, the whole PITR thing for Oracle seems to be simply an enabling technology for hot backups. This seems to me to be one of the killer enterprise features that PostgreSQL currently lacks. > So the relevant question is: how is *our* PITR going to work? In > particular, how is it going to interact with our WAL files and the > table store? If I'm not mistaken, right now (well, as of 7.2 anyway) > we round robin through a fixed set of WAL files. For PITR, I assume > we'd need an archivelog function that would copy the WAL files as > they're checkpointed to some other location (with destination names > that reflect their order in time), just for starters. I believe that three things are needed: 1) filesystem-based hot backups. It may be possible to simply copy the database files even though they would be open. Oracle places each tablespace into a hot-backup mode prior to taking the copy. In this mode, updates to the files are queued-up to be applied once the tablespace is taken out of hot-backup mode. This all seems quite tricky and will slow the database down. 2) Auto-archiving of WAL files. Just as you suggest. 3) A recovery controller that can figure out the state of the restored database, and manage the reapplication of archived and current WAL files. > > It'd be *awfully* nice if you could issue a command to roll a table > (or, perhaps, a tablespace, if you've got a bunch of foreign keys and > such) back to a particular point in time, from the command line, with > no significant advance preparation (so long as the required files are > still around, and if they're not then abort the operation with the > appropriate error message). But it doesn't sound like that's what > we're talking about when we talk about PITR... Nice but lots of work for questionable benefit. If your application needs to keep track of history there are better ways to do it. If not, then PITR should be seen as an exceptional circumstance and so need not be so easy to manage. > I wouldn't expect the O***** docs to be particularly revealing about > how the database manages PITR at the file level, but if it does, would > you happen to know where so I can look at it? What I've seen so far > is very basic and not very revealing at all... The best Oracle documentation on this is available through Oracle technet (technet.oracle.com) for which you will have to need to register. Look for documentation->Oracle 9i Documentation->list of books->Backup and Recovery Concepts. -- Marc marc@bloodnok.com
Curt Sampson wrote: > Oracle9i has a new feature called a "flashback query," which uses the > information in the rollback segements to let you query the database > in a previous state. (I.e., "select such and such from this table as > of two hours ago.") Postgres could do this using the older copies of > rows as well, though the performance often wouldn't be pretty, since > your indexes become useless, I believe. (Don't they point to only the > latest copy of a row?) Still, it would be cool and life-saving in some > situations. That sounds like a really nice feature. We don't do rollback segments, but I suppose we could (perhaps at the expense of performance). > > For PITR, I assume we'd need an archivelog function that would > > copy the WAL files as they're checkpointed to some other location > > (with destination names that reflect their order in time), just for > > starters. > > Well, I'm not sure that you'd really need to have any special archiving > facilities: you just need to let the old files sit there, and keep > creating new ones. You do lose a bit of performance in that you can't > recycle log segments, but you could always just fork off a (well niced) > process to create a new, empty log segment at the time you start in on > the last pre-created one, so that you will have another pre-created one > ready when you finish the current one. It's probably just as easy to fork off a well-niced process to copy the log segments elsewhere. If you're a shop trying to eke out the last bit of performance from PostgreSQL, then you're probably going to put the transaction logs on the fastest bit of disk you've got -- a solid-state disk if you can. That means the amount of space you've got on the filesystem holding the logs isn't necessarily all that high. For PITR you almost certainly want to put the old log files somewhere else for safekeeping: since you're not using them anymore and you probably want to keep a lot of them, the old log files need to go somewhere that has tons of space and is highly reliable. For that area, performance is the least of your concerns, as long as it's fast enough that you can keep up with the database as it writes transaction logs. And chances are you'll want a background process that compresses the old log files to minimize the amount of space they eat. > It's not like there's much point in writing the data to the disk if > you can't get to it. I'd understood fdatasync as just a way to avoid > updaing the inode's last changed date. Are there OSes that > implement fdatasync in a way that you could still lose data? I wouldn't think so. Otherwise what would be the point of having it? > And you can't do such a rollback at all, except on an entire database, > because of the potential integrity violations. The best you could do > would be to generate SQL for the changes you'd need to get back to the > previous point, and see if you can execute these changes. Well, you could do the rollback for all tables involved in the referential integrity tree and leave it at that, but yeah it's probably simpler to rollback the whole database. It would still be a useful feature, but I don't know that it's worth the upfront performance cost of having to save off rollback segments. If you did have such a feature you'd want to be able to control whether or not it's enabled, so that you take the performance hit only if you need the feature. -- Kevin Brown kevin@sysexperts.com