Thread: Review of last summer's PITR patch
I've done some preliminary work on the PITR patch that J.R. Nield developed last summer. I've applied parts of it (in rather edited form) but there are big chunks I think are not ready to go. You can see the original patch in pgsql-patches --- I sent it there mostly to get it into the archives.

I have committed the parts that have to do with making the contents of WAL more robust for PITR purposes; specifically, labeling WAL segment files clearly, and adding WAL logging of file creation/deletion.

I have not committed the massive reorganization of xlog.c that appears in J.R.'s patch; I think it's unnecessary and likely to introduce bugs (certainly it would revert some recent bug fixes).

I have also not committed the ALTER SYSTEM BACKUP / ALTER SYSTEM RECOVER commands that appear in the patch. It is not clear to me that this functionality belongs in the backend rather than separate management utilities, and even if it does belong there, this doesn't seem a clean way to do it. On the backup side, it seems like this code is basically reinventing the tar(1) command. On the restore side, I don't care for the "interactive command" implementation of ALTER SYSTEM RECOVER --- that seems like it makes it unnecessarily hard to drive the recovery process from another program. I envision this process being controlled by some kind of higher-level management utility, so I'd prefer to see a program-friendly API instead of one designed for manual use.

Anyway, I'm hoping to see some discussion of what to do next and what the PITR functionality ought to look like from a user's standpoint.

regards, tom lane
On Thu, 2004-02-12 at 02:41, Tom Lane wrote:
> I've done some preliminary work on the PITR patch that J.R. Nield
> developed last summer. I've applied parts of it (in rather edited
> form) but there are big chunks I think are not ready to go. You
> can see the original patch in pgsql-patches --- I sent it there
> mostly to get it into the archives.
>
> I have committed the parts that have to do with making the contents of
> WAL more robust for PITR purposes; specifically, labeling WAL segment
> files clearly, and adding WAL logging of file creation/deletion.
>
> I have not committed the massive reorganization of xlog.c that appears
> in J.R.'s patch; I think it's unnecessary and likely to introduce bugs
> (certainly it would revert some recent bug fixes).
>
> I have also not committed the ALTER SYSTEM BACKUP / ALTER SYSTEM RECOVER
> commands that appear in the patch. It is not clear to me that this
> functionality belongs in the backend rather than separate management
> utilities, and even if it does belong there, this doesn't seem a clean
> way to do it. On the backup side, it seems like this code is basically
> reinventing the tar(1) command. On the restore side, I don't care for
> the "interactive command" implementation of ALTER SYSTEM RECOVER ---
> that seems like it makes it unnecessarily hard to drive the recovery
> process from another program. I envision this process being controlled
> by some kind of higher-level management utility, so I'd prefer to see a
> program-friendly API instead of one designed for manual use.
>
> Anyway, I'm hoping to see some discussion of what to do next and what
> the PITR functionality ought to look like from a user's standpoint.

As a user of this functionality, the first thing I would want to do is look at a certain point in time, say yesterday around 3:30 pm, and get a window of transactions made at that time.
So maybe an object method or function call which passes the following parameters: 1) time, 2) number of transactions to return, with time being the middle of that window. i.e.

    $transaction_list = $transaction_log_object->transaction_window({ time => '23:59:59.99-12', window => '150' });

Which returns 150 transactions centered around 11:59:59. If the transaction window exceeds the end of the transaction log, move the window back accordingly.

Once I have that list I would want to look at the individual transactions and determine which one I want to roll the database to.
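The windowing rule described above (center on a time, and slide the window back if it would run past the end of the log) can be sketched in a few lines. This is only an illustration in Python, not anything from the patch: the function name, the `(commit_time, xid)` tuple layout, and the idea that the log is available as a sorted in-memory list are all assumptions made for the example.

```python
from bisect import bisect_left


def transaction_window(commits, when, window):
    """Return up to `window` entries of `commits` centered on `when`.

    `commits` is a hypothetical list of (commit_time, xid) tuples,
    sorted by commit_time. `when` is any value comparable with the
    commit_time field.
    """
    if window <= 0 or not commits:
        return []
    # Index of the first transaction committed at or after `when`.
    # The 1-tuple (when,) sorts before any (when, xid) pair.
    center = bisect_left(commits, (when,))
    start = center - window // 2
    # Slide the window back if it would run past the end of the log,
    # and forward if it would start before the beginning.
    start = max(0, min(start, len(commits) - window))
    return commits[start:start + window]
```

Called with a time near the end of the log, the function returns the last `window` transactions instead of a truncated list, which matches the "move the window back accordingly" behavior asked for above. If the log holds fewer than `window` transactions, it returns all of them.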
On Wed, 2004-02-11 at 19:41, Tom Lane wrote:
> Anyway, I'm hoping to see some discussion of what to do next and what
> the PITR functionality ought to look like from a user's standpoint.

As previously discussed on general ...

* WAL files archived to a different location instead of recycling.

* The ability to force a WAL log switch to ensure all changes during the backup are flushed to archived logs and copied.

* Ability to easily apply WAL logs to a standby database. I'd love to be able to take a hot backup of my production database, bring it up on another computer and keep it a log or two behind production by continually copying and applying logs to it.

* Although not PITR, on a related note, having the ability to do incremental pg_dumps would be a huge boon for those relying on pg_dumps for backups.

* Oracle 10g's Flashback feature is interesting. You can roll the entire database back to a point in time with:

    > flashback database to '3:00 pm';

  I would have to say it's hardly critical. :)
On Thu, 2004-02-12 at 18:14, Tom Lane wrote:
> Cott Lang <cott@internetstaff.com> writes:
> > * The ability to force a WAL log switch to ensure all changes during the
> > backup are flushed to archived logs and copied.
>
> Why does that require a log switch? You can copy the active log file in
> any case. (There was actually code to do that in J.R.'s patch, which I
> disregarded because I see no point in it ...)

Maybe it doesn't, I'm certainly no expert in PG internals. :)

It just seems like a Good Thing (TM) to ensure that any possible changes to the data files during the backup are in the logs that are copied to the archive destination and backed up as part of the hot backup.
Cott Lang <cott@internetstaff.com> writes:
> * The ability to force a WAL log switch to ensure all changes during the
> backup are flushed to archived logs and copied.

Why does that require a log switch? You can copy the active log file in any case. (There was actually code to do that in J.R.'s patch, which I disregarded because I see no point in it ...)

regards, tom lane
The big feature I am being pestered for is:

- hot backup that (subsequent) logs can be applied to during recovery

The patch in its current form provides this, which is excellent.

However, some "easyness" enhancements could be in order, e.g.:

- automatically archiving logs after a hot backup to <somewhere>
- some sort of ability to detect if the archived log sequence is "broken" anywhere after the latest hot backup.

I would like to see PITR be as straightforward as possible to set up. Backups that are actually unrestorable, because the DBA didn't understand the product's convoluted "design", are still way too frequent AFAICS.

cheers

Mark
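To make the "broken sequence" check concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that every archived log can be mapped to a consecutive integer sequence number and that the hot backup records the first number it needs. Real WAL segment file names encode more than a single counter, so the mapping step is deliberately left out, and the function name is hypothetical.

```python
def find_missing_segments(archived, first_needed):
    """Return the sequence numbers missing from the archive.

    `archived` is an iterable of integer sequence numbers found in the
    archive (hypothetical representation); `first_needed` is the first
    number required by the latest hot backup. Older segments are ignored.
    """
    present = {seq for seq in archived if seq >= first_needed}
    if not present:
        # Nothing usable was archived, so even the first segment is missing.
        return [first_needed]
    last = max(present)
    # Every number between the backup start and the newest archived
    # segment must be present, or recovery would stop at the gap.
    return [seq for seq in range(first_needed, last + 1) if seq not in present]
```

An empty result means the archive is contiguous from the backup onward. A management utility could run a check like this after each archive step and warn before the backup becomes unrestorable.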