Re: Why we really need timelines *now* in PITR - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Why we really need timelines *now* in PITR |
Date | |
Msg-id | 22853.1090252736@sss.pgh.pa.us |
In response to | Re: Why we really need timelines *now* in PITR (Simon Riggs <simon@2ndquadrant.com>) |
Responses |
Re: Why we really need timelines *now* in PITR
Re: Why we really need timelines *now* in PITR |
List | pgsql-hackers |
Simon Riggs <simon@2ndquadrant.com> writes:
> Some further thinking from that base...

> Perhaps timelines should be nest-numbered: (using 0 as a counter also)
> 0 etc is the original branch
> 0.1 is the first recovery off the original branch
> 0.2 is the second recovery off the original branch
> 0.1.1 is the first recovery off the first recovery (so to speak)
> 0.1.2 is the second etc

> That way you don't have the problem of "which is 3?" in the examples
> above. [Would we number a recovery of 1 as 3 or would then next recovery
> off 2 be numbered 3?]

Hmm. This would have some usefulness as far as documenting "how did we
get here", but unless you knew where/when the timeline splits had
occurred, I don't think it would be super useful. It'd make more sense
to record the parentage and split time of each timeline in some
human-readable "meta-history" reference file (but where exactly?)

I don't think it does anything to solve our immediate problem, anyhow.
You may know that you are recovering off of branch 0.1, but how do you
know if this is the first, second, or Nth time you have done that?

> Not necessarily the way we would show that as a timeline number. It
> could still be shown as a single hex number representing each nesting
> level as 4 bits...(restricting us to 7 recoveries per timeline...)

Sounds too tight to me :-(

I do see a hole in my original concept now that you mention it. It'd be
quite possible for timeline 2 *not* to be an ancestor of timeline 3,
that is you might have tried a restore, not liked the result, and
decided to re-restore from someplace else on timeline 1. That is,
instead of

0001.0014 - 0001.0015 - 0001.0016 - 0001.0017 - ...
                      |
                      + 0002.0016 - 0002.0017 -...
                                  |
                                  + 0003.0017 - ...

maybe the history is

0001.0014 - 0001.0015 - 0001.0016 - 0001.0017 - ...
                      |           |
                      |           +0003.0017 - ...
                      |
                      + 0002.0016 - 0002.0017 - ...

where I've had to draw 3 above 2 to avoid having unrelated lines
crossing each other in my diagram.
The problem here is that a crash recovery in timeline 3 would not
realize that it should not use WAL segment 0002.0016. So we need a more
sophisticated rule than just numerical comparison of timeline numbers.

I think your idea of nested numbers might fix this, but I'm concerned
about the restrictions of fitting it into 32 bits as you suggest. Can
we think of a less restrictive representation?

> If we go with the renaming recovery.conf when it completes, why not make
> that the record of previous recoveries? Move it to archive_status and
> name it according to the timeline it just created, e.g.
> recovery.done.<timeline>.<timestamp>

There's still the problem of how can you be sure that all the files
created in the past are still in there. It'd be way too likely for
someone to decide they ought to do a recovery restore by first doing
"rm -rf $PGDATA". Or they lost the disk entirely and are restoring
their last full backup onto virgin media.

I think there's really no way around the issue: somehow we've got to
keep some meta-history outside the $PGDATA area, if we want to do this
in a clean fashion.

We could perhaps expect the archive area to store it, but I'm a bit
worried about the idea of overwriting the meta-history file in archive
from time to time; it's mighty critical data and you'd not be happy if a
crash corrupted your only copy. We could archive meta-history files
with successively higher versioned names ... but then we need an API
extension to get back the latest one.

			regards, tom lane
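
[Illustration, not part of the original message.] The ancestry problem
above can be made concrete with a small sketch. It assumes a hypothetical
meta-history table that records, for each timeline, its parent and the
first segment written in it, and it simplifies splits to whole-segment
granularity as the diagrams do. The names TimelineInfo, BRANCHED, LINEAR
and segment_usable are invented for this example and are not PostgreSQL
APIs; segment numbers are read as hex only for convenience.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TimelineInfo:
    parent: Optional[int]  # timeline this one split from; None for the original
    first_segment: int     # first WAL segment written in this timeline


# Second diagram: timelines 2 and 3 both split directly off timeline 1.
# (first_segment=0 for timeline 1 is an assumption; its real start is not given.)
BRANCHED: Dict[int, TimelineInfo] = {
    1: TimelineInfo(parent=None, first_segment=0),
    2: TimelineInfo(parent=1, first_segment=0x16),
    3: TimelineInfo(parent=1, first_segment=0x17),
}

# First diagram: timeline 3 split off timeline 2, which split off timeline 1.
LINEAR: Dict[int, TimelineInfo] = {
    1: TimelineInfo(parent=None, first_segment=0),
    2: TimelineInfo(parent=1, first_segment=0x16),
    3: TimelineInfo(parent=2, first_segment=0x17),
}


def segment_usable(history: Dict[int, TimelineInfo], current: int,
                   seg_timeline: int, seg_no: int) -> bool:
    """Return True if segment seg_timeline.seg_no lies on the path from the
    original timeline to `current`, walking parent links instead of
    comparing timeline numbers numerically."""
    limit: Optional[int] = None  # segments >= limit belong to the child branch
    tl: Optional[int] = current
    while tl is not None:
        info = history[tl]
        if tl == seg_timeline:
            return seg_no >= info.first_segment and (limit is None or seg_no < limit)
        limit = info.first_segment
        tl = info.parent
    return False  # seg_timeline is not an ancestor of `current`


if __name__ == "__main__":
    # 0002.0016 must be rejected when 3 descends from 1, accepted when from 2.
    print(segment_usable(BRANCHED, 3, 2, 0x16))
    print(segment_usable(LINEAR, 3, 2, 0x16))
```

A plain "segment timeline <= current timeline" test would accept 0002.0016
in both histories; following the recorded parentage is what tells the two
cases apart, which is why the meta-history has to survive somewhere.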