PITR Phase 1 - Test results - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | PITR Phase 1 - Test results |
Date | |
Msg-id | 1082991844.3999.60.camel@stromboli |
Responses |
Re: PITR Phase 1 - Test results
PITR Phase 1 - Code Overview (1) |
List | pgsql-hackers |
I've now completed the coding of Phase 1 of PITR.

This allows a backup to be recovered and then rolled forward (all the way) on transaction logs. This proves that the code and the design work, and also validates a lot of the earlier assumptions that were the subject of much earlier debate.

As noted in the previous designs, PostgreSQL talks to an external archiver using the XLogArchive API.

I've now completed:
- changes to PostgreSQL
- a simple archiving utility, pg_arch

Using both of these together, I have successfully:
- started pg_arch
- started postgres
- taken a backup using tar
- run pgbench for an extended period, so that the transaction logs taken at the start have long since been recycled
- killed postmaster
- waited for completion
- rm -R $PGDATA
- restored using tar
- restored xlogs from archive directory
- started postmaster and watched it recover to end of logs

This has been run through a number of times on non-trivial tests, and I've sat and watched the beast at work to make sure nothing weird was happening on timing.

At this stage:

Missing Functions
- recovery does NOT yet stop at a specified point-in-time (that was always planned for Phase 2)
- a few more log messages are required to report progress
- a debug mode is required to allow most of them to be turned off

Wrinkles
- code is system testable, but not as cute as it could be
- input from committers is now sought to complete the work
- you are strongly advised not to treat any of the patches as usable in any real world situation YET - that bit comes next

Bugs
- two bugs currently occur during some tests:
1. the notification mechanism as originally designed causes ALL backends to report that a log file has closed. That works most of the time, though it does give rise to occasional timing errors - nothing too serious, but this inexactness could lead to later errors.
2. After restore, the notification system doesn't recover fully - this is a straightforward one

I'm building a full patchset for this code and will upload this soon. As you might expect over the time it's taken me to develop this, some bitrot has set in, so I'm rebuilding it against the latest dev version now, and will complete fixes for the two bugs mentioned above.

I'm sure some will say "no words, show me the code"... I thought you all would appreciate some advance warning of this, to plan time to investigate and comment upon the coding.

Best Regards,
Simon Riggs, 2ndQuadrant
http://www.2ndquadrant.com
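
For anyone wanting to picture the file shuffling in the test sequence above (backup with tar, wipe $PGDATA, restore the tar, copy archived xlogs back), here is a rough sketch in Python against throwaway directories. It is only an illustration, not pg_arch and not the XLogArchive API: the function names, the segment_000N file names and the layout under the temporary root are all made up for the example, and no PostgreSQL process is involved.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def take_base_backup(data_dir: Path, backup_file: Path) -> None:
    """Tar up everything in the data directory (the 'backup using tar' step)."""
    with tarfile.open(backup_file, "w") as tar:
        for child in data_dir.iterdir():
            tar.add(child, arcname=child.name)


def archive_log(data_dir: Path, archive_dir: Path, name: str) -> None:
    """Copy one completed log file into the archive directory (hypothetical archiver)."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(data_dir / "pg_xlog" / name, archive_dir / name)


def restore(data_dir: Path, backup_file: Path, archive_dir: Path) -> list[str]:
    """Wipe the data directory, untar the backup, then copy archived logs back."""
    shutil.rmtree(data_dir, ignore_errors=True)  # rm -R $PGDATA
    data_dir.mkdir(parents=True)
    with tarfile.open(backup_file) as tar:       # restore using tar
        tar.extractall(data_dir)
    xlog_dir = data_dir / "pg_xlog"
    xlog_dir.mkdir(exist_ok=True)
    restored = []
    for log in sorted(archive_dir.iterdir()):    # restore xlogs from archive
        shutil.copy2(log, xlog_dir / log.name)
        restored.append(log.name)
    return restored


def simulate(root: Path) -> list[str]:
    """Walk through the whole sequence with placeholder files."""
    data_dir = root / "pgdata"
    archive_dir = root / "archive"
    backup_file = root / "base_backup.tar"
    xlog_dir = data_dir / "pg_xlog"
    xlog_dir.mkdir(parents=True)
    (data_dir / "base_table").write_text("contents at backup time")
    (xlog_dir / "segment_0001").write_text("log 1")

    take_base_backup(data_dir, backup_file)
    archive_log(data_dir, archive_dir, "segment_0001")

    # Activity after the backup: new segments get written and archived,
    # and the segment that existed at backup time is recycled.
    for name in ("segment_0002", "segment_0003"):
        (xlog_dir / name).write_text("log for " + name)
        archive_log(data_dir, archive_dir, name)
    (xlog_dir / "segment_0001").unlink()

    return restore(data_dir, backup_file, archive_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(simulate(Path(tmp)))
```

The point of the sketch is just that the restored directory ends up with the backup contents plus every archived segment, including ones long since recycled from the live pg_xlog; the actual roll-forward is of course done by the postmaster on startup.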