Re: Issues Outstanding for Point In Time Recovery (PITR) - Mailing list pgsql-hackers
From | J. R. Nield |
---|---|
Subject | Re: Issues Outstanding for Point In Time Recovery (PITR) |
Date | |
Msg-id | 1026946903.5300.390.camel@localhost.localdomain |
In response to | Re: Issues Outstanding for Point In Time Recovery (PITR) (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses | Re: Issues Outstanding for Point In Time Recovery (PITR) |
List | pgsql-hackers |
On Wed, 2002-07-17 at 01:25, Bruce Momjian wrote:
>
> We only patch configure.in. If you post to hackers, they can give you
> assistance and I will try to help however I can. I can do some
> configure.in stuff for you myself.

Thanks for the offer. The only thing I was changing it for was to test
whether and how to get an Ethernet MAC address using ioctl, so libuuid
could use it if available. That is dropped now.

>
> > Related to that, the other place I need advice is on adding Ted Tso's
> > LGPL'd UUID library (stolen from e2fsprogs) to the source. Are we
> > allowed to use this? There is a free OSF/DCE spec for UUID's, so I can
> > re-implement the library if required.
>
> We talked about this on the replication mailing list. We decided that
> hostname, properly hashed to an integer, was the proper way to get this
> value. Also, there should be a postgresql.conf variable so you can
> override the hostname-generated value if you wish. I think that is
> sufficient.

I will do something like this, but reserve 16 bytes for it just in case
we change our minds. It needs to be different among systems on the same
machine, so there needs to be a time value and a pseudo-random part as
well. Also, 'hostname' will likely be the same on many machines
(localhost.localdomain or similar).

The only reason I bothered with UUID's before is because they have a
standard setup to make the possibility of collision extremely small, and
I figured replication will end up using it someday.

>
> > We also haven't discussed commands for backup/restore, but I will use
> > what I think is appropriate and we can change the grammar if needed. The
> > initial hot-backup capability will require the database to be in
> > read-only mode and use tar for backup, and I will add the ability to
> > allow writes later.
>
> Yea, I saw Tom balked at that. I think we have enough manpower and time
> that we can get hot backup in normal read/write mode working before 7.3
> beta so I would just code it assuming the system is live and we can deal
> with making it hot-capable once it is in CVS. It doesn't have to work
> 100% until beta time.

Hot backup read/write requires that we force an advance in the logfile
segment after the backup. We need to save all the logs between backup
start and completion. Otherwise the files will be useless as a
standalone system if the current logs somehow get destroyed (fire in the
machine room, etc.).

The way I would do this is:

    create a checkpoint
    do the block-by-block walk of the files using the bufmgr
    create a second checkpoint
    force the log to advance past the end of the current segment
    save the log segments containing records between the
     first & second checkpoint with the backup

Then if you restore the backup, you can recover to the point of the
second checkpoint, even if the logs since then are all gone.

Right now the log segment size is fixed, so this means that we'd waste
8MB of log space on average to do a backup. Also, the way XLOG reads
records right now, we have to write placeholder records into the empty
space, because that's how it finds the end of the log stream. So I need
to change XLOG to handle "skip records", and then to truncate the file
when it gets archived, so we don't have to save up to 16MB of zeros.

Also, if archiving is turned off, then we can't recycle or delete any
logs for the duration of the backup, and we have to save them.

So I'll finish the XLOG support for this, and then think about the
correct way to walk through all the files.

-- 
J. R. Nield
jrnield@usol.com
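
To make the 16-byte identifier idea above concrete, here is a rough sketch of how a hashed hostname, a time value and a pseudo-random part could be packed together. Everything in it is an assumption for illustration only: the function name make_system_id, the 4/8/4 byte split, the choice of SHA-1 for the hostname hash, and the integer override standing in for the postgresql.conf variable. None of it is a proposed on-disk format.

```python
import hashlib
import os
import socket
import struct
import time


def make_system_id(hostname=None, override=None, now=None, rand=None):
    """Build a 16-byte identifier (hypothetical layout).

    Layout assumed here: 4-byte hostname hash, 8-byte microsecond
    timestamp, 4 pseudo-random bytes.
    """
    if hostname is None:
        hostname = socket.gethostname()
    if override is not None:
        # A configured value replaces the hostname-derived part.
        host_part = override & 0xFFFFFFFF
    else:
        digest = hashlib.sha1(hostname.encode("utf-8")).digest()
        host_part = struct.unpack(">I", digest[:4])[0]
    if now is None:
        now = time.time()
    # The time value separates systems created at different moments.
    time_part = int(now * 1_000_000) & 0xFFFFFFFFFFFFFFFF
    if rand is None:
        rand = os.urandom(4)
    if len(rand) != 4:
        raise ValueError("rand must be exactly 4 bytes")
    # The random bytes separate systems created on one host at the same instant.
    return struct.pack(">IQ", host_part, time_part) + rand
```

Two systems initialized on hosts that both report localhost.localdomain would share the first four bytes under this layout, which is why the time and random parts carry the burden of keeping the values distinct.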
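
The segment arithmetic behind the backup steps above can also be sketched in a few lines. This is only a toy model, not XLOG code: log positions are treated as plain byte offsets, the segment size is taken as 16MB (which matches the "8MB on average" and "up to 16MB" figures), a checkpoint record is assumed to lie entirely within the segment containing its offset, and all function names are made up for the example.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # fixed log segment size assumed in this sketch


def segment_of(pos):
    """Index of the log segment containing byte offset pos."""
    return pos // SEGMENT_SIZE


def advance_to_next_segment(pos):
    """Force the log past the current segment.

    Returns (new_pos, skipped_bytes); the skipped bytes are the space that
    would need skip records or truncation at archive time.
    """
    new_pos = (segment_of(pos) + 1) * SEGMENT_SIZE
    return new_pos, new_pos - pos


def segments_to_save(first_checkpoint, second_checkpoint):
    """Segments holding records between the two checkpoints, inclusive."""
    first_seg = segment_of(first_checkpoint)
    last_seg = segment_of(second_checkpoint)
    return list(range(first_seg, last_seg + 1))


# Example: backup starts early in segment 5 and ends halfway through segment 6.
first = 5 * SEGMENT_SIZE + 1000
second = 6 * SEGMENT_SIZE + 8 * 1024 * 1024
new_pos, skipped = advance_to_next_segment(second)
saved = segments_to_save(first, second)
```

In this example segments 5 and 6 are saved with the backup, the log resumes at the start of segment 7, and 8MB of segment 6 is left unused, which is the average case mentioned above.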