Re: [RFC] Incremental backup v2: add backup profile to base backup - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [RFC] Incremental backup v2: add backup profile to base backup |
Date | |
Msg-id | CA+TgmoYdG1JvymERkGozpfazJBHTNbxSAvWMHGmK7dRioP8bAQ@mail.gmail.com |
In response to | Re: [RFC] Incremental backup v2: add backup profile to base backup (Marco Nenciarini <marco.nenciarini@2ndquadrant.it>) |
Responses | Re: [RFC] Incremental backup v2: add backup profile to base backup |
List | pgsql-hackers |
On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote:
>> 1. Take a full backup. Basically, we already have this. In the
>> backup label file, make sure to note the newest LSN guaranteed to be
>> present in the backup.
>
> Don't we already have it in "START WAL LOCATION"?

Yeah, probably. I was too lazy to go look for it, but that sounds
like the right thing.

>> 2. Take a differential backup. In the backup label file, note the LSN
>> of the full backup to which the differential backup is relative, and the
>> newest LSN guaranteed to be present in the differential backup. The
>> actual backup can consist of a series of 20-byte buffer tags, those
>> being the exact set of blocks newer than the base-backup's
>> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by
>> an 8kB block of data. If a relfilenode is truncated or removed, you
>> need some way to indicate that in the backup; e.g. include a buffertag
>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>> or InvalidBlockNumber if removed entirely.
>
> To have a working backup you need to ship each block which is newer than
> latest-guaranteed-to-be-present in full backup and not newer than
> latest-guaranteed-to-be-present in the current backup. Also, as a
> further optimization, you can think about not sending the empty space in
> the middle of each page.

Right. Or compressing the data.

> My main concern here is about how postgres can remember that a
> relfilenode has been deleted, in order to send the appropriate "deletion
> tag".

You also need to handle truncation.

> IMHO the easiest way is to send the full list of files along with the
> backup and leave to the client the task of deleting unneeded files. The
> backup profile has this purpose.
>
> Moreover, I do not like the idea of using only a stream of blocks as the
> actual differential backup, for the following reasons:
>
> * AFAIK, with the current infrastructure, you cannot do a backup with a
> block stream only. To have a valid backup you need many files for which
> the concept of LSN doesn't apply.
>
> * I don't like to have all the data from the various
> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
> the blocks saved on a per file basis.

OK, that makes sense. But you still only need the file list when
sending a differential backup, not when sending a full backup. So
maybe a differential backup looks like this:

- Ship a table-of-contents file with a list of relation files currently
present and the length of each in blocks.
- For each block that's been modified since the original backup, ship
a file called delta_<original file name> which is of the form <block
number><changed block contents> [...].

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
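
As an illustration only, the table-of-contents plus delta-file layout suggested above could be sketched roughly as follows. This is a minimal in-memory sketch, not anything from an actual patch: the function names, the use of dictionaries in place of real files, and the 4-byte little-endian encoding of the block number are all assumptions made for the example. Only the 8kB block size and the `<block number><changed block contents>` record shape come from the message.

```python
import struct

BLOCK_SIZE = 8192                 # 8kB blocks, as in the quoted proposal
BLOCK_NO = struct.Struct("<I")    # assumed encoding: 4-byte little-endian block number


def encode_delta(changed_blocks):
    """Serialize {block number: block contents} as <block number><contents>..."""
    out = bytearray()
    for blkno in sorted(changed_blocks):
        data = changed_blocks[blkno]
        if len(data) != BLOCK_SIZE:
            raise ValueError(f"block {blkno} is not {BLOCK_SIZE} bytes")
        out += BLOCK_NO.pack(blkno) + data
    return bytes(out)


def decode_delta(delta):
    """Yield (block number, block contents) pairs from a delta file body."""
    record = BLOCK_NO.size + BLOCK_SIZE
    if len(delta) % record != 0:
        raise ValueError("truncated delta file")
    for off in range(0, len(delta), record):
        (blkno,) = BLOCK_NO.unpack_from(delta, off)
        yield blkno, delta[off + BLOCK_NO.size : off + record]


def apply_differential(base, toc, deltas):
    """Rebuild relation files from a full backup plus one differential backup.

    base:   {file name: file contents} from the full backup
    toc:    {file name: length in blocks} from the table-of-contents file
    deltas: {file name: body of delta_<file name>}
    """
    restored = {}
    for name, nblocks in toc.items():      # files missing from the TOC are dropped
        size = nblocks * BLOCK_SIZE
        # Truncate or zero-extend the old copy to the length the TOC records.
        image = bytearray(base.get(name, b"")[:size].ljust(size, b"\0"))
        for blkno, data in decode_delta(deltas.get(name, b"")):
            if blkno >= nblocks:
                raise ValueError(f"{name}: block {blkno} beyond TOC length")
            image[blkno * BLOCK_SIZE : (blkno + 1) * BLOCK_SIZE] = data
        restored[name] = bytes(image)
    return restored
```

In this sketch the table of contents carries both removal and truncation: a file absent from `toc` is simply not restored, and a file whose recorded length shrank is cut to that length before the changed blocks are overlaid, so no separate deletion or truncation marker is needed in the delta files.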