On Tue, 6 Jan 2015 08:26:22 -0500
Robert Haas <robertmhaas@gmail.com> wrote:
> Three, scan the WAL generated since the incremental backup and summarize it
> into a list of blocks that need to be backed up.
This can be done from the archive side. I was talking about this some months
ago:
http://www.postgresql.org/message-id/51C4DD20.3000103@free.fr
One of the traps I can think of is that it requires "full_page_writes=on" so
we can forge each block correctly. The corollary is that we need to start a
diff backup right after a checkpoint.
And even without "full_page_writes=on", maybe we could add a function, say
"pg_start_backupdiff()", which would force full pages to be logged only right
after the call, the same way "full_page_writes" does after a checkpoint. Diff
backups would then be possible from any LSN where pg_start_backupdiff() was
called, up to any later point.
Building this backup by merging versions of blocks from WAL is one big step.
But then there is a file format to define, a restore procedure to design, and
a decision to make about which tools/functions/GUCs to expose to admins.
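To make the merging step a bit more concrete, here is a rough Python sketch of
what an offline tool could compute from archived WAL. Everything in it is an
assumption made for illustration: the WalRecord and BlockPlan types, the
(file name, block number) key and the plan_diff() function are hypothetical,
and real WAL records are far more complex than this. It only shows the idea
that a block can be forged when a full-page image exists at or after the start
LSN, with later records replayed on top, and that blocks lacking such an image
cannot be rebuilt from WAL alone.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple

# Hypothetical block identifier: (relation file name, block number).
BlockKey = Tuple[str, int]


class WalRecord(NamedTuple):
    """Simplified, made-up view of a WAL record touching one block."""
    lsn: int
    block: BlockKey
    full_page: Optional[bytes]  # full-page image, or None for a plain delta


class BlockPlan(NamedTuple):
    """How to forge one block: a base image plus the records to replay on it."""
    base_lsn: int
    base_image: bytes
    replay_lsns: List[int]


def plan_diff(
    records: Iterable[WalRecord], start_lsn: int
) -> Tuple[Dict[BlockKey, BlockPlan], Set[BlockKey]]:
    """Return (forgeable block plans, blocks changed without any full-page image)."""
    plans: Dict[BlockKey, BlockPlan] = {}
    missing: Set[BlockKey] = set()
    for rec in sorted(records, key=lambda r: r.lsn):
        if rec.lsn < start_lsn:
            continue  # older than the diff backup starting point
        if rec.full_page is not None:
            # A full-page image is a complete version: restart from it.
            plans[rec.block] = BlockPlan(rec.lsn, rec.full_page, [])
            missing.discard(rec.block)
        elif rec.block in plans:
            # Delta on top of a known image: remember it for replay.
            plans[rec.block].replay_lsns.append(rec.lsn)
        else:
            # Delta with no image since start_lsn: the block cannot be forged.
            missing.add(rec.block)
    return plans, missing
```

Any block ending up in the second returned set is exactly the trap described
above: without a full-page image after the starting point, the archive alone
is not enough to rebuild it.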
After discussing with Magnus, he advised me to wait for a diff backup file
format to emerge from online tools, like the one discussed here (at the time,
it was Michael's proposal based on pg_basebackup that was being discussed).
But I wonder how much easier it would be to do this the opposite way? If this
idea of building diff backups offline from archives is feasible, wouldn't it
remove a lot of the trouble you are discussing here?
Regards,
--
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com