On Thu, Jul 16, 2009 at 4:41 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Rick Gigger wrote:
>> If you use an rsync like algorithm for doing the base backups wouldn't
>> that increase the size of the database for which it would still be
>> practical to just re-sync? Couldn't you in fact sync a very large
>> database if the amount of actual change in the files was a small
>> percentage of the total size?
>
> It would certainly help to reduce the network traffic, though you'd
> still have to scan all the data to see what has changed.
The fundamental problem with pushing users to start over with a new
base backup is that there's no relationship between the size of the
WAL and the size of the database.

You can plausibly have a system with an extremely high transaction rate
generating WAL very quickly, but where the whole database fits in a few
hundred megabytes. In that case you could be behind by only a few
minutes and still find it faster to take a new base backup.

Or you could have a petabyte database that is rarely updated, in which
case it might be faster to apply weeks' worth of logs than to try to
take a new base backup.
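
(A back-of-envelope way to see the crossover between those two cases;
every rate below is invented for illustration, since replay speed and
network bandwidth vary wildly between systems.)

    def faster_to_resync(wal_backlog_gb, db_size_gb,
                         replay_gb_per_hr=5.0, transfer_gb_per_hr=50.0):
        """Rough comparison: hours to replay the WAL backlog versus
        hours to ship a fresh base backup. All rates are assumptions."""
        catch_up_hours = wal_backlog_gb / replay_gb_per_hr
        resync_hours = db_size_gb / transfer_gb_per_hr
        return resync_hours < catch_up_hours

    # Small, hot database: even a modest WAL backlog takes longer to
    # replay than re-copying a few hundred megabytes.
    print(faster_to_resync(wal_backlog_gb=10, db_size_gb=0.5))        # True

    # Huge, quiet database: weeks of WAL still beat re-shipping a petabyte.
    print(faster_to_resync(wal_backlog_gb=20, db_size_gb=1_000_000))  # False
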

Only the sysadmin is actually going to know which makes more sense,
unless we start tying WAL parameters to the database size or something
like that.
--
greg
http://mit.edu/~gsstark/resume.pdf