On Thu, Jul 16, 2009 at 1:09 PM, Greg Stark<gsstark@mit.edu> wrote:
> On Thu, Jul 16, 2009 at 4:41 PM, Heikki
> Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:
>> Rick Gigger wrote:
>>> If you used an rsync-like algorithm for doing the base backups,
>>> wouldn't that increase the size of the database for which it would
>>> still be practical to just re-sync? Couldn't you in fact sync a very
>>> large database if the amount of actual change in the files was a
>>> small percentage of the total size?
>>
>> It would certainly help to reduce the network traffic, though you'd
>> still have to scan all the data to see what has changed.
>
> The fundamental problem with pushing users to start over with a new
> base backup is that there's no relationship between the size of the
> WAL and the size of the database.
>
> You can plausibly have a system with an extremely high transaction
> rate generating WAL very quickly, but where the whole database fits in
> a few hundred megabytes. In that case you could be behind by only a
> few minutes and still find it faster to take a new base backup.
>
> Or you could have a petabyte database which is rarely updated, in
> which case it might be faster to apply weeks' worth of logs than to
> try to take a base backup.
>
> Only the sysadmin is actually going to know which makes more sense,
> unless we start tying WAL parameters to the database size or something
> like that.
I think we need a way for the master to know who its slaves are and to
keep any given bit of WAL available until all slaves have successfully
read it, just as we keep each WAL file until we successfully copy it
to the archive. Otherwise, there's no way to be sure that a connection
break won't result in the need for a new base backup. (In a way, a
slave is very similar to an additional archive.)
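Roughly what I have in mind, as pseudocode (all names invented; nothing
like this exists in the backend today):

class WalRetention:
    def __init__(self):
        self.slave_pos = {}  # slave name -> last WAL byte confirmed read

    def confirm(self, slave, pos):
        # Called when a slave acknowledges having read up to 'pos'.
        self.slave_pos[slave] = max(self.slave_pos.get(slave, 0), pos)

    def recycle_horizon(self, archived_up_to):
        # WAL before this point may be removed or recycled: the
        # archiver's position and every slave's position all have to
        # be past it -- the same rule we already apply to the archive
        # alone, extended to cover the slaves.
        positions = list(self.slave_pos.values()) + [archived_up_to]
        return min(positions)

The obvious wart is that a slave which vanishes without deregistering
pins WAL on the master indefinitely, so there would presumably have to
be a timeout, or a cap on how much WAL we're willing to hold.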
...Robert