Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>
>>> Tom Lane wrote:
>>>
>>>> Good, but the salient followup questions to that are (1) backed up to
>>>> where exactly?, and (2) how many days' past backups could we get at,
>>>> if we had to?
>>>>
>>
>>
>>> They are dumped to a NFS share on this schedule. That NFS share is
>>> dumped to tape by systems at Conova - I'll let Stefan fill in the
>>> details about that.
>>>
>>
>> That's good as far as it goes, but seeing that PG is a worldwide
>> organization now, I wonder whether our primary CVS shouldn't have
>> backups on several continents. Pardon my paranoia ... but our
>> collective arses have been saved by offsite backups at least once
>> already ...
>>
>>
>>
>
> Yes, I think we could improve on that. Have we considered more
> sophisticated solutions that provide incremental backup on a more
> frequent basis? I'd be inclined to use Bacula or similar (and it uses
> Postgres for its metadata store :-) ). Ideally I think we'd like to be
> able fairly easily and quickly to roll the repo (or some portion of it)
> back to a fairly arbitrary and fairly precise (say within an hour or
> two) point in recent time.
Well, yeah. While I do think that something as complex as Bacula is
probably overkill for our needs, we can certainly improve on the
current state.
One thing to consider is that we actually have two major scenarios to
deal with:
1. Simple repo corruption (accident, CVS software bug, admin error).
This one might require us to restore the repo from an older backup if
the corruption cannot be repaired easily.
For this we already have myriad copies of the trees out in the wild,
but it might be a good idea to keep a number of snapshots of the main
repo on the CVS-VPS itself (either taken every few hours or made as part
of the push to anoncvs and svr1).
This way we could do a very simple "in-place" recovery on the same
running VPS instance with fairly low impact on all the committers (and
on the dependent parts of the infrastructure).
2. Total loss of the main CVS-VPS (extended power failure, hardware
error, OS bug, admin error, fire, some other catastrophic event). In
this case we will have to fail over to one of the other project
datacenters, and for this we need to have full, regular copies of the
whole VM on external hosts.
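
To make the local snapshot idea from (1) a bit more concrete, here is a
rough sketch of what a rotating snapshot job could look like. It is only
an illustration under my own assumptions: the function names, the
"snap-" naming scheme and the retention count are all made up, and it
does nothing about commits that are in flight while the copy runs.

```python
import shutil
import time
from pathlib import Path


def take_snapshot(repo: Path, snapshot_dir: Path, keep: int = 12) -> Path:
    """Copy `repo` into a new timestamped directory and prune old snapshots.

    `keep` is a hypothetical retention count (number of snapshots to retain).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded nanosecond timestamps sort lexicographically by age.
    target = snapshot_dir / f"snap-{time.time_ns():020d}"
    shutil.copytree(repo, target, symlinks=True)
    snapshots = sorted(
        p for p in snapshot_dir.iterdir() if p.name.startswith("snap-")
    )
    # Remove everything except the newest `keep` snapshots.
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return target


def restore_snapshot(snapshot: Path, repo: Path) -> None:
    """Replace `repo` with a copy of `snapshot` (the "in-place" recovery)."""
    damaged = repo.with_name(repo.name + ".damaged")
    if damaged.exists():
        shutil.rmtree(damaged)
    # Keep the corrupted tree around for later inspection.
    repo.rename(damaged)
    shutil.copytree(snapshot, repo, symlinks=True)
```

Something like take_snapshot() could be run from cron every few hours or
hooked into the existing push to anoncvs and svr1. Scenario (2) still
needs separate off-host copies, since these snapshots live on the same
VPS as the repo.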
Stefan