Re: need for in-place upgrades (was Re: State of - Mailing list pgsql-general

From Christopher Browne
Subject Re: need for in-place upgrades (was Re: State of
Date
Msg-id m31xuiwy28.fsf@wolfe.cbbrowne.com
In response to Re: need for in-place upgrades (was Re: State of Beta 2)  ("Marc G. Fournier" <scrappy@postgresql.org>)
List pgsql-general
In the last exciting episode, ron.l.johnson@cox.net (Ron Johnson) wrote:
> On Sun, 2003-09-14 at 14:17, Christopher Browne wrote:
>> <http://spectralogic.com> discusses how to use their hardware and
>> software products to do terabytes of backups in an hour.  They sell a
>> software product called "Alexandria" that knows how to (at least
>> somewhat) intelligently backup SAP R/3, Oracle, Informix, and Sybase
>> systems.  (When I was at American Airlines, that was the software in
>> use.)
>
> HP, Hitachi, and a number of other vendors make similar hardware.
>
> You mean the database vendors don't build that parallelism into
> their backup procedures?

They don't necessarily build every conceivable bit of possible
functionality into the backup procedures they provide, if that's what
you mean.

Of the systems mentioned, I'm most familiar with SAP's backup
regimen; if you're using it with Oracle, you'll use tools called
"brbackup" and "brarchive", which provide a _moderately_ sophisticated
scheme for dealing with backing things up.

But if you need to do something wild, involving two nearby servers,
each with 8 tape drives, that are used to manage backups for a whole
cluster of systems, including a combination of OS backups, DB
backups, and application backups, it's _not_ reasonable to expect one
DB vendor's backup tools to be totally adequate to that.

Alexandria (and similar software) certainly needs tool support from DB
makers in order to intelligently handle streaming the data out of the
databases.

At present, this unfortunately _isn't_ something PostgreSQL does, from
two perspectives:

 1.  You can't simply keep the WALs and reapply them in order to bring
     a second database up to date;

 2.  pg_dump doesn't provide a way of streaming parts of the
     database in parallel, at least not if all the data is in
     one database.  (There's some nifty stuff in eRServ that
     might eventually be relevant, but probably not yet...)

There are partial answers:

 - If there are multiple databases, starting multiple pg_dump
   sessions provides some useful parallelism;

 - A suitable logical volume manager may allow splitting off
   a copy atomically, and then you can grab the resulting data
   in "strips" to pull it in parallel.
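
As a rough sketch of the first partial answer, something like the
following would start one pg_dump session per database and run several
of them at once.  The database names, the output directory, and the
helper names (dump_command, dump_all) are made up for illustration;
the only pg_dump option assumed is -f for the output file.  Note that
each session takes its own snapshot, so this gives parallelism across
databases, not one consistent point in time across all of them.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_command(dbname, outdir):
    """Build the pg_dump invocation for one database (hypothetical helper)."""
    # -f names the output file; the database name is the final argument.
    return ["pg_dump", "-f", f"{outdir}/{dbname}.sql", dbname]


def dump_all(dbnames, outdir, max_workers=4, run=subprocess.run):
    """Run one pg_dump per database, at most max_workers at a time.

    Returns a dict mapping each database name to its exit status.
    The `run` parameter exists only so the launcher can be swapped
    out when testing without a PostgreSQL installation.
    """
    def dump_one(dbname):
        result = run(dump_command(dbname, outdir))
        return dbname, result.returncode

    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(dump_one, dbnames))


if __name__ == "__main__":
    # Hypothetical database names and backup directory.
    statuses = dump_all(["sales", "inventory", "hr"], "/var/backups/pg")
    for name, status in statuses.items():
        print(name, "ok" if status == 0 else f"failed ({status})")
```

With max_workers matched to the number of tape drives or disks you
can actually keep busy, the databases stream out side by side; a
single huge database still gets no help from this.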

Life isn't always perfect.

>> Generally, this involves having a bunch of tape drives that are
>> simultaneously streaming different parts of the backup.
>>
>> When it's Oracle that's in use, a common strategy involves
>> periodically doing a "hot" backup (so you can quickly get back to a
>> known database state), and then having a robot tape drive assigned
>> to regularly push archive logs to tape as they are produced.
>
> Rdb does the same thing.  You mean DB/2 can't/doesn't do that?

I haven't the foggiest idea, although I would be somewhat surprised if
it doesn't have something of the sort.
--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/wp.html
Rules of  the Evil Overlord #139. "If  I'm sitting in my  camp, hear a
twig  snap, start  to  investigate, then  encounter  a small  woodland
creature, I  will send out some scouts  anyway just to be  on the safe
side. (If they disappear into the foliage, I will not send out another
patrol; I will break out napalm and Agent Orange.)"
<http://www.eviloverlord.com/>
