Re: Inherited an 18TB DB & need to backup - Mailing list pgsql-general

From Ron
Subject Re: Inherited an 18TB DB & need to backup
Date
Msg-id 26c4b31e-063d-40c5-d74d-6b13918845c6@gmail.com
In response to Re: Inherited an 18TB DB & need to backup  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
List pgsql-general
On 5/16/20 3:30 PM, Gavin Flower wrote:
On 17/05/2020 08:12, Ron wrote:
On 5/16/20 7:18 AM, Rob Sargent wrote:
Another problem is that storage devices fail.  S3 storage lakes _should_ be checking your data integrity on a regular basis and possibly maintaining copies of it in multiple locations so you're not vulnerable to a site disaster.
Tape FTW!!
Or WTF Tape??   :)

Tape is durable, long-lasting, high-density, under your control, can be taken off-site (don't underestimate the bandwidth of a station wagon full of tapes hurtling down the highway!) and -- with the proper software -- is multi-threaded.

Don't you mean multi-spooled??? :-)

That's a superset of multi-threaded IO.

Fascinating problem.  If the dump & load programs are designed to take a parameter for N drives for effective parallel operation, and N > 2, then things will run a lot faster.

I can think of several ways the data can be dumped in parallel, with various trade-offs.  Would love to know how it's implemented in practice.

An OS with asynchronous, queued, non-blocking IO, and a programming language with callbacks.  OpenVMS has had it since at least the early 1990s, and probably the mid-1980s.  I remember backing up an Rdb/VMS database to 10 tape drives at the same time.  Typically, though, we "only" used six tape drives for that database, because we simultaneously backed up multiple databases.
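
To make the idea concrete, here is a rough sketch of one way a dump could feed N output devices from a single bounded queue. It is only an illustration, not how Rdb/VMS or any PostgreSQL tool actually does it. It uses Python's asyncio coroutines in place of OS-level callbacks, in-memory buffers stand in for tape drives, and every name in it (drive_writer, parallel_dump, run_demo, queue_depth) is made up for the example.

```python
import asyncio
import io


async def drive_writer(drive_id, queue, sink, catalog):
    """Drain blocks from the shared queue onto one output device."""
    while True:
        item = await queue.get()
        if item is None:              # sentinel: no more blocks for this drive
            return
        seq, block = item
        # The blocking write runs in a worker thread, so a slow drive
        # does not stall the others.
        await asyncio.to_thread(sink.write, block)
        # Record which drive received which block; a restore would need this.
        catalog.append((seq, drive_id, len(block)))


async def parallel_dump(blocks, sinks, queue_depth=8):
    """Feed blocks to one writer per sink through a bounded queue."""
    queue = asyncio.Queue(maxsize=queue_depth)   # bounded: the reader cannot run far ahead
    catalog = []
    writers = [
        asyncio.create_task(drive_writer(i, queue, sink, catalog))
        for i, sink in enumerate(sinks)
    ]
    for seq, block in enumerate(blocks):
        await queue.put((seq, block))            # waits while the queue is full
    for _ in sinks:
        await queue.put(None)                    # one sentinel per writer
    await asyncio.gather(*writers)
    return sorted(catalog)


def run_demo(n_drives=3, n_blocks=10, block_size=4):
    """Hypothetical demo: BytesIO buffers stand in for tape drives."""
    sinks = [io.BytesIO() for _ in range(n_drives)]
    blocks = [bytes([i]) * block_size for i in range(n_blocks)]
    catalog = asyncio.run(parallel_dump(blocks, sinks))
    return sinks, catalog


if __name__ == "__main__":
    sinks, catalog = run_demo()
    for seq, drive_id, size in catalog:
        print(f"block {seq} -> drive {drive_id} ({size} bytes)")
```

Whichever drive is free takes the next block, so a faster drive naturally absorbs more of the data.  The trade-off is that blocks land on the drives in no fixed pattern, so the catalog has to be kept to put them back in order on restore.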

--
Angular momentum makes the world go 'round.
