Re: > 16TB worth of data question - Mailing list pgsql-general
From:            Ron Johnson
Subject:         Re: > 16TB worth of data question
Date:
Msg-id:          1051596099.16230.16.camel@haggis
In response to:  Re: > 16TB worth of data question  ("scott.marlowe" <scott.marlowe@ihs.com>)
Responses:       Re: > 16TB worth of data question  ("Jim C. Nasby" <jim@nasby.net>)
List:            pgsql-general
On Mon, 2003-04-28 at 16:59, scott.marlowe wrote:
> On 28 Apr 2003, Ron Johnson wrote:
> [snip]
> > On Mon, 2003-04-28 at 10:42, scott.marlowe wrote:
> > > On 28 Apr 2003, Jeremiah Jahn wrote:
> > > >
> > > > On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > > > > Jeremiah Jahn wrote:
> > > > > >
> > > > > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > [snip]
> > > Don't shut it down and backup at file system level, leave it up, restrict
> > > access via pg_hba.conf if need be, and use pg_dump.  File system level
> > > backups are not the best way to go, although for quick recovery they can
> > > be added to full pg_dumps as an aid, but don't leave out the pg_dump,
> > > it's the way you're supposed to backup postgresql, and it can do so when
> > > the database is "hot and in use" and provide a consistent backup
> > > snapshot.
> >
> > What's the problem with doing a file-level backup of a *cold* database?
>
> There's no problem with doing it; the problem is that in order to get
> anything back you pretty much have to have all of it to make it work
> right, and any subtle problems of a partial copy might not be so obvious.
>
> Plus it sticks you to one major rev of the database.  Pulling out five
> year old copies of the base directory can involve a fair bit of work
> getting an older flavor of postgresql to run on a newer os.

Good point...

[snip]

> > The problem with pg_dump is that it's single-threaded, and it would take
> > a whole lotta time to back up 16TB using 1 tape drive...
>
> But, you can run pg_dump against individual databases or tables on the
> same postmaster, so you could theoretically write a script around pg_dump
> to dump the databases or large tables to different drives.  We back up
> our main server to our backup server that way, albeit with only one
> backup process at a time; since we can back up about a gig a minute,
> it's plenty fast for us.  If we needed to parallelize it, that would be
> pretty easy.

But pg doesn't guarantee internal consistency unless you pg_dump the
database in one command, "pg_dump db_name > db_yyyymmdd.dmp".  Thus, no
parallelism unless there are multiple databases; but if there's only 1
database...  (Rough sketches of both approaches follow below the sig.)

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net           |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| An ad currently being run by the NEA (the US's biggest     |
| public school TEACHERS UNION) asks a teenager if he can    |
| find sodium and *chloride* in the periodic table of the    |
| elements.                                                  |
| And they wonder why people think public schools suck...    |
+-----------------------------------------------------------+
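
On the "restrict access via pg_hba.conf" point above, a minimal sketch of
one way to do that around a hot pg_dump: swap in a restrictive pg_hba.conf
and signal the postmaster.  The $PGDATA path, file names, and database name
are all hypothetical, and the column layout shown in the comment is the
7.3-era pg_hba.conf format, so check the docs for your version:

    #!/bin/sh
    # Sketch: lock the cluster down to local connections while pg_dump
    # runs.  Paths and database name are hypothetical.  Note that
    # reloading pg_hba.conf does not disconnect already-connected clients.
    PGDATA=/var/lib/pgsql/data

    # pg_hba.conf.backup might contain just one 7.3-era line:
    #   local   all   trust

    cp $PGDATA/pg_hba.conf $PGDATA/pg_hba.conf.normal
    cp $PGDATA/pg_hba.conf.backup $PGDATA/pg_hba.conf
    pg_ctl reload -D $PGDATA   # SIGHUP the postmaster to reread pg_hba.conf

    pg_dump mydb > /backup/mydb_`date +%Y%m%d`.dmp

    # put the normal access rules back
    cp $PGDATA/pg_hba.conf.normal $PGDATA/pg_hba.conf
    pg_ctl reload -D $PGDATA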
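
And a minimal sketch of the per-database parallelism described above: one
pg_dump per database, each writing to a different drive.  Database names
and target paths are hypothetical; the comment spells out the consistency
caveat from the message:

    #!/bin/sh
    # Sketch: dump each database with its own pg_dump, in parallel, each
    # to a different drive.  Every pg_dump takes its own snapshot, so each
    # dump is internally consistent -- but the dumps are separate
    # snapshots, with no consistency guarantee *across* databases.  That
    # is exactly why this buys nothing when everything lives in a single
    # database.
    STAMP=`date +%Y%m%d`

    pg_dump sales   > /backup1/sales_$STAMP.dmp   &
    pg_dump orders  > /backup2/orders_$STAMP.dmp  &
    pg_dump archive > /backup3/archive_$STAMP.dmp &

    wait    # block until all three background dumps have finished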