Thread: > 16TB worth of data question

> 16TB worth of data question

From
Jeremiah Jahn
Date:
I have a system that will store about 2TB+ of images per year in a PG
database. Linux unfortunately has the 16TB limit for 32-bit systems. Not
really sure what should be done here. Would life be better if I didn't store
the images as BLOBs, and instead came up with some complicated way to only
store their location in the database? Or is there some way to have postgres
handle this? What are other people out there doing about this sort of thing?

thanx,
-jj-

--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Doug McNaught
Date:
Jeremiah Jahn <jeremiah@cs.earlham.edu> writes:

> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> really sure what should be done here. Would life better to not store the
> images as BLOBS, and instead come up with some complicated way to only
> store the location in the database, or is there someway to have postgres
> handle this somehow? What are other people out there doing about this
> sort of thing?

Sounds to me as if you can ignore it for now.  16TB will last you at
least four years, at which time you'll be replacing hardware anyway
and can just buy 64-bit systems.

:)

-Doug

Re: > 16TB worth of data question

From
Neil Conway
Date:
On Mon, 2003-04-21 at 14:23, Jeremiah Jahn wrote:
> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems.

I'm not sure why the 16TB limit you refer to would be a problem in the
short to medium term, if you're only storing 2TB/year.

> Would life better to not store the
> images as BLOBS, and instead come up with some complicated way to only
> store the location in the database

FWIW, storing the FS location in the database is by no means
complicated. I'm personally not very fond of doing that, but not because
it's overly complex...

Cheers,

Neil


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
Notice the plus sign in that 2TB+. I have reason to believe that I could
hit that 16TB mark in a little over 2 years. What do you think the chances
are of Linux 2.6 stabilizing in that time frame.. ;) I'm really just
questioning the use of BLOBs. It would be nice if they could be spread over
multiple file systems.


-jj-

On Mon, 2003-04-21 at 13:28, Doug McNaught wrote:
> Jeremiah Jahn <jeremiah@cs.earlham.edu> writes:
>
> > I have a system that will store about 2TB+ of images per year in a PG
> > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > really sure what should be done here. Would life better to not store the
> > images as BLOBS, and instead come up with some complicated way to only
> > store the location in the database, or is there someway to have postgres
> > handle this somehow? What are other people out there doing about this
> > sort of thing?
>
> Sounds to me as if you can ignore it for now.  16TB will last you at
> least four years, at which time you'll be replacing hardware anyway
> and can just buy 64-bit systems.
>
> :)
>
> -Doug
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Paul Ramsey
Date:
Neil Conway wrote:
>
>>Would life better to not store the
>>images as BLOBS, and instead come up with some complicated way to only
>>store the location in the database
>
> FWIW, storing the FS location in the database is by no means
> complicated. I'm personally not very fond of doing that, but not because
> it's overly complex...

Heck, is the Pg BLOB interface anything more than a cute hook into
filesystem storage? I've often wondered what the point of BLOBs is,
beyond providing a single API to all the data (maybe that *is* the
point). Is there a performance advantage?

--
       __
      /
      | Paul Ramsey
      | Refractions Research
      | Email: pramsey@refractions.net
      \_


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
Sorry, forgot a little info. A few years from now, I'd still like to be
able to use my old servers for a bit of load balancing.

-jj-


On Mon, 2003-04-21 at 13:38, Neil Conway wrote:
> On Mon, 2003-04-21 at 14:23, Jeremiah Jahn wrote:
> > I have a system that will store about 2TB+ of images per year in a PG
> > database. Linux unfortunatly has the 16TB limit for 32bit systems.
>
> I'm not sure why the 16TB limit you refer to would be a problem in the
> short to medium term, if you're only storing 2TB/year.
>
> > Would life better to not store the
> > images as BLOBS, and instead come up with some complicated way to only
> > store the location in the database
>
> FWIW, storing the FS location in the database is by no means
> complicated. I'm personally not very fond of doing that, but not because
> it's overly complex...
>
> Cheers,
>
> Neil
>
>
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Doug McNaught
Date:
Paul Ramsey <pramsey@refractions.net> writes:

> Heck, is the Pg BLOB interface anything more than a cute hook into
> filesystem storage?

Yes; it's organized as a table and storage-managed the same way as
other tables.

> I've often wondered what the point of BLOBs is,
> beyond providing a single API to all the data (maybe that *is* the
> point).

That, plus when you back up the database you're backing up everything
you need, and it's harder for the database and the 'filesystem' to get
out of sync (you can use triggers etc. to enforce it, and easily run
queries to check that you don't have any dangling references).
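
For illustration, such a check is only a few lines. This is just a sketch,
not something shipped with Postgres: it assumes the image rows reference
their large objects through an OID column named img_oid, and it uses
psycopg2 to run the queries -- all of those names are hypothetical.
(Reading pg_largeobject directly may also need superuser rights on newer
versions.)

    import psycopg2

    conn = psycopg2.connect("dbname=imagedb")   # connection details assumed
    cur = conn.cursor()

    # Large objects that no image row references any more (orphaned data).
    cur.execute("""
        SELECT DISTINCT loid FROM pg_largeobject
        EXCEPT
        SELECT img_oid FROM images
    """)
    print("orphaned large objects:", [r[0] for r in cur.fetchall()])

    # Image rows whose large object has disappeared (dangling references).
    cur.execute("""
        SELECT img_oid FROM images
        EXCEPT
        SELECT DISTINCT loid FROM pg_largeobject
    """)
    print("rows pointing at missing large objects:",
          [r[0] for r in cur.fetchall()])

    cur.close()
    conn.close()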

> Is there a performance advantage?

Slight disadvantage, I should think (unless the individual-file
approach stupidly puts them all in one directory, which can be very
slow on some systems).

-Doug

Re: > 16TB worth of data question

From
"Magnus Naeslund(f)"
Date:
Jeremiah Jahn <jeremiah@cs.earlham.edu> wrote:
> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> really sure what should be done here. Would life better to not store
> the images as BLOBS, and instead come up with some complicated way to
> only store the location in the database, or is there someway to have
> postgres handle this somehow? What are other people out there doing
> about this sort of thing?
>

I have some experience in this matter.
We've made a system that contains about 5 million images today, and it's
about 2.5 TB. Our system works by keeping the metadata in a postgresql
database so we can do complex searches, and it only stores the filename
in _relative_ form, like this:

11/2/33

and then the application server handling requests maps it into a real
file:

/storage/links/11/2/33.jpg

The links directory is just an indirection from the real storage
location, so "11" in this case can be /storage/nfs/server1/11.
Flexibility is the big win here. I can split partitions up any way I
want and I don't have to care about any limits, except maybe the
"links" directory getting too big. But since this is our application,
we know that when the database is complete (20 million images+) it will
still have only 4,000 directory entries in "links".
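
A minimal sketch of what that looks like, in case it helps -- the table,
column and path names are invented for illustration, and psycopg2 is just
one way to talk to the database; this is not our production code:

    import os
    import psycopg2

    LINKS_ROOT = "/storage/links"   # every top-level entry is a symlink to the
                                    # real storage, e.g. 11 -> /storage/nfs/server1/11

    # Metadata lives in Postgres; only the *relative* name is stored.
    #   CREATE TABLE images (
    #       id       serial PRIMARY KEY,
    #       rel_path text NOT NULL,   -- e.g. '11/2/33'
    #       ...whatever the complex searches need...
    #   );

    def resolve(rel_path):
        """Map a relative name like '11/2/33' to the real file behind links/."""
        return os.path.join(LINKS_ROOT, rel_path + ".jpg")

    def fetch_image(conn, image_id):
        """Look up the relative path in the database, then read the file itself."""
        cur = conn.cursor()
        cur.execute("SELECT rel_path FROM images WHERE id = %s", (image_id,))
        row = cur.fetchone()
        cur.close()
        if row is None:
            raise KeyError(image_id)
        with open(resolve(row[0]), "rb") as f:
            return f.read()

Splitting or moving storage then only means re-pointing the symlinks under
the links directory; the rows in the database never change.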

The blob stuff also sucks from other points of view.
It makes the database less manageable, it messes with lazy mirroring
backups of the images (we use rsync), and the database dumps must be
huge? (I don't know how blobs are handled in pg_dump.)

If it's necessary to be able to handle this via postgresql, I would write a
server-side function that fetches the image from the filesystem and sends
it over, maybe using the bytea encoding?

I wouldn't go for the all-in-database approach ever in anything big.

Cheers
Magnus


Re: > 16TB worth of data question

From
"scott.marlowe"
Date:
On 21 Apr 2003, Jeremiah Jahn wrote:

> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> really sure what should be done here. Would life better to not store the
> images as BLOBS, and instead come up with some complicated way to only
> store the location in the database, or is there someway to have postgres
> handle this somehow? What are other people out there doing about this
> sort of thing?

Then why not start right out on 64-bit systems?  Low-end 64-bit Sparcs
(Ultra 60 type stuff) aren't too expensive, and Debian and a few other
flavors seem to run quite quickly on Sparc hardware.

There's also 64-bit mainframe Linux, and a couple of other 64-bit platforms
that are fairly mature; IBM's PowerPC-based systems run Linux as well.

If you're gonna play with big datasets, that's the one time that 64-bit
really starts to have advantages, and let's face it, you're gonna go there
eventually anyway, so you might as well get a head start now.


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
The only issue with this is that it is difficult to recommend to our
clients, who depend on Bob and cuz'n Joe to support their hardware. We
are kinda in the business of recommending the HW our clients need, and
not getting into the support side of it. Although this might be a decent
option.


thanx,
-jj-

PS. My office already has this whole black and silver motif going on, and
purpley blue would kinda clash.



On Mon, 2003-04-21 at 15:30, scott.marlowe wrote:
> On 21 Apr 2003, Jeremiah Jahn wrote:
>
> > I have a system that will store about 2TB+ of images per year in a PG
> > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > really sure what should be done here. Would life better to not store the
> > images as BLOBS, and instead come up with some complicated way to only
> > store the location in the database, or is there someway to have postgres
> > handle this somehow? What are other people out there doing about this
> > sort of thing?
>
> Then why not start right out on 64 bit systems?  Low end 64 bit sparcs
> Ultra 60 type stuff) aren't too expensive, and debian and a few other
> flavors seem to run quite quickly on Sparc hardware.
>
> There's also 64 mainframe linux, and a couple of other 64 bit platforms
> that are fairly mature, IBM's Power Pc based systems run Linux as well.
>
> If you're gonna play with big datasets, that's the one time that 64 bit
> really starts to have advantages, and let's face it, you're gonna go there
> eventually anyway, might as well get a head start now.
>
>
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
"scott.marlowe"
Date:
Shame, there's an E10k with 64 CPUs selling for $24,000 with no opening
bid on eBay...  Ends in 10.

http://listings.ebay.com/pool2/plistings/endtoday/all/category41080/index.html?from=R11

Ummmmmmm 64 CPUs...

On 21 Apr 2003, Jeremiah Jahn wrote:

> The only issue with this is that it is difficult to recomend to our
> clients who depend on bob and cuz'n joe to support their hardware. We
> are kinda in the business of recomending the HW our clients need, and
> not getting into the support side of it. Althoguh this might be a decent
> option.
>
>
> thanx,
> -jj-
>
> PS. My office already this whole black and silver motif going on, and
> purplely blue would kinda clash.
>
>
>
> On Mon, 2003-04-21 at 15:30, scott.marlowe wrote:
> > On 21 Apr 2003, Jeremiah Jahn wrote:
> >
> > > I have a system that will store about 2TB+ of images per year in a PG
> > > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > > really sure what should be done here. Would life better to not store the
> > > images as BLOBS, and instead come up with some complicated way to only
> > > store the location in the database, or is there someway to have postgres
> > > handle this somehow? What are other people out there doing about this
> > > sort of thing?
> >
> > Then why not start right out on 64 bit systems?  Low end 64 bit sparcs
> > Ultra 60 type stuff) aren't too expensive, and debian and a few other
> > flavors seem to run quite quickly on Sparc hardware.
> >
> > There's also 64 mainframe linux, and a couple of other 64 bit platforms
> > that are fairly mature, IBM's Power Pc based systems run Linux as well.
> >
> > If you're gonna play with big datasets, that's the one time that 64 bit
> > really starts to have advantages, and let's face it, you're gonna go there
> > eventually anyway, might as well get a head start now.
> >
> >
>


Re: > 16TB worth of data question

From
Tom Lane
Date:
Jeremiah Jahn <jeremiah@cs.earlham.edu> writes:
> The only issue with this is that it is difficult to recomend to our
> clients who depend on bob and cuz'n joe to support their hardware.

And you expect them to be successful running a database that acquires
2TB+ of data per year?  I think you need to recalibrate your
expectations.  Consumer-grade junk PCs do not have the reliability
to make such a project even worth starting.  Run the database on decent
made-to-be-a-server hardware, or you'll regret it.

I think I've spent more time chasing various people's hardware failures
lately than I have in investigating real Postgres bugs.  I keep
volunteering to look at failures because I figure there are still some
data-loss bugs to be found, but I am coming to have a *real* low opinion
of off-the-shelf PC hardware.

            regards, tom lane


Re: > 16TB worth of data question

From
"Jim C. Nasby"
Date:
FWIW, FreeBSD's UFS filesystem doesn't suffer from the 2G limit. Its
limit is something like 4TB, IIRC.

I'll also second the opinion that if you're really going to put that
much data in your database, x86 hardware might not be a good idea.
However, I'd recommend RS/6000 over Sun if you can afford it; for
Oracle, the rule of thumb is that you only need half the CPUs in an
RS/6000 that you would in a Sun, because the RS/6000 puts heavy emphasis
on memory bandwidth.

On Mon, Apr 21, 2003 at 01:23:55PM -0500, Jeremiah Jahn wrote:
> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> really sure what should be done here. Would life better to not store the
> images as BLOBS, and instead come up with some complicated way to only
> store the location in the database, or is there someway to have postgres
> handle this somehow? What are other people out there doing about this
> sort of thing?
>
> thanx,
> -jj-
>
> --
> Jeremiah Jahn <jeremiah@cs.earlham.edu>
>
>
>

--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
On Mon, 2003-04-21 at 20:43, Tom Lane wrote:
> Jeremiah Jahn <jeremiah@cs.earlham.edu> writes:
> > The only issue with this is that it is difficult to recomend to our
> > clients who depend on bob and cuz'n joe to support their hardware.
>
> And you expect them to be successful running a database that acquires
> 2TB+ of data per year?  I think you need to recalibrate your
> expectations.  Consumer-grade junk PCs do not have the reliability
> to make such a project even worth starting.  Run the database on decent
> made-to-be-a-server hardware, or you'll regret it.
That's the question... That 2 TB of data is nothing but documents and
images. I'm under the impression that if that gets parked on a Fibre
Channel disk array/SAN, the data will be pretty safe and the server
mostly replaceable at that point. Storage is my worry more than
processing power. I don't think I'm on crack here...?


>
> I think I've spent more time chasing various people's hardware failures
> lately than I have in investigating real Postgres bugs.  I keep
> volunteering to look at failures because I figure there are still some
> data-loss bugs to be found, but I am coming to have a *real* low opinion
> of off-the-shelf PC hardware.
>
>             regards, tom lane
>
>
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Lincoln Yeoh
Date:
At 08:13 AM 4/22/2003 -0500, Jeremiah Jahn wrote:
>That's the question...That 2 TB of data is nothing but documents and
>images. I'm under the perception that if that gets parked on a fibre
>channel disk array/ SAN the data will be pretty safe, and the server
>mostly replaceable at that time. Storage is my worry more than
>processing power. I don't think I'm on crack here...?

How about backups? Backing up 2-16TB needs a bit more planning and design.

> > I think I've spent more time chasing various people's hardware failures
> > lately than I have in investigating real Postgres bugs.  I keep
> > volunteering to look at failures because I figure there are still some
> > data-loss bugs to be found, but I am coming to have a *real* low opinion
> > of off-the-shelf PC hardware.
> >
> >                       regards, tom lane

Well there's off-the-shelf x86 desktop PC hardware, and there's
off-the-shelf x86 server hardware.

In my experience, for name brands it's about 1 in 7 DOA for the former, and 0
for the latter. There's a big difference in dependability. The latter tend to
keep running for many years, typically being retired through no fault of their own.

Then there are users who stick no-name memory into their servers or "servers".

Regards,
Link.

Re: > 16TB worth of data question

From
Jonathan Bartlett
Date:
> > And you expect them to be successful running a database that acquires
> > 2TB+ of data per year?  I think you need to recalibrate your
> > expectations.  Consumer-grade junk PCs do not have the reliability
> > to make such a project even worth starting.  Run the database on decent
> > made-to-be-a-server hardware, or you'll regret it.
> That's the question...That 2 TB of data is nothing but documents and
> images. I'm under the perception that if that gets parked on a fibre
> channel disk array/ SAN the data will be pretty safe, and the server
> mostly replaceable at that time. Storage is my worry more than
> processing power. I don't think I'm on crack here...?

Actually, true server-grade machines usually have _less_ processing power
than PCs.  Why?  Because the attention is paid to reliability, not
speed.  Server-class (i.e. name-brand, ECC) memory is a must.
Previous-generation processors are a must, and someone who knows something
about server hardware putting it together is a must.  If you don't have
these things, you are bound for server lockups, memory corruption, and all
sorts of other evils.

Microsoft gets a lot of blame for the failures of the PC, but in reality,
a lot of them are hardware-related.

Jon

>
>
> >
> > I think I've spent more time chasing various people's hardware failures
> > lately than I have in investigating real Postgres bugs.  I keep
> > volunteering to look at failures because I figure there are still some
> > data-loss bugs to be found, but I am coming to have a *real* low opinion
> > of off-the-shelf PC hardware.
> >
> >             regards, tom lane
> >
> >
> --
> Jeremiah Jahn <jeremiah@cs.earlham.edu>
>
>
>


Re: > 16TB worth of data question

From
Jonathan Bartlett
Date:
> FWIW, FreeBSD's UFS filesystem doesn't suffer from the 2G limit. It's
> limit is something like 4TB iirc.

If the data itself is large (i.e. large images, not just a large number of
images), I'd go with an XFS filesystem.  It has a 5 petabyte limit, and
for large sequential reads, it runs at nearly full disk speed.

Jon

>
> I'll also second the opinion that if you're really going to put that
> much data in your database, x86 hardware might not be a good idea.
> However, I'd recommend RS/6000 over Sun if you can afford it; for
> Oracle, the standard is that you only need 1/2 the CPUs in RS/6000 as
> you would in a Sun, because RS/6000 puts heavy emphasis on memory
> bandwidth.
>
> On Mon, Apr 21, 2003 at 01:23:55PM -0500, Jeremiah Jahn wrote:
> > I have a system that will store about 2TB+ of images per year in a PG
> > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > really sure what should be done here. Would life better to not store the
> > images as BLOBS, and instead come up with some complicated way to only
> > store the location in the database, or is there someway to have postgres
> > handle this somehow? What are other people out there doing about this
> > sort of thing?
> >
> > thanx,
> > -jj-
> >
> > --
> > Jeremiah Jahn <jeremiah@cs.earlham.edu>
> >
> >
> >
>
> --
> Jim C. Nasby (aka Decibel!)                    jim@nasby.net
> Member: Triangle Fraternity, Sports Car Club of America
> Give your computer some brain candy! www.distributed.net Team #1828
>
> Windows: "Where do you want to go today?"
> Linux: "Where do you want to go tomorrow?"
> FreeBSD: "Are you guys coming, or what?"
>
>
>


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> At 08:13 AM 4/22/2003 -0500, Jeremiah Jahn wrote:
> >That's the question...That 2 TB of data is nothing but documents and
> >images. I'm under the perception that if that gets parked on a fibre
> >channel disk array/ SAN the data will be pretty safe, and the server
> >mostly replaceable at that time. Storage is my worry more than
> >processing power. I don't think I'm on crack here...?
>
> How about backups? Backing up 2-16TB needs a bit more planning and design.
The proposed solution here is to have the RAID controller mirror across
the street to a similar system. At what point do offline backups become
pointless?

>
> > > I think I've spent more time chasing various people's hardware failures
> > > lately than I have in investigating real Postgres bugs.  I keep
> > > volunteering to look at failures because I figure there are still some
> > > data-loss bugs to be found, but I am coming to have a *real* low opinion
> > > of off-the-shelf PC hardware.
> > >
> > >                       regards, tom lane
>
> Well there's off-the-shelf x86 desktop PC hardware, and there's
> off-the-shelf x86 server hardware.
>
> In my experience for name brands it's about 1/7 DOA for the former, and 0
> for the latter. There's a big diff in dependability. The latter tend to
> keep running for many years, typically being retired for no fault of their own.
>
> Then there are users who stick no name memory into their servers or "servers".
>
> Regards,
> Link.
>
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
JFS is the same; however, all Linux filesystems are limited to 16TB on
32-bit hardware, due to the page caching code. That's with the 2.5 kernel;
2.4 is 4TB, IIRC.


On Tue, 2003-04-22 at 11:07, Jonathan Bartlett wrote:
> > FWIW, FreeBSD's UFS filesystem doesn't suffer from the 2G limit. It's
> > limit is something like 4TB iirc.
>
> If the data itself is large (i.e. - large images not just large # of
> images), I'd go with an XFS filesystem.  It has a 5 petabyte limit, and
> for large sequential reads, it has almost full-disk speed.
>
> Jon
>
> >
> > I'll also second the opinion that if you're really going to put that
> > much data in your database, x86 hardware might not be a good idea.
> > However, I'd recommend RS/6000 over Sun if you can afford it; for
> > Oracle, the standard is that you only need 1/2 the CPUs in RS/6000 as
> > you would in a Sun, because RS/6000 puts heavy emphasis on memory
> > bandwidth.
> >
> > On Mon, Apr 21, 2003 at 01:23:55PM -0500, Jeremiah Jahn wrote:
> > > I have a system that will store about 2TB+ of images per year in a PG
> > > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > > really sure what should be done here. Would life better to not store the
> > > images as BLOBS, and instead come up with some complicated way to only
> > > store the location in the database, or is there someway to have postgres
> > > handle this somehow? What are other people out there doing about this
> > > sort of thing?
> > >
> > > thanx,
> > > -jj-
> > >
> > > --
> > > Jeremiah Jahn <jeremiah@cs.earlham.edu>
> > >
> > >
> > >
> >
> > --
> > Jim C. Nasby (aka Decibel!)                    jim@nasby.net
> > Member: Triangle Fraternity, Sports Car Club of America
> > Give your computer some brain candy! www.distributed.net Team #1828
> >
> > Windows: "Where do you want to go today?"
> > Linux: "Where do you want to go tomorrow?"
> > FreeBSD: "Are you guys coming, or what?"
> >
> >
> >
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Philip Hallstrom
Date:
> On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > At 08:13 AM 4/22/2003 -0500, Jeremiah Jahn wrote:
> > >That's the question...That 2 TB of data is nothing but documents and
> > >images. I'm under the perception that if that gets parked on a fibre
> > >channel disk array/ SAN the data will be pretty safe, and the server
> > >mostly replaceable at that time. Storage is my worry more than
> > >processing power. I don't think I'm on crack here...?
> >
> > How about backups? Backing up 2-16TB needs a bit more planning and design.
> The proposed solution here is to have the raid controller mirror accross
> the street to a similar system. At what point do off line backups become
> pointless?

I've always been told to ask the question -- what's your data worth to
you or how much $ will you lose if you lose your data?  That should answer
the question on offline backups.

-philip


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
On Tue, 2003-04-22 at 11:47, Philip Hallstrom wrote:
> > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > > At 08:13 AM 4/22/2003 -0500, Jeremiah Jahn wrote:
> > > >That's the question...That 2 TB of data is nothing but documents and
> > > >images. I'm under the perception that if that gets parked on a fibre
> > > >channel disk array/ SAN the data will be pretty safe, and the server
> > > >mostly replaceable at that time. Storage is my worry more than
> > > >processing power. I don't think I'm on crack here...?
> > >
> > > How about backups? Backing up 2-16TB needs a bit more planning and design.
> > The proposed solution here is to have the raid controller mirror accross
> > the street to a similar system. At what point do off line backups become
> > pointless?
>
> I've always been told to ask the question -- what's your data worth to
> you or how much $ will you lose if you lose your data?  That should answer
> the question on offline backups.
>
> -philip

It's for the court system, so not much really. It's basically a
convenience only.

--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
Dennis Gearon
Date:
One thing to think of is how much your REPUTATION is worth. If everything you touch turns to gold and keeps working,
raises and better equipment are a LOT easier to get. OTOH................

Jeremiah Jahn wrote:
> On Tue, 2003-04-22 at 11:47, Philip Hallstrom wrote:
>
>>>On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
>>>
>>>>At 08:13 AM 4/22/2003 -0500, Jeremiah Jahn wrote:
>>>>
>>>>>That's the question...That 2 TB of data is nothing but documents and
>>>>>images. I'm under the perception that if that gets parked on a fibre
>>>>>channel disk array/ SAN the data will be pretty safe, and the server
>>>>>mostly replaceable at that time. Storage is my worry more than
>>>>>processing power. I don't think I'm on crack here...?
>>>>
>>>>How about backups? Backing up 2-16TB needs a bit more planning and design.
>>>
>>>The proposed solution here is to have the raid controller mirror accross
>>>the street to a similar system. At what point do off line backups become
>>>pointless?
>>
>>I've always been told to ask the question -- what's your data worth to
>>you or how much $ will you lose if you lose your data?  That should answer
>>the question on offline backups.
>>
>>-philip
>
>
> It's for the court system, so not much really. It basically a convience
> only.
>


Re: > 16TB worth of data question

From
Sean Chittenden
Date:
> FWIW, FreeBSD's UFS filesystem doesn't suffer from the 2G
> limit. It's limit is something like 4TB iirc.

Actually, on FreeBSD >5.X, you have UFS2 which shatters the 4TB limit.
Now you're limited to 16,777,216TB per file system.  Sorry it's not
more.  In truth, I wish that limit was higher as I seem to run into it
on a daily basis.  ::grumps::         8-]

> I'll also second the opinion that if you're really going to put that
> much data in your database, x86 hardware might not be a good idea.
> However, I'd recommend RS/6000 over Sun if you can afford it;

There really is something to be said for the architecture of PowerPC's
memory access routines and its scalability... though quirky to build
open source stuff on, AIX is still the grand daddy of *NIXes for this
very reason.

-sc

--
Sean Chittenden


Re: > 16TB worth of data question

From
Jonathan Bartlett
Date:
> > How about backups? Backing up 2-16TB needs a bit more planning and design.
> The proposed solution here is to have the raid controller mirror accross
> the street to a similar system. At what point do off line backups become
> pointless?

This does not help if someone rm -R's the whole thing.

If you don't think that's possible, it happened to Travelocity one day.
The DBA thought that a tablespace was no longer in use and rm'd it.  Ooops.
Entire database down and the tablespace had to be restored from backup.


>
> >
> > > > I think I've spent more time chasing various people's hardware failures
> > > > lately than I have in investigating real Postgres bugs.  I keep
> > > > volunteering to look at failures because I figure there are still some
> > > > data-loss bugs to be found, but I am coming to have a *real* low opinion
> > > > of off-the-shelf PC hardware.
> > > >
> > > >                       regards, tom lane
> >
> > Well there's off-the-shelf x86 desktop PC hardware, and there's
> > off-the-shelf x86 server hardware.
> >
> > In my experience for name brands it's about 1/7 DOA for the former, and 0
> > for the latter. There's a big diff in dependability. The latter tend to
> > keep running for many years, typically being retired for no fault of their own.
> >
> > Then there are users who stick no name memory into their servers or "servers".
> >
> > Regards,
> > Link.
> >
> --
> Jeremiah Jahn <jeremiah@cs.earlham.edu>
>
>
>


Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Ron Johnson
Date:
On Mon, 2003-04-21 at 13:23, Jeremiah Jahn wrote:
> I have a system that will store about 2TB+ of images per year in a PG
> database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> really sure what should be done here. Would life better to not store the
> images as BLOBS, and instead come up with some complicated way to only
> store the location in the database, or is there someway to have postgres
> handle this somehow? What are other people out there doing about this
> sort of thing?

Now that the hard disk and file system issues have been hashed around,
have you thought about how you are going to back up this much data?

You'll need multiple modern tape drives (like SuperDLT2, which can
store 160GB *raw* per tape), and one SCSI controller per drive, so
you don't starve each drive.

You might also think about daisy-chaining a set of mini-libraries.

Then, for performance, you'd also need multiple 66MHz x 64-bit PCI
bridges so each set of 6 PCI slots doesn't get bandwidth-constrained.
If PCI-X is shipping, I'd look at that.

The bottom line is that this is going to be a very big box, or
a lot of mid-sized boxes.
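
A quick back-of-the-envelope, just to put numbers on it -- the per-tape
capacity is the 160GB raw figure above, while the drive throughput and the
backup window are illustrative assumptions:

    data_tb           = 16      # total data to back up
    tape_capacity_gb  = 160     # SuperDLT2, raw/uncompressed (from above)
    drive_mb_per_sec  = 16      # assumed native drive throughput
    backup_window_hrs = 24      # assumed time allowed for a full backup

    data_gb = data_tb * 1024
    tapes   = -(-data_gb // tape_capacity_gb)                       # ceiling
    gb_per_drive = drive_mb_per_sec * 3600 * backup_window_hrs / 1024.0
    drives  = -(-data_gb // int(gb_per_drive))                      # ceiling

    print(tapes, "tapes,", drives, "drives for a", backup_window_hrs, "hour window")
    # -> roughly 103 tapes and about 13 drives with these assumptions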

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+


Re: > 16TB worth of data question

From
Andrew Sullivan
Date:
On Tue, Apr 22, 2003 at 08:13:50AM -0500, Jeremiah Jahn wrote:

> That's the question...That 2 TB of data is nothing but documents and
> images. I'm under the perception that if that gets parked on a fibre
> channel disk array/ SAN the data will be pretty safe, and the server
> mostly replaceable at that time. Storage is my worry more than
> processing power. I don't think I'm on crack here...?

"Processing power" isn't automatically what server-class hardware
will get you.

You need good ECC memory.  You need good disk controllers.  You need
to make sure that random flakey garbage hardware will not write
random flakey garbage all over your high quality disk array.  Speed
may not be an issue, but reliability really is.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: > 16TB worth of data question

From
Jan Wieck
Date:
Jeremiah Jahn wrote:
>
> On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > How about backups? Backing up 2-16TB needs a bit more planning and design.
> The proposed solution here is to have the raid controller mirror accross
> the street to a similar system. At what point do off line backups become
> pointless?

Fine, so you have a 10 MBit link across the "street" in the budget but
nothing to buy decent server hardware? 4TB per year (just to be around
where you expect to be after 6 months or so) means an average of about
one MBit per second ... on second thought, the 10 MBit will probably not
be enough to satisfy peak times.

Who made that mirror proposal, and where is your PITR-capable backup? I
once saved a company by doing PITR. They ran an Oracle database on two
software-mirrored RAID5 systems ... and the application was
misconfigured and deleted too much data (way too much).


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Jan Wieck
Date:
Ron Johnson wrote:
>
> On Mon, 2003-04-21 at 13:23, Jeremiah Jahn wrote:
> > I have a system that will store about 2TB+ of images per year in a PG
> > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > really sure what should be done here. Would life better to not store the
> > images as BLOBS, and instead come up with some complicated way to only
> > store the location in the database, or is there someway to have postgres
> > handle this somehow? What are other people out there doing about this
> > sort of thing?
>
> Now that the hard disk and file system issues have been hashed around,
> have you thought about how you are going to back up this much data?

Legato showed a couple of years ago that Networker can back up
more than a terabyte per hour. They used an RS/6000 with over 100 disks
and 36 DLT 7000 drives on 16 controllers, if I recall correctly ... not
your average backup solution, but it's possible. But I doubt one can
configure something like this with x86 hardware.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
"scott.marlowe"
Date:
On Fri, 25 Apr 2003, Jan Wieck wrote:

> Ron Johnson wrote:
> >
> > On Mon, 2003-04-21 at 13:23, Jeremiah Jahn wrote:
> > > I have a system that will store about 2TB+ of images per year in a PG
> > > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > > really sure what should be done here. Would life better to not store the
> > > images as BLOBS, and instead come up with some complicated way to only
> > > store the location in the database, or is there someway to have postgres
> > > handle this somehow? What are other people out there doing about this
> > > sort of thing?
> >
> > Now that the hard disk and file system issues have been hashed around,
> > have you thought about how you are going to back up this much data?
>
> Legato had shown a couple years ago already that Networker can backup
> more than a Terabyte per hour. They used an RS6000 with over 100 disks
> and 36 DLT 7000 drives on 16 controllers if I recall correctly ... not
> your average backup solution but it's possible. But I doubt one can
> configure something like this with x86 hardware.

I'm sure you could, but it might well involve 12 PII-350's running a trio
of DLTs each, with a RAID array for caching. :-)


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Jan Wieck
Date:
"scott.marlowe" wrote:
>
> On Fri, 25 Apr 2003, Jan Wieck wrote:
>
> > Ron Johnson wrote:
> > >
> > > On Mon, 2003-04-21 at 13:23, Jeremiah Jahn wrote:
> > > > I have a system that will store about 2TB+ of images per year in a PG
> > > > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > > > really sure what should be done here. Would life better to not store the
> > > > images as BLOBS, and instead come up with some complicated way to only
> > > > store the location in the database, or is there someway to have postgres
> > > > handle this somehow? What are other people out there doing about this
> > > > sort of thing?
> > >
> > > Now that the hard disk and file system issues have been hashed around,
> > > have you thought about how you are going to back up this much data?
> >
> > Legato had shown a couple years ago already that Networker can backup
> > more than a Terabyte per hour. They used an RS6000 with over 100 disks
> > and 36 DLT 7000 drives on 16 controllers if I recall correctly ... not
> > your average backup solution but it's possible. But I doubt one can
> > configure something like this with x86 hardware.
>
> I'm sure you could, but it might well involve 12 PII-350's running a trio
> of DLTs each, with a RAID array for caching. :-)

How many bits per second is a terabyte per hour? One TB is 8 TBit (plus
protocol overhead and so on and so forth, but let's ignore that for this
discussion), so we're talking about at least 8,000 GBit ... and one hour
has 3.6 kiloseconds ... making more than 2 GBit per second ... so that's
3 of these PCs per Gigabit segment, and the DB server has a 4x
Gigabit card just for that, cool!
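
Or, spelled out with the same simplification (decimal units, no protocol
overhead):

    tb_per_hour  = 1.0
    bits         = tb_per_hour * 8 * 1000**4    # 1 TB = 8,000 Gbit (decimal)
    gbit_per_sec = bits / 3600 / 1000**3
    print(round(gbit_per_sec, 2))               # -> about 2.22 Gbit/s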

Of course, that assumes we want to back up the total 24 terabytes he'll have
in 2-3 years in less than a day; if we have a month to take a backup, we can
save some money on the backup solution.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Ron Johnson
Date:
On Fri, 2003-04-25 at 17:20, Jan Wieck wrote:
> Ron Johnson wrote:
> >
> > On Mon, 2003-04-21 at 13:23, Jeremiah Jahn wrote:
> > > I have a system that will store about 2TB+ of images per year in a PG
> > > database. Linux unfortunatly has the 16TB limit for 32bit systems. Not
> > > really sure what should be done here. Would life better to not store the
> > > images as BLOBS, and instead come up with some complicated way to only
> > > store the location in the database, or is there someway to have postgres
> > > handle this somehow? What are other people out there doing about this
> > > sort of thing?
> >
> > Now that the hard disk and file system issues have been hashed around,
> > have you thought about how you are going to back up this much data?
>
> Legato had shown a couple years ago already that Networker can backup
> more than a Terabyte per hour. They used an RS6000 with over 100 disks
> and 36 DLT 7000 drives on 16 controllers if I recall correctly ... not
> your average backup solution but it's possible. But I doubt one can
> configure something like this with x86 hardware.

And we're running something similar, though not quite as extravagant,
on an AlphaServer GS320.  SuperDLT 320 is *really* fast...  (But you
gotta keep them fed, and that's where x86 might choke.)

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Lincoln Yeoh
Date:
At 07:32 PM 4/25/2003 -0400, Jan Wieck wrote:
>Of course, assuming we want to backup the total 24 Terabyte he has in
>2-3 years in less than a day, if we have a month to take a backup we can
>save some money on the backup solution.

If you take the month approach, the filesystem/db snapshot has to stay in
place for a full month whilst the backup is occurring. Doable, but it can be
stressful, e.g. if something goes wrong after trying for a full month...

But anyway, he said offline backups aren't that important - I gather that
recreating the data is not totally impractical. Still, 16TB, ouch.

Regards,
Link.

Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Jan Wieck
Date:
Lincoln Yeoh wrote:
>
> At 07:32 PM 4/25/2003 -0400, Jan Wieck wrote:
> >Of course, assuming we want to backup the total 24 Terabyte he has in
> >2-3 years in less than a day, if we have a month to take a backup we can
> >save some money on the backup solution.
>
> If you take the month approach, the filesystem/db snapshot has to stay in
> place for a full month whilst the backup is occuring. Do-able, but can be
> stressful e.g. if something goes wrong after trying for a full month...
>
> But anyway he said off-line backups aren't that important - I gather that
> recreating the data is not totally impractical. Still 16TB, ouch.

I think a scenario like that, where one has a relatively small
percentage of really updated data and a huge portion that is constantly
growing, is a good example of when it might be appropriate NOT to store
everything in one database.

Depending on the rest of the requirements, it might need 2PC and two
databases. The updated database could then be backed up by normal means,
while for the constantly growing one you just archive the redo logs.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
Ron Johnson
Date:
On Sat, 2003-04-26 at 19:18, Jan Wieck wrote:
> Lincoln Yeoh wrote:
> >
> > At 07:32 PM 4/25/2003 -0400, Jan Wieck wrote:
> > >Of course, assuming we want to backup the total 24 Terabyte he has in
> > >2-3 years in less than a day, if we have a month to take a backup we can
> > >save some money on the backup solution.
> >
> > If you take the month approach, the filesystem/db snapshot has to stay in
> > place for a full month whilst the backup is occuring. Do-able, but can be
> > stressful e.g. if something goes wrong after trying for a full month...
> >
> > But anyway he said off-line backups aren't that important - I gather that
> > recreating the data is not totally impractical. Still 16TB, ouch.
>
> I think a scenario like that, where one has a relatively small
> percentage of really updated data and a huge portion of constantly
> growing, is a good example for when it might be appropriate NOT to store
> everything in one database.
>
> Depending on the rest of the requirements, it might need 2PC and using 2
> databases. The updated database then could be backed up by normal means
> while for the constantly growing one you just archive the redo logs.

Here's another thought: does all 16TB of data *really* have to be
in the database all the time??

Maybe there's a business rule that anything older than 6 months isn't
in the database itself, but is "just" sitting out somewhere on a
filesystem, and if the old data is requested, then, either
programmatically or through operator intervention, the old data is copied
into the database.

Yes, it would take longer to access "old" data, but, hey, that's
reality (unless you want to spend *really*large* amounts of money).
And it's not an "OSS vs. proprietary" issue, either.

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+


Re: Backing up 16TB of data (was Re: > 16TB worth of data question)

From
"Fred Moyer"
Date:
Here's a 70 TB backup RAID at the University of Tübingen that should be
able to handle the needs adequately.

http://www.tomshardware.com/storage/20030425/index.html

> On Sat, 2003-04-26 at 19:18, Jan Wieck wrote:
>> Lincoln Yeoh wrote:
>> >
>> > At 07:32 PM 4/25/2003 -0400, Jan Wieck wrote:
>> > >Of course, assuming we want to backup the total 24 Terabyte he has
>> in 2-3 years in less than a day, if we have a month to take a
>> backup we can save some money on the backup solution.
>> >
>> > If you take the month approach, the filesystem/db snapshot has to
>> stay in place for a full month whilst the backup is occuring.
>> Do-able, but can be stressful e.g. if something goes wrong after
>> trying for a full month...
>> >
>> > But anyway he said off-line backups aren't that important - I gather
>> that recreating the data is not totally impractical. Still 16TB,
>> ouch.
>>
>> I think a scenario like that, where one has a relatively small
>> percentage of really updated data and a huge portion of constantly
>> growing, is a good example for when it might be appropriate NOT to
>> store everything in one database.
>>
>> Depending on the rest of the requirements, it might need 2PC and using
>> 2 databases. The updated database then could be backed up by normal
>> means while for the constantly growing one you just archive the redo
>> logs.
>
> Here's another thought: does all 16TB of data *really* have to be
> in the database all the time??
>
> Maybe there's a business rule that anything older than 6 months isn't in
> the database itself, but is "just" sitting out on somewhere on a
> filesystem, and if the old data is requested, then, either
> programmatically or thru operator intervention, the old data is copied
> into the database.
>
> Yes, it would take longer to access "old" data, but, hey, that's
> reality (unless you want to spend *really*large* amounts of money). And
> it's not an "OSS vs.Proprietary" either.


Re: > 16TB worth of data question

From
Jeremiah Jahn
Date:
On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> Jeremiah Jahn wrote:
> >
> > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > > How about backups? Backing up 2-16TB needs a bit more planning and design.
> > The proposed solution here is to have the raid controller mirror accross
> > the street to a similar system. At what point do off line backups become
> > pointless?
>
> Fine, so you have a 10 MBit link across the "street" in the budget but
> nothing to buy decent server hardware? 4TB (just to be around where you
> expect to be after 6 months or so) per year means an average MBit per
> second ... on a second thought, the 10 MBit will probably not be enough
> to satisfy peak times.
Actually, they have fiber across the street, so the SAN can just move
stuff right across at 2Gbps.


>
> Who made that mirror proposal and where is your PITR capable backup? I
> once saved a company by doing PITR. They ran an Oracle database on two
> software mirrored Raid5 systems ... and the application was
> misconfigured and deleted too much data (way too much).

I don't believe that postgres does PITR yet. However, this is a system
that doesn't need to be 24/7, so we can shut down the db and dump the
whole thing to tape every day with no problem.

>
>
> Jan
--
Jeremiah Jahn <jeremiah@cs.earlham.edu>


Re: > 16TB worth of data question

From
"scott.marlowe"
Date:
On 28 Apr 2003, Jeremiah Jahn wrote:

> On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > Jeremiah Jahn wrote:
> > >
> > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > > > How about backups? Backing up 2-16TB needs a bit more planning and design.
> > > The proposed solution here is to have the raid controller mirror accross
> > > the street to a similar system. At what point do off line backups become
> > > pointless?
> >
> > Fine, so you have a 10 MBit link across the "street" in the budget but
> > nothing to buy decent server hardware? 4TB (just to be around where you
> > expect to be after 6 months or so) per year means an average MBit per
> > second ... on a second thought, the 10 MBit will probably not be enough
> > to satisfy peak times.
> actually they have fiber accross the street. So the SAN can just move
> stuff right across at 2Gbps.
>
>
> >
> > Who made that mirror proposal and where is your PITR capable backup? I
> > once saved a company by doing PITR. They ran an Oracle database on two
> > software mirrored Raid5 systems ... and the application was
> > misconfigured and deleted too much data (way too much).
>
> I don't belive that postgres does PITR yet. However, this is a system
> that doesn't need to be 24/7, so we can shut down the db and dump the
> whole thing to tape every day  with no problem.

Don't shut it down and back it up at the file system level; leave it up,
restrict access via pg_hba.conf if need be, and use pg_dump.  File system
level backups are not the best way to go. For quick recovery they can be
added alongside full pg_dumps as an aid, but don't leave out the pg_dump:
it's the way you're supposed to back up postgresql, and it can do so while
the database is "hot and in use" and still provide a consistent backup
snapshot.


Re: > 16TB worth of data question

From
Ron Johnson
Date:
On Mon, 2003-04-28 at 10:42, scott.marlowe wrote:
> On 28 Apr 2003, Jeremiah Jahn wrote:
>
> > On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > > Jeremiah Jahn wrote:
> > > >
> > > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
[snip]
> Don't shut it down and backup at file system level, leave it up, restrict
> access via pg_hba.conf if need be, and use pg_dump.  File system level
> backups are not the best way to go, although for quick recovery they can
> be added to full pg_dumps as an aid, but don't leave out the pg_dump,
> it's the way you're supposed to backup postgresql, and it can do so when
> the database is "hot and in use" and provide a consistent backup
> snapshot.

What's the problem with doing a file-level backup of a *cold* database?

The problem with pg_dump is that it's single-threaded, and it would take
a whole lotta time to back up 16TB using 1 tape drive...

If a pg database is spread across multiple devices (using symlinks),
then a cold database can be backed up, at the file level, using
multiple tape drives.  (Of course, all symlinks would have to be
recreated when/if the database files had to be restored.)
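
Something along these lines, for instance -- a sketch only, where the data
directory layout, the device names and the use of tar are all assumptions,
and the postmaster is assumed to be stopped first:

    import subprocess

    # One tar stream per tape device, run in parallel against the pieces of a
    # *cold* cluster. Paths and devices are invented for illustration.
    JOBS = {
        "/var/lib/pgsql/data":      "/dev/nst0",   # main cluster directory
        "/mnt/vol1/pg_big_tables":  "/dev/nst1",   # symlinked-out table files
        "/mnt/vol2/pg_big_indexes": "/dev/nst2",
    }

    procs = [subprocess.Popen(["tar", "-cf", tape, path])
             for path, tape in JOBS.items()]
    if any(p.wait() != 0 for p in procs):
        raise SystemExit("one of the tar jobs failed")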

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+


Re: > 16TB worth of data question

From
"scott.marlowe"
Date:
On 28 Apr 2003, Ron Johnson wrote:

> On Mon, 2003-04-28 at 10:42, scott.marlowe wrote:
> > On 28 Apr 2003, Jeremiah Jahn wrote:
> >
> > > On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > > > Jeremiah Jahn wrote:
> > > > >
> > > > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> [snip]
> > Don't shut it down and backup at file system level, leave it up, restrict
> > access via pg_hba.conf if need be, and use pg_dump.  File system level
> > backups are not the best way to go, although for quick recovery they can
> > be added to full pg_dumps as an aid, but don't leave out the pg_dump,
> > it's the way you're supposed to backup postgresql, and it can do so when
> > the database is "hot and in use" and provide a consistent backup
> > snapshot.
>
> What's the problem with doing a file-level backup of a *cold* database?

There's no problem with doing it; the problem is that in order to get
anything back you pretty much have to have all of it to make it work
right, and any subtle problems of a partial copy might not be so obvious.

Plus it ties you to one major rev of the database.  Pulling out
five-year-old copies of the base directory can involve a fair bit of work
getting an older flavor of postgresql to run on a newer OS.

It's not that it's wrong, it's that it should be considered carefully
before being done.  Note that a month or so ago someone had Veritas
backing up their postgresql database while it was still live, and it was
corrupting data.  Admittedly, you're gonna bring down the database first,
so it's not an issue for you.

> The problem with pg_dump is that it's single-threaded, and it would take
> a whole lotta time to back up 16TB using 1 tape drive...

But you can run pg_dump against individual databases or tables on the
same postmaster, so you could theoretically write a script around pg_dump
to dump the databases or large tables to different drives.  We back up our
main server to our backup server that way, albeit with only one backup
process at a time; since we can back up about a gig a minute, it's plenty
fast for us.  If we needed to parallelize it, that would be pretty easy.
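
For instance, something like the following -- a sketch where the database
name, the table list and the output paths are made up; only pg_dump's
standard -t/-F/-f switches are real. Note that each pg_dump runs in its own
transaction, so this does not give one consistent snapshot across the
tables, which is exactly the caveat raised below:

    import subprocess

    DBNAME = "imagedb"                              # assumed database name
    JOBS = {                                        # table -> dump target
        "images_2001": "/backup/drive0/images_2001.dump",
        "images_2002": "/backup/drive1/images_2002.dump",
        "images_2003": "/backup/drive2/images_2003.dump",
    }

    procs = {
        table: subprocess.Popen(
            ["pg_dump", "-Fc", "-t", table, "-f", target, DBNAME])
        for table, target in JOBS.items()
    }
    failed = [table for table, p in procs.items() if p.wait() != 0]
    if failed:
        raise SystemExit("pg_dump failed for: %s" % ", ".join(failed))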

> If a pg database is spread across multiple devices (using symlinks),
> then a cold database can be backed up, at the file level, using
> multiple tape drives.  (Of course, all symlinks would have to be
> recreated when/if the database files had to be restored.)

The same thing can be done with multiple pg_dumps running against the
databases / tables you select, while the database is still up.  It's a
toss-up, but know that the pg_dump format is the "supported" method, so to
speak, for backing up.

Hopefully PITR will be done before it becomes a huge issue, eh?


Re: > 16TB worth of data question

From
Ron Johnson
Date:
On Mon, 2003-04-28 at 16:59, scott.marlowe wrote:
> On 28 Apr 2003, Ron Johnson wrote:
> [snip]
> > On Mon, 2003-04-28 at 10:42, scott.marlowe wrote:
> > > On 28 Apr 2003, Jeremiah Jahn wrote:
> > >
> > > > On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > > > > Jeremiah Jahn wrote:
> > > > > >
> > > > > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > [snip]
> > > Don't shut it down and backup at file system level, leave it up, restrict
> > > access via pg_hba.conf if need be, and use pg_dump.  File system level
> > > backups are not the best way to go, although for quick recovery they can
> > > be added to full pg_dumps as an aid, but don't leave out the pg_dump,
> > > it's the way you're supposed to backup postgresql, and it can do so when
> > > the database is "hot and in use" and provide a consistent backup
> > > snapshot.
> >
> > What's the problem with doing a file-level backup of a *cold* database?
>
> There's no problem with doing it, the problem is that in order to get
> anything back you pretty much have to have all of it to make it work
> right, and any subtle problems of a partial copy might not be so obvious.
>
> Plus it sticks you to one major rev of the database.  Pulling out five
> year old copies of the base directory can involve a fair bit of work
> getting an older flavor of postgresql to run on a newer os.

Good point...

[snip]
> > The problem with pg_dump is that it's single-threaded, and it would take
> > a whole lotta time to back up 16TB using 1 tape drive...
>
> But, you can run pg_dump against individual databases or tables on the
> same postmaster, so you could theoretically write a script around pg_dump
> to dump the databases or large tables to different drives.  We backup our
> main server to our backup server that way, albeit with only one backup
> process at a time, since we can backup about a gig a minute, it's plenty
> fast for us.  If we needed to parallelize it that would be pretty easy.

But pg doesn't guarantee internal consistency unless you pg_dump
the database in one command "pg_dump db_name > db_yyyymmdd.dmp".

Thus, no parallelism unless there are multiple databases, but if there's
only 1 database...

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+


Re: > 16TB worth of data question

From
"Jim C. Nasby"
Date:
On Tue, Apr 29, 2003 at 01:01:39AM -0500, Ron Johnson wrote:
> But pg doesn't guarantee internal consistency unless you pg_dump
> the database in one command "pg_dump db_name > db_yyyymmdd.dmp".
>
> Thus, no parallelism unless there are multiple databases, but if there's
> only 1 database...

It would probably be useful if you could feed a specific transaction
id/timestamp to pg_dump, so that it will take the snapshot as of that
time. Of course, it'd probably be easiest if you could just tell pg_dump
to use X number of threads or dump devices.
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: > 16TB worth of data question

From
Ron Johnson
Date:
On Tue, 2003-04-29 at 16:05, Jim C. Nasby wrote:
> On Tue, Apr 29, 2003 at 01:01:39AM -0500, Ron Johnson wrote:
> > But pg doesn't guarantee internal consistency unless you pg_dump
> > the database in one command "pg_dump db_name > db_yyyymmdd.dmp".
> >
> > Thus, no parallelism unless there are multiple databases, but if there's
> > only 1 database...
>
> It would probably be useful if you could feed a specific transaction
> id/timestamp to pg_dump, so that it will take the snapshot as of that
> time. Of course, it'd probably be easiest if you could just tell pg_dump
> to use X number of threads or dump devices.

Something like:
# pg_dump -f /dev/sg0 -f /dev/sg1 -f /dev/sg2 -f /dev/sg3 <dbname>
  OR
# pg_dump -f \(/dev/sg0,/dev/sg1,/dev/sg2,/dev/sg3\) <dbname>

Then, pg_dump would decide what data to send to each device.

--
+-----------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson |
|                                                           |
| An ad currently being run by the NEA (the US's biggest    |
| public school TEACHERS UNION) asks a teenager if he can   |
| find sodium and *chloride* in the periodic table of the   |
| elements.                                                 |
| And they wonder why people think public schools suck...   |
+-----------------------------------------------------------+