Thread: Large database

Large database

From
"Graham Daley"
Date:
I'm considering using PostgreSQL for an application which could very likely require a database to grow up to about 1TB in size.

1. Is this possible with PostgreSQL on Linux?

2. Is it possible for database tables to span files on more than one disk partition? I'm aware that it splits files into 2GB chunks automatically, but is it possible for these files to reside on more than one partition?

3. What's the biggest known database currently running on PostgreSQL?

Thanks for your help!


Graham.

--

MOMENTUM INTERNET INC
Internet Publishers
Website: http://www.momentumplus.com
E-Mail:  gd@PINmail.com

Re: [GENERAL] Large database

From
Rachel Greenham
Date:
Graham Daley wrote:
>
> I'm considering using PostgreSQL for an application which could very likely require a database to grow up to about
1TB in size.
>
> 1. Is this possible with PostgreSQL on Linux?
>
> 2. Is it possible for database tables to span files on more than one disk partition? I'm aware that it splits files
into 2GB chunks automatically, but is it possible for these files to reside on more than one partition?
>
> 3. What's the biggest known database currently running on PostgreSQL?
>
> Thanks for your help!

I think in all honesty you might be better off with Oracle, probably
under Solaris. If it's that big a project, the investment is probably
worthwhile. I hate to say this, as I love Linux and Postgres. Linux can
now apparently handle 1TB partitions, but that support is still
experimental, so while Oracle does now exist for Linux, if it's
important I'd say go for a Solaris solution.

I'd love to be proven wrong on this BTW, but the "use Oracle" advice
does come from a Linux kernel core team member to whom I mentioned this.
(Terribly useful to be living with such a person. :-))

While Linux can run on pretty big iron these days, it doesn't really
scale up to really big systems very elegantly. This is set to change
with 2.4 of course, but right now Linux ain't happy going that big.

--
Rachel


Re: [GENERAL] Large database

From
Ole Gjerde
Date:
On Tue, 17 Aug 1999, Graham Daley wrote:
> I'm considering using Postgresql for an application which could very
> likely require a database to grow up to about 1TB in size.
> 1. Is this possible with PostgreSQL on Linux?

If you run Linux on SPARC, Alpha, SGI or any other 64-bit platform, it
should be possible (or have a 64-bit filesystem on Intel, but that
doesn't exist yet).

> 2. Is it possible for database tables to span files on more than one disk
>    partition? I'm aware that it splits files into 2GB chunks
>    automatically, but is it possible for these files to reside on more
>    than one partition?

Easiest way to do this would be to put all your drives in a RAID:
either just striping, or RAID 5 if you need the redundancy (probably).
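For the striping route, here is a minimal sketch in the raidtools-era /etc/raidtab format. The device names and chunk size are placeholders to adapt to your hardware, not a tested configuration:

```
# /etc/raidtab -- stripe two disks into a single md device (RAID 0)
raiddev /dev/md0
    raid-level            0
    nr-raid-disks         2
    persistent-superblock 1
    chunk-size            64        # KB written to each disk in turn
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1
```

You would then run mkraid on /dev/md0, make a filesystem on it, and point PGDATA at the mount point, so the 1GB table chunks spread across both spindles transparently.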

> 3. What's the biggest known database currently running on Postgresql?

No clue..  The biggest one I'm running right now is about 5GB which isn't
all that much..  I'm sure there are much bigger ones out there..

Ole Gjerde


Re: [GENERAL] Large database

From
jim@reptiles.org (Jim Mercer)
Date:
> > 2. Is it possible for database tables to span files on more than one disk
> >    partition? I'm aware that it splits files into 2GB chunks
> >    automatically, but is it possible for these files to reside on more
> >    than one partition?
>
> Easiest way to do this would be to put all your drives in a RAID.. Either
> just striping or RAID 5 if you need the redundancy(probably).

oddly enough, we had 2GB chunks with 6.4, but now with 6.5.1 (on FreeBSD)
we are seeing 1GB chunks:

-rw-------  1 pgsql  wheel  1073741824 Aug 15 08:23 gt
-rw-------  1 pgsql  wheel  1073741824 Aug 15 08:29 gt.1
-rw-------  1 pgsql  wheel   626548736 Aug 17 12:29 gt.2
-rw-------  1 pgsql  wheel   302964736 Aug 17 12:24 gt_callid_key
(we've had as much as gt.7)

doesn't seem to have broken anything.
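For what it's worth, the arithmetic behind those file names is easy to sketch. This is a hypothetical illustration of the naming scheme, not actual Postgres code; it assumes the fixed 1GB segment size seen above:

```python
SEGMENT_SIZE = 1 << 30  # 1GB per segment file, as observed under 6.5.x

def segment_files(table_bytes, base="gt"):
    """List the file names a table of this size would occupy on disk."""
    # Ceiling division: how many 1GB segments does table_bytes fill?
    nsegs = max(1, -(-table_bytes // SEGMENT_SIZE))
    # The first segment is the bare relation name; extents get .1, .2, ...
    return [base] + [f"{base}.{i}" for i in range(1, nsegs)]

# The listing above: two full segments plus a 626548736-byte tail.
print(segment_files(2 * SEGMENT_SIZE + 626548736))  # → ['gt', 'gt.1', 'gt.2']
```

By the same arithmetic, reaching a gt.7 extent implies the table grew past 7GB at some point.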

it would be nice if vacuum could figure out if an extent is no longer
in use, and delete it.

currently, we need to dump the table, then nuke it and restore it.

> > 3. What's the biggest known database currently running on Postgresql?
>
> No clue..  The biggest one I'm running right now is about 5GB which isn't
> all that much..  I'm sure there are much bigger ones out there..

my table varies in size from 1 to 7 gig.
it has millions of records with a single index.

it works for the most part; the problems we have are usually related to
the hardware crashing, and since we run postgres without the sync flag,
well, things can get corrupted.

--
[ Jim Mercer    Reptilian Research      jim@reptiles.org   +1 416 410-5633 ]
[ The telephone, for those of you who  have forgotten, was a commonly used ]
[ communications technology in the days before electronic mail.            ]
[ They're still easy to find in most large cities. -- Nathaniel Borenstein ]

Re: [GENERAL] Large database

From
Bruce Momjian
Date:
> > > 2. Is it possible for database tables to span files on more than one disk
> > >    partition? I'm aware that it splits files into 2GB chunks
> > >    automatically, but is it possible for these files to reside on more
> > >    than one partition?
> >
> > Easiest way to do this would be to put all your drives in a RAID.. Either
> > just striping or RAID 5 if you need the redundancy(probably).
>
> oddly enough, we had 2GB chunks with 6.4, but now with 6.5.1 (on FreeBSD)
> we are seeing 1GB chunks:
>
> -rw-------  1 pgsql  wheel  1073741824 Aug 15 08:23 gt
> -rw-------  1 pgsql  wheel  1073741824 Aug 15 08:29 gt.1
> -rw-------  1 pgsql  wheel   626548736 Aug 17 12:29 gt.2
> -rw-------  1 pgsql  wheel   302964736 Aug 17 12:24 gt_callid_key
> (we've had as much as gt.7)
>
> doesn't seem to have broken anything.

Yes, this was done because we found some OSes couldn't handle chunks of
exactly 2GB, so we reduced it, and 1GB seemed a good number.

>
> it would be nice if vacuum could figure out if an extent is no longer
> in use, and delete it.
>
> currently, we need to dump the table, then nuke it and restore it.

We tried removing the extra extent, but the other backends needed it
around because there was no way to tell them that the extra extent file
descriptor was useless.  You could just stop the postmaster, and delete
the zero-length extents, if you really wanted to.  Do the extra extent
files cause problems for you?  If so, we can start to look at some
postmaster cleanup of those.
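That manual cleanup could be scripted along these lines. A sketch only: it demonstrates on a scratch directory standing in for the table's data directory, and for real use you would set DATADIR to the actual location and run it only with the postmaster stopped:

```shell
# Find zero-length numbered extent files (gt.1, gt.2, ...) that would be
# candidates for deletion once the postmaster is stopped.
DATADIR=$(mktemp -d)   # stand-in for the real database directory

# Simulate a table "gt" with one live segment and two empty extents.
printf 'data' > "$DATADIR/gt"
: > "$DATADIR/gt.1"
: > "$DATADIR/gt.2"

# List the candidates; only delete them when you are sure nothing is running.
find "$DATADIR" -name '*.[0-9]*' -size 0
```

The find invocation prints the two empty extents and skips the live base segment; piping the result to rm is left as a deliberate manual step.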


>
> > > 3. What's the biggest known database currently running on Postgresql?
> >
> > No clue..  The biggest one I'm running right now is about 5GB which isn't
> > all that much..  I'm sure there are much bigger ones out there..
>
> my table varies in size from 1 - 7 gig.
> it has millions of records with a single index.
>
> works for the most part, the problems we have are usually related to the
> hardware crashing, and since we run postgres without the sync flag, well,
> things can get corrupted.


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] Large database

From
jim@reptiles.org (Jim Mercer)
Date:
> > it would be nice if vacuum could figure out if an extent is no longer
> > in use, and delete it.
> >
> > currently, we need to dump the table, then nuke it and restore it.
>
> We tried removing the extra extent, but the other backends needed it
> around because there was no way to tell them that the extra extent file
> descriptor was useless.  You could just stop the postmaster, and delete
> the zero-length extents, if you really wanted to.  Do the extra extent
> files cause problems for you?  If so, we can start to look at some
> postmaster cleanup of those.

i haven't actually seen the system reduce a numbered extent.

i could be wrong.

--
[ Jim Mercer    Reptilian Research      jim@reptiles.org   +1 416 410-5633 ]
[ The telephone, for those of you who  have forgotten, was a commonly used ]
[ communications technology in the days before electronic mail.            ]
[ They're still easy to find in most large cities. -- Nathaniel Borenstein ]

Re: [GENERAL] Large database

From
Bruce Momjian
Date:
> > > it would be nice if vacuum could figure out if an extent is no longer
> > > in use, and delete it.
> > >
> > > currently, we need to dump the table, then nuke it and restore it.
> >
> > We tried removing the extra extent, but the other backends needed it
> > around because there was no way to tell them that the extra extent file
> > descriptor was useless.  You could just stop the postmaster, and delete
> > the zero-length extents, if you really wanted to.  Do the extra extent
> > files cause problems for you?  If so, we can start to look at some
> > postmaster cleanup of those.
>
> i haven't actually seen the system reduce a numbered extent.
>
> i could be wrong.

Do you mean it never gets smaller, or that it never removes it?  It
currently doesn't remove it, but vacuum should make them zero size when
they are not needed.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026