Re: Very large database - Mailing list pgsql-general

From: Chris Albertson
Subject: Re: Very large database
Date:
Msg-id: 20020114193957.92807.qmail@web14704.mail.yahoo.com
In response to: Very large database  (Michael Welter <mike@introspect.com>)
List: pgsql-general
Don't expect Postgresql to outperform Oracle.  If Oracle needs
the "big iron", so will Postgresql.  I've done some testing and
found that it's the number of transactions that matters more
than the amount of data being dumped in.  Our application was
astronomy; I would batch-load a few nights of observational data
every few days.  If all you are doing is loading data, you can
use COPY and it will move fast, just a few minutes on even a
low-end machine.  But if that 120MB arrives as one million INSERTs,
each with lots of processing, constraint checks, index updates,
and so on, then you will need some high-end hardware to finish
in only 24 hours.  I wrote my application twice.  The first
version took __days__ to complete a run.  My second version was
100x faster.  I did much of the processing outside of the DBMS
in standard "C" and then just COPYed the data in.
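
Roughly, the difference looks like this (the table and file names
here are made up for illustration, not my real schema):

    -- slow path: one INSERT per row, each one paying for parsing,
    -- constraint checks, and index maintenance
    INSERT INTO observations (obs_time, ra, dec, mag)
        VALUES ('2002-01-12 03:41:00', 187.25, -12.03, 14.2);
    -- ...repeated about a million times

    -- fast path: do the heavy processing outside the DBMS, write a
    -- tab-delimited file, then bulk load it with a single COPY
    COPY observations FROM '/data/night_2002-01-12.tab';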

So, the answer depends on what you need to do.  Simply inputting
that much data is easy.

Also, how will it be used once it is in the database?  Do you
have many active users looking at it?  What kind of searches are
they doing?

In any case, SCSI drives are the way to go; get a stack of them
with a couple of on-line spares.  That, and LOTS of RAM, at least
1GB as a minimum.

Solaris has very good RAID support built in.  I think better
than Linux's.  Both OSes are free, although Solaris 8 will be the
last PC version.  Prototype your application with faked data,
then try a test where you pull out the power connection on a drive
while the DBMS is updating data.  Pulling the power should have
NO effect if the RAID is set up right.  Solaris found my spare drive
and swapped it in automatically.  Do this a few times before you
depend on it.  Likely either Solaris, Linux, or BSD would work and
pass this test.

The big question is the transaction rate; table size is the second
question.
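
If you are stuck with INSERTs, batching many rows into one
transaction instead of letting each row commit on its own helps
the transaction rate a lot.  A rough sketch, again with a made-up
table:

    BEGIN;
    INSERT INTO observations (obs_time, ra, dec, mag)
        VALUES ('2002-01-12 03:41:00', 187.25, -12.03, 14.2);
    INSERT INTO observations (obs_time, ra, dec, mag)
        VALUES ('2002-01-12 03:41:05', 187.26, -12.03, 14.3);
    -- ...and so on for the rest of the batch
    COMMIT;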

--- Michael Welter <mike@introspect.com> wrote:
> I need some help here.  We need to implement a 180+GB database with
> 120+MB of updates every evening.  Rather than purchasing the big iron,
> we would like to use postgres running over Linux 2.4.x as the data
> server.  Is this even possible?  Who has the largest database out there
> and what does it run on?
>
> How should we implement the disk array?  Should we purchase a hardware
> RAID card or should we use the software RAID capabilities in Linux 2.4?
> Should we consider a SMP system?  Should we use an outboard RAID box
> (like RaidZone)?
>
> If anyone out there has implemented a database of this size then I
> would like to correspond with you.
>
> Thanks for your help,
> Mike


=====
Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org

