Re: Hardware needed for 15,000,000 record DB? - Mailing list pgsql-admin

From Jeremy Buchmann
Subject Re: Hardware needed for 15,000,000 record DB?
Date
Msg-id 3CC48BF6.8080006@wellsgaming.com
Whole thread Raw
In response to Hardware needed for 15,000,000 record DB?  (<pdg@stratos.net>)
Responses Re: Hardware needed for 15,000,000 record DB?
List pgsql-admin
pdg@stratos.net wrote:
> Hello,  :-)
>
> Can someone help me spec out the hardware needed for a simple web-based
> database app?
>
> Basically, that application needs to lookup a single row by its primary key.
> This would be fairly straightforward, except that the table needs to contain
> up to 15 million records.  Each row will contain approximately 25  variable
> length CHAR fields with perhaps a total of 3 to 4 kilobytes of data per row.
>
> Updates will be done nightly via some sort of a batch process.
>
> What kind of hardware would be needed for this sort of application? The
> queries are not complex, it's just a lot of data.

Databases are I/O bound...if your I/O is slow, your database will be slow.
So your goal is to minimize the amount of I/O needed and the time it takes
to do the I/O.  You minimize the amount of I/O by getting things with big
caches.  You minimize the amount of time the I/O takes by using fast
storage devices.

This means SCSI.  If you have limited funds, spend them on fast SCSI disks
with big caches and a good controller instead the latest and greatest
processor.

If you have the funds, look into getting a RAID card with a big cache on it.
A RAID 1 or 5 also helps out if a disk crashes.

> Would a midrange Celeron processor with 256 MB RAM be sufficient? How would
> backups work for a database this large?

I would not use the Celeron.  The Celeron lacks cache, and you want all the
cache you can afford.  One user posted benchmarks to this list showing the
Celeron to be twice as slow as a PIII at the same clock speed.

Buy a lot of memory.  And make sure the motherboard you get can support a lot
of memory (i.e., it has 4 or more slots for RAM).

Backups: DDS3/4 tape is a good, but expensive choice.  It's one of the few backup
technologies that has cheap media and is vendor-neutral.  Stay away from
"Big floppies" like Jaz drives.  They're too flaky.

> Is PostgreSQL even the best database engine for this app? Perhaps MySQL? Or
> maybe a Microsoft solution?

MySQL is traditionally faster at pumping out web pages with pure speed, but
PostgreSQL has been catching up very quickly.  Also, PostgreSQL has
traditionally been able to handle many more concurrent users, but I think
MySQL has been getting better there, too.  I haven't seen any banchmarks
with recent versions of either database.  They're both free, so try some
tests with both of them (but don't forget to tune them properly).

If you care anything about cost or flexibility, I wouldn't go for a Microsoft
"solution".


--Jeremy


pgsql-admin by date:

Previous
From: Tom Lane
Date:
Subject: Re: An Analyze question
Next
From: "Ross J. Reedstrom"
Date:
Subject: Re: An Analyze question