Re: Hardware needed for 15,000,000 record DB? - Mailing list pgsql-admin

From Curt Sampson
Subject Re: Hardware needed for 15,000,000 record DB?
Date
Msg-id Pine.NEB.4.43.0204231124270.447-100000@angelic.cynic.net
In response to Re: Hardware needed for 15,000,000 record DB?  (Jeremy Buchmann <jeremy@wellsgaming.com>)
List pgsql-admin
On Mon, 22 Apr 2002, Jeremy Buchmann wrote:

> Databases are I/O bound...if your I/O is slow, your database will be slow.
> So your goal is to minimize the amount of I/O needed and the time it takes
> to do the I/O.  You minimize the amount of I/O by getting things with big
> caches.

Assuming you frequently access the same data. If you're doing
essentially random queries on a 50 GB database, an extra few hundred
megabytes of cache will probably make little difference.
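A rough sketch of the arithmetic behind that claim. It assumes access is uniformly random over the whole database (real workloads are usually skewed toward hot data, which helps the cache), and the function name and the cache sizes tried are purely illustrative:

```python
def expected_hit_ratio(cache_mb: float, db_mb: float) -> float:
    """Expected cache hit ratio, ASSUMING uniformly random block access."""
    if db_mb <= 0:
        raise ValueError("db_mb must be positive")
    # With uniform access, the chance that a block is cached is simply
    # the fraction of the database that fits in the cache.
    return min(1.0, cache_mb / db_mb)


db_mb = 50 * 1024  # the 50 GB database mentioned above
for cache_mb in (256, 512, 1024):  # illustrative cache sizes
    ratio = expected_hit_ratio(cache_mb, db_mb)
    print(f"{cache_mb:5d} MB cache -> {ratio:.1%} of reads served from cache")
```

Under that assumption, even a full gigabyte of cache serves only about 2% of reads, so nearly every query still goes to disk.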

> You minimize the amount of time the I/O takes by using fast
> storage devices.
> This means SCSI.

Not necessarily. More disk arms are also a big help, so much so
that I would take two IDE drives (assuming that they're fast modern
ones) over one SCSI drive any day.
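To illustrate the disk-arm argument, here is a simple model of random I/O throughput: each request costs one average seek plus half a rotation, and requests spread evenly over the drives. The seek times and spindle speeds below are hypothetical figures chosen for illustration, not measurements of any particular drive:

```python
def random_iops(avg_seek_ms: float, rpm: int, drives: int = 1) -> float:
    """Approximate random I/Os per second for a set of identical drives.

    ASSUMES each request costs one average seek plus half a rotation,
    and that requests are spread evenly across all drives.
    """
    rotational_ms = 60_000 / rpm / 2  # average wait: half a revolution
    return drives * 1000 / (avg_seek_ms + rotational_ms)


# Hypothetical drive characteristics, for illustration only.
two_ide = random_iops(avg_seek_ms=9.0, rpm=7200, drives=2)
one_scsi = random_iops(avg_seek_ms=5.0, rpm=10_000, drives=1)

print(f"two IDE drives: {two_ide:.0f} random IOPS")
print(f"one SCSI drive: {one_scsi:.0f} random IOPS")
```

With these assumed figures the two slower drives come out ahead, because the second arm doubles the number of seeks that can be in progress at once.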

You probably want to at least make sure that your log files and
data files are on separate disks.

> If you have the funds, look into getting a RAID card with a big cache on it.
> A RAID 1 or 5 also helps out if a disk crashes.

A RAID 5 will slow down write performance considerably, but the
added reliability might be worth the tradeoff. But mirroring is
better, if you can afford it. Almost certainly you want your log
file on a mirrored volume rather than on a RAID 5 volume.
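The write slowdown can be sketched with the usual small-write accounting: a small RAID 5 write costs four disk operations (read old data, read old parity, write new data, write new parity), while a mirrored write costs two. The model below ignores controller caches and full-stripe writes, and the disk count and per-disk rate are made-up inputs:

```python
# Disk operations needed per small logical write (standard accounting).
WRITE_PENALTY = {
    "mirror": 2,  # write the block to both halves of the mirror
    "raid5": 4,   # read data, read parity, write data, write parity
}


def effective_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Small random writes per second the array can sustain.

    ASSUMES no write-back cache and no full-stripe write optimization.
    """
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Hypothetical array: 4 disks at 100 random IOPS each.
for level in ("mirror", "raid5"):
    print(level, effective_write_iops(level, disks=4, iops_per_disk=100))
```

On the same four hypothetical disks, the mirrored layout sustains twice the small-write rate of RAID 5, at the cost of less usable capacity.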

As far as a "big cache" goes, well, see above. Also note that if
your OS caches blocks in system RAM, then: a) you will have two
caches in series, which is a bit of a waste, and b) n MB of cache
on a RAID controller will often be more expensive than the same
amount of system RAM.

> > Is PostgreSQL even the best database engine for this app? Perhaps MySQL? Or
> > maybe a Microsoft solution?
>
> MySQL is traditionally faster at pumping out web pages with pure speed, but
> PostgreSQL has been catching up very quickly.  Also, PostgreSQL has
> traditionally been able to handle many more concurrent users, but I think
> MySQL has been getting better there, too.  I haven't seen any benchmarks
> with recent versions of either database.  They're both free, so try some
> tests with both of them (but don't forget to tune them properly).

For this simple application, I'd say MySQL is likely to be a bit
faster, simply because the database will be smaller. (MySQL has
considerably less row overhead.) However, the speed difference is
not likely to be too great, and certain other things might even
make MySQL the worse choice.

> If you care anything about cost or flexibility, I wouldn't go for a
> Microsoft "solution".

Au contraire, MS SQL Server is cheap (compared to other commercial
products) and is a pretty good database.  I would happily use it
again. But the fact that it runs only under Windows makes administration
a pain, and "cheap" in the commercial world is still a heck of a
lot more expensive than "free."

However, given the size of your project, and what you're looking
at spending, I can't see MS SQL Server or other products in that
class being worthwhile. An SQL Server installation is typically a
$10,000-$50,000 kind of thing. (You can do it for a lot less, but
most projects that can get by with such small installations
would probably be just as well off with PostgreSQL.)

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

