Thread: how to determine the hardware I need

how to determine the hardware I need

From
raphael@be.easynet.net
Date:
Hi,

I need to determine the hardware I need to host a postgresql database.
The problem is that I have to guess at the number of tables,
records per table, etc., because this is a new website that has not
launched yet. Needs will evolve, and the hardware has to be ready.
The other problem is that I don't know how to evaluate hardware needs
based on assumptions about the number of tables, rows per table,
queries per day, and so on.

I browsed the mailing list archives, but I didn't see an answer to
this question: how does one evaluate the hardware needed?
From what I have read, this is my understanding so far:

RAM is important, but how do I evaluate how much RAM I need?
Putting logs on a different disk than data is important.
RAID: doesn't give a performance boost, but increases reliability.
CPU power isn't that important.
SCSI can be more reliable in extreme circumstances, but IDE also does a
good job.

Remaining questions, knowing that Apache will run on the same computer:
-How do I evaluate how much RAM I need?
-idem CPU. is a dual CPU an advantage?
-how much space will the database take?

If you have good pointers to documentation, it would be very helpful
too!

Thanks for your help!


Raph


Re: how to determine the hardware I need

From
Curt Sampson
Date:
You neglected to say what kind of price ranges are reasonable and not
reasonable for you. This makes a difference, because there are certain
price ranges and hardware configurations that are "sweet spots."

The typical example: on my projects with "real" budgets, spending a
couple grand on a server is no big deal. So our basic system is a 2 GHz
or faster processor, 1 GB of RAM, four IDE disks (two mirrored volumes),
and an extra IDE controller. The few hundred dollars we might save by
downgrading this specification just isn't worth the extra effort it
would take to figure out what to downgrade.

But the above is about the best system you're going to get for really
cheap commodity prices. If you need something bigger, you're going to be
spending a fair amount more, so it's worth doing some more work to find
out exactly what it is you need.

As far as a few specifics go:

> Putting logs on a different disk than data is important.

Yes. But not quite so important if you're not doing a lot of updates. So
if cost is a big concern, and you're not doing a lot of updates, you may
want to save yourself the cost of the extra disks here.

> RAID: doesn't give a performance boost, but increases reliability.

Right. In fact, RAID-5 will give a performance hit, so don't use
it unless you have to. Use RAID-1 instead if you can afford to.
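To put a rough number on that hit, here is a minimal sketch using the
textbook write penalties (2 disk I/Os per logical write on mirrored
disks, 4 on RAID-5). The function name, the disk count and the per-disk
IOPS figure are assumptions for illustration only, not measurements
from any real array.

```python
# Rough random-write throughput for a small array.
# Assumed textbook write penalties: a logical write costs 2 disk I/Os on
# mirrored (RAID-1) disks and 4 on RAID-5 (read data, read parity,
# write data, write parity). Real controllers and caches will differ.
WRITE_PENALTY = {"raid1": 2, "raid5": 4}


def effective_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Approximate random-write IOPS the whole array can sustain."""
    if level not in WRITE_PENALTY:
        raise ValueError(f"unknown RAID level: {level}")
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Hypothetical array: four disks at 100 random IOPS each.
    for level in ("raid1", "raid5"):
        print(level, effective_write_iops(level, disks=4, iops_per_disk=100))
```

Under these assumptions the same four disks sustain about half as many
random writes on RAID-5 as they do as mirrored pairs.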

> CPU power isn't that important.

Actually, postgres seems to want more CPU than other database products
I've used, so check this out carefully before you decide it's not so
important.

> SCSI can be more reliable in extreme circumstances, but IDE does a good
> job also.

Right. IDE's main failing is in amount of storage; if you need more
than half a terabyte to a terabyte, IDE is not an option at all. If
you need hot swap, SCSI will be a lot more reliable. If you need an
external disk array (good if a motherboard fails; just plug the disks
into another machine) you pretty much need to use SCSI.

> -idem CPU. is a dual CPU an advantage?

If a single-CPU system can provide enough horsepower, you should always
go for that; dual-CPU systems always waste far more CPU cycles than
single-CPU systems, and for the same total horsepower a dual-CPU system
will have higher latency.

Multiple CPUs are only going to be useful if you have things that
can be run in parallel.
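One way to make that concrete is Amdahl's law, which bounds the speedup
from extra CPUs by the fraction of the work that can actually run in
parallel. This is a generic sketch, not a measurement of postgres; the
function name and the example fractions are assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Upper bound on speedup from `cpus` processors (Amdahl's law).

    parallel_fraction is the share of the workload (0.0 to 1.0) that can
    run concurrently; the rest is assumed to be strictly serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cpus)


if __name__ == "__main__":
    # Hypothetical workloads: mostly serial, half parallel, fully parallel.
    for fraction in (0.1, 0.5, 1.0):
        print(fraction, round(amdahl_speedup(fraction, cpus=2), 2))
```

With only one query running at a time the parallel fraction is close to
zero and the second CPU buys almost nothing; with many concurrent
clients it moves toward one.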

> -how much space will the database take?

Well, you can go through and figure out your schemas and how many rows
you have and work it out, but it's probably faster and easier just to
generate a sample data set and load it up on a development machine to
see how big it is.
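If you do want a back-of-envelope figure before loading sample data,
something like the sketch below gives a first guess for the heap size
of one table. The page and header sizes are assumptions (8 kB pages,
24-byte page header, 24-byte tuple header, 4-byte item pointer) that
vary between postgres versions and builds, and the model ignores
indexes, alignment padding, dead rows and oversized values, so treat
the result as a lower bound and check it against a real load.

```python
import math

# Assumed storage constants; check them against your own installation.
PAGE_SIZE = 8192     # bytes per heap page
PAGE_HEADER = 24     # bytes of per-page overhead
TUPLE_HEADER = 24    # bytes of per-row header
ITEM_POINTER = 4     # bytes of per-row line pointer


def table_bytes(rows: int, avg_row_data_bytes: int) -> int:
    """Rough heap size of one table, ignoring indexes and padding."""
    per_row = TUPLE_HEADER + ITEM_POINTER + avg_row_data_bytes
    rows_per_page = (PAGE_SIZE - PAGE_HEADER) // per_row
    if rows_per_page < 1:
        raise ValueError("row too wide for this simple model")
    pages = math.ceil(rows / rows_per_page)
    return pages * PAGE_SIZE


if __name__ == "__main__":
    # Hypothetical table: one million rows of about 100 data bytes each.
    size = table_bytes(rows=1_000_000, avg_row_data_bytes=100)
    print(f"{size / 1024 / 1024:.1f} MB")
```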

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC