Re: How to allocate 8 disks - Mailing list pgsql-performance

From Greg Smith
Subject Re: How to allocate 8 disks
Msg-id Pine.GSO.4.64.0803012322030.10345@westnet.com
In response to Re: How to allocate 8 disks  (Craig James <craig_james@emolecules.com>)
List pgsql-performance
On Sat, 1 Mar 2008, Craig James wrote:

> So my question still stands: From a strictly performance point of view, would
> it be better to separate the OS and the WAL onto two disks?

You're not getting a more useful answer here because you haven't yet
mentioned a) what the disk controller is or b) how much writing activity is
going on here.  If the controller can cache writes, most of the advantages
of having a separate WAL disk aren't important unless you've got an
extremely high write throughput (higher than you can likely sustain with
only 8 disks), so you can put the WAL data just about anywhere.

> This is a dedicated system and does nothing but Apache/Postgres, so the OS
> should get very little traffic.  But if that's the case, I guess you could
> argue that your suggestion of combining OS and WAL on a 2-disk RAID 1 would
> be the way to go, since the OS activity wouldn't affect the WAL very much.

The main thing to watch out for if the OS and WAL are on the same disk is
that some random process spewing log files could fill the disk, at which
point the database stalls.
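
One way to catch the disk-filling risk described above early is a small
periodic check on the filesystem that holds the WAL.  The following is only
a minimal sketch: the 90% threshold and the example path in the comment are
assumptions for illustration, not anything specific to this system.

```python
import shutil


def disk_fill_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is used."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_warn(path, threshold=0.90):
    """True when the filesystem holding `path` is at or above `threshold` full.

    The 0.90 default is an arbitrary assumption; pick whatever leaves enough
    time to react before the disk actually fills.
    """
    return disk_fill_fraction(path) >= threshold


# Hypothetical usage from cron or a monitoring agent; the path is a
# placeholder for wherever the WAL directory lives on your system:
#
#     if should_warn("/path/to/wal/directory"):
#         print("WAL filesystem is nearly full")
```

Run from cron every few minutes, something like this gives a warning before
a runaway log file can stall the database.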

I think there are two configurations that make sense for your situation:

>   8 disks   RAID 1+0  Everything

This maximizes potential sequential and seek throughput for the database,
which is probably going to be your bottleneck unless you're writing lots
of simple data, while still surviving the loss of any one disk.  The crazy
log situation I mentioned above is less likely to be a problem because
having so much more disk space available to everything means it's more
likely you'll notice it before the disk actually fills.

     6 disks   RAID 0  Postgres data+WAL
     2 disks   RAID 1  Linux

This puts some redundancy on the base OS, so no single disk loss can
actually take down the system altogether.  You get maximum throughput on
the database.  If you lose a database disk, you replace it and rebuild the
whole database at that point.
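
To make the trade-off between the two layouts concrete, here is a rough
sketch of the arithmetic.  It assumes plain two-way mirrors and equal-sized
disks; the function name and the layout labels are made up for illustration.
It reports how many disks' worth of capacity each array provides and how
many disk failures it is guaranteed to survive.

```python
def raid_summary(level, disks):
    """Return (data_disks, guaranteed_failures_survived) for a simple array.

    Assumes equal-sized disks and two-way mirroring for RAID 1+0.
    """
    if level == "0":
        # Pure striping: every disk holds data, any single loss kills the array.
        return disks, 0
    if level == "1":
        # Pure mirroring: one disk's worth of data, survives all but one disk.
        return 1, disks - 1
    if level == "1+0":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        # Striped mirror pairs: half the disks hold data.  Only one failure is
        # guaranteed survivable, since a second could hit the same pair.
        return disks // 2, 1
    raise ValueError("unsupported RAID level: " + level)


# The two configurations discussed in this message.
layouts = {
    "everything on RAID 1+0": [("1+0", 8, "Everything")],
    "RAID 0 database, RAID 1 OS": [("0", 6, "Postgres data+WAL"),
                                   ("1", 2, "Linux")],
}

for name, arrays in layouts.items():
    print(name)
    for level, disks, use in arrays:
        data_disks, survives = raid_summary(level, disks)
        print("  %d disks RAID %s (%s): %d data disks, survives %d failure(s)"
              % (disks, level, use, data_disks, survives))
```

The second layout stripes the database across six data disks instead of
four, which is where the extra throughput comes from, but the database
array survives zero failures.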

> I suppose the thing to do is get the system, and run bonnie on various
> configurations.  I've never run bonnie before -- can I get some useful
> results without a huge learning curve?

I've collected some bonnie++ examples at
http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm you
may find useful.  With only 8 disks you should be able to get useful
results without a learning curve; with significantly more it can be
necessary to run more than one bonnie at once to really saturate the disks
and that's trickier.

I don't think you're going to learn anything useful from that though
(other than figuring out if your disk+controller combination is
fundamentally fast or not).  As you put more disks into the array,
sequential throughput and seeks/second will go up.  This doesn't tell you
anything useful about whether the WAL is going to get enough traffic to be
a bottleneck such that it needs to be on a separate disk.  To figure that
out, you need to run some simulations of the real database and its
application, and doing that fairly is a more serious benchmarking project.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
