Re: PowerEdge 2950 questions - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: PowerEdge 2950 questions
Date
Msg-id b42b73150608240621v453e3aeen44f09cd274f9a7@mail.gmail.com
In response to Re: PowerEdge 2950 questions  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: PowerEdge 2950 questions
List pgsql-performance
On 8/22/06, Jeff Davis <pgsql@j-davis.com> wrote:
> On Tue, 2006-08-22 at 17:56 -0400, Bucky Jordan wrote:
> Very interesting. I always hear that people avoid RAID 5 on database
> servers, but I suppose it always depends. Is the parity calculation
> something that may increase commit latency vs. a RAID 10? That's
> normally the explanation that I get.

it's not the parity, it's the seeking.  Raid 5 gives you great
sequential i/o, but random i/o is often not much better than a single
drive.  Actually it's the '1' (the mirroring) in raid 10 that plays the
biggest role in optimizing seeks on an ideal raid controller, since
either copy can serve a read.  Calculating parity was boring 20 years
ago, as it involves one of the fastest operations in computing, namely
xor. :)
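
to make the xor point concrete, here is a minimal sketch in python. the block contents and the helper name xor_blocks are made up for illustration, and no real controller is implemented this way. it shows that parity is just a byte-wise xor across the stripe, that a lost block is rebuilt with the same operation, and that a small write needs the old data and old parity read back first, which is where the extra i/o comes from rather than from the arithmetic.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# hypothetical stripe: 5 data blocks plus 1 parity block (6 disk raid 5)
data = [bytes([value] * 8) for value in (0x11, 0x22, 0x33, 0x44, 0x55)]
parity = xor_blocks(data)

# lose the third disk, rebuild its block from the survivors plus parity
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[2]

# small write to one block: read old data + old parity, write new data +
# new parity. four i/os for one logical write; the xor itself is trivial.
new_block = bytes([0x99] * 8)
new_parity = xor_blocks([parity, data[2], new_block])
assert new_parity == xor_blocks(data[:2] + [new_block] + data[3:])
```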

> > If I remember correctly, the numbers were pretty close, but I was
> > expecting RAID10 to significantly beat RAID5. However, with 6 disks,
> > RAID5 starts performing a little better, and it also has good storage
> > utilization (i.e. you're only losing 1 disk's worth of storage, so with
> > 6 drives, you still have 83% - 5/6 - of your storage available, as
> > opposed to 50% with RAID10).

with a 6 disk raid 5, you absolutely want a hot spare in the array.
an alternative is raid 6, which uses two drives' worth of parity,
however there is not a lot of good data on how raid 6 performs (ideally
it should be similar to raid 5). raid 5 is ideal for some things, for
example document storage, or databases where most of the activity takes
place in a small portion of the disks most of the time.
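
a rough sketch of the storage arithmetic, in python. the function usable_fraction and its layout names are made up for this example, and it assumes equal-size disks with a hot spare that holds no data until a failure. it shows that once you set aside a hot spare, the 6 disk raid 5 drops from 5/6 usable to 4/6, the same as raid 6.

```python
def usable_fraction(disks, layout, hot_spares=0):
    """Fraction of raw capacity left for data, assuming equal-size disks."""
    active = disks - hot_spares
    if layout == "raid10":
        data_disks = active // 2   # every block is stored twice
    elif layout == "raid5":
        data_disks = active - 1    # one disk's worth of parity
    elif layout == "raid6":
        data_disks = active - 2    # two disks' worth of parity
    else:
        raise ValueError(f"unknown layout: {layout}")
    return data_disks / disks


for label, layout, spares in [
    ("raid 10", "raid10", 0),
    ("raid 5", "raid5", 0),
    ("raid 5 + hot spare", "raid5", 1),
    ("raid 6", "raid6", 0),
]:
    print(f"{label:20s} {usable_fraction(6, layout, spares):.0%}")
```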

> Right, RAID 5 is certainly tempting since I get so much more storage.
>
> > Keep in mind that with 6 disks, theoretically (your mileage may vary by
> > raid controller implementation) you have more fault tolerance with
> > RAID10 than with RAID5.
>
> I'll also have the Slony system, so I think my degree of safety is still
> quite high with RAID-5.
>
> > Also, I don't think there's a lot of performance gain to going with the
> > 15k drives over the 10k. Even dell only says a 10% boost. I've
> > benchmarked a single drive configuration, 10k vs 15k rpm, and yes, the
> > 15k had substantially better seek times, but raw io isn't much
> > different, so again, it depends on your application's needs.

raw sequential i/o is actually not that important in many databases.
while the database tries to make data transfers sequential as much as
possible (especially for writing), improved random performance often
translates directly into database performance, especially if your
database is big.
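
a back-of-envelope sketch of why spindle speed shows up in random i/o, in python. the rotational latency part is plain arithmetic (on average the platter has to turn half a revolution). the average seek times passed in below are placeholder values i picked for illustration, not measurements of any particular drive, so treat the iops numbers as a rough shape only.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the sector to come around: half a revolution."""
    return 0.5 * 60_000 / rpm


def random_iops(rpm, avg_seek_ms):
    """Crude single-drive estimate: one seek plus half a turn per request."""
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


# seek times here are assumed placeholders, not vendor figures
for rpm, seek_ms in [(10_000, 4.7), (15_000, 3.5)]:
    latency = avg_rotational_latency_ms(rpm)
    print(f"{rpm} rpm: {latency:.1f} ms rotational latency, "
          f"~{random_iops(rpm, seek_ms):.0f} random iops")
```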

> Do you think the seek time may affect transaction commit time though,
> rather than just throughput? Or does it not make much difference since
> we have writeback?
>
> > Lastly, re your question on putting the WAL on the RAID10- I currently
> > have the box setup as RAID5x6 with the WAL and PGDATA all on the same
> > raidset. I haven't had the chance to do extensive tests, but from
> > previous readings, I gather that if you have write-back enabled on the
> > RAID, it should be ok (which it is in my case).

with 6 relatively small disks I think a single raid 10 volume is the
best bet.  however, above 6 disks a dedicated wal volume is usually
worth considering.  since wal storage requirements are so small, it's
becoming affordable to look at solid state for the wal.
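
to put a rough number on "small", here is a sketch in python. it leans on the rule of thumb from the postgresql 8.x documentation that there are normally no more than 2 * checkpoint_segments + 1 wal segment files of 16 MB each. the function name is made up, and the bound is an assumption that can be exceeded under heavy load or if archiving falls behind.

```python
def wal_normal_upper_bound_mb(checkpoint_segments, segment_mb=16):
    """Rule-of-thumb ceiling on pg_xlog size for 8.x-era settings."""
    return (2 * checkpoint_segments + 1) * segment_mb


# 3 is the 8.x default; 32 is an arbitrary larger setting for comparison
for segments in (3, 32):
    print(f"checkpoint_segments={segments}: "
          f"about {wal_normal_upper_bound_mb(segments)} MB of wal")
```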

merlin
