Re: What's the best hardver for PostgreSQL 8.1? - Mailing list pgsql-performance

From Ron
Subject Re: What's the best hardver for PostgreSQL 8.1?
Date
Msg-id 6.2.5.6.0.20051227110304.01dc1000@earthlink.net
In response to Re: What's the best hardver for PostgreSQL 8.1?  (Michael Stone <mstone+postgres@mathom.us>)
List pgsql-performance
At 08:35 AM 12/27/2005, Michael Stone wrote:
>On Mon, Dec 26, 2005 at 10:11:00AM -0800, David Lang wrote:
>>what slows down raid 5 is that to modify a block you have to read
>>blocks from all your drives to re-calculate the parity. this
>>interleaving of reads and writes when all you are logically doing is
>>writes can really hurt. (this is why I asked the question that got
>>us off on this tangent, when doing new writes to an array you don't
>>have to read the blocks as they are blank, assuming your caching
>>is enough so that you can write blocksize*n before the system
>>starts actually writing the data)
>
>Correct; there's no reason for the controller to read anything back
>if your write will fill a complete stripe. That's why I said that
>there isn't a "RAID 5 penalty" assuming you've got a reasonably fast
>controller and you're doing large sequential writes (or have enough
>cache that random writes can be batched as large sequential writes).
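To make the mechanism being described concrete, here is a minimal
Python sketch that just counts the physical disk operations behind one
logical write.  The stripe width and block counts are illustrative
assumptions, not measurements from any of the systems discussed:

# Sketch only: count physical I/Os per logical write.
# Assumed geometry: one RAID 5 set with 6 data disks + 1 parity disk.

def raid5_write_ops(blocks_written, data_disks):
    """Physical I/Os for a logical write on RAID 5.

    Full stripes need no reads: write all data blocks plus parity.
    Partial ("small") writes use read-modify-write: read old data and
    old parity, then write new data and new parity.
    """
    full_stripes, remainder = divmod(blocks_written, data_disks)
    ops = full_stripes * (data_disks + 1)   # full-stripe writes, no reads
    if remainder:
        ops += 2 * remainder + 2            # read+write data, read+write parity
    return ops

def raid10_write_ops(blocks_written):
    """RAID 10: every logical write is two physical writes (mirror pair)."""
    return 2 * blocks_written

for blocks in (1, 6, 12):
    print(blocks, "block(s):",
          "RAID 5 =", raid5_write_ops(blocks, data_disks=6), "I/Os,",
          "RAID 10 =", raid10_write_ops(blocks), "I/Os")

With those assumptions a single-block random write costs 4 physical
I/Os on RAID 5 versus 2 on RAID 10, while a full-stripe write on RAID
5 needs no reads at all; that is exactly why cache and write size
decide which case you end up in.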

Sorry.  A decade+ of real-world experience in production with RAID 5,
using controllers as bad as Adaptec and as good as Mylex, Chaparral,
LSI Logic (including their Engenio stuff), and Xyratex, under 5
different OS's (Sun, Linux, M$, DEC, HP) on each of Oracle, SQL
Server, DB2, MySQL, and pg, shows that RAID 5 writes are slower than
RAID 5 reads.

The one notable exception was the Mylex controller that was so good
IBM bought Mylex to put them out of business.

Enough IO load, random or sequential, will cause the effect no matter
how much cache you have or how fast the controller is.

The even bigger problem that everyone is ignoring here is that large
RAID 5's spend an increasingly large percentage of their time with 1
failed HD in them.  The math of having that many HDs operating
simultaneously 24x7 makes it inevitable.

This means you are operating in degraded mode an increasingly large
percentage of the time, under exactly the circumstances you least
want to be.  In addition, you are =one= HD failure from data loss on
that array an increasingly large percentage of the time, again under
exactly the circumstances you least want to be.
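To put rough numbers on "inevitable", here is a back-of-the-envelope
Python sketch.  The MTBF, rebuild window, and drive count are
assumptions picked for illustration, not figures from any array I've
run:

# Sketch only: how often a large array is degraded, and how exposed
# it is while rebuilding.  All three inputs below are assumptions.

mtbf_hours    = 500_000   # optimistic per-drive MTBF
rebuild_hours = 24        # detect + replace + rebuild window
drives        = 48        # one large RAID 5 set

# Fraction of time a single drive is down (failure rate * repair time).
p_drive_down = rebuild_hours / mtbf_hours

# Fraction of time the array has at least one failed drive,
# i.e. time spent in degraded mode; it grows with drive count.
p_degraded = 1 - (1 - p_drive_down) ** drives

# While degraded, one more failure among the survivors loses the array.
p_second_failure = 1 - (1 - rebuild_hours / mtbf_hours) ** (drives - 1)

print("fraction of time degraded:        {:.3%}".format(p_degraded))
print("P(2nd failure during a rebuild):  {:.3%}".format(p_second_failure))

Double the drive count and both numbers roughly double; stretch the
rebuild window to a few days (which is what actually happens when the
replacement has to be ordered) and they grow again.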

RAID 5 is not a silver bullet.


>  On Mon, Dec 26, 2005 at 06:04:40PM -0500, Alex Turner wrote:
>>Yes, but those blocks in RAID 10 are largely irrelevant as they are
>>to independent disks.  In RAID 5 you have to write parity to an
>>'active' drive that is part of the stripe.
>
>Once again, this doesn't make any sense. Can you explain which parts of
>a RAID 10 array are inactive?
>
>>I agree totally that the read+parity-calc+write in the worst case
>>is totally bad, which is why I always recommend people should _never
>>ever_ use RAID 5.   In this day and age of large capacity chassis,
>>and large capacity SATA drives, RAID 5 is totally inappropriate IMHO
>>for _any_ application, least of all databases.

I vote with Michael here.  This is an extreme position to take, one
that can't be followed under many circumstances ITRW.


>So I've got a 14 drive chassis full of 300G SATA disks and need at
>least 3.5TB of data storage. In your mind the only possible solution
>is to buy another 14 drive chassis? Must be nice to never have a budget.

I think you mean an infinite budget.  That's even assuming it's
possible to get the HDs you need.  I've had arrays that used all the
space I could give them in 160 HD cabinets.  Two 160 HD cabinets were
neither within the budget nor going to perform well.  I =had= to use
RAID 5.  RAID 10 was simply not space-efficient enough.
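For the 14 x 300GB example above, the capacity arithmetic is easy to
check.  A quick sketch (the hot-spare layout is my assumption, not
something anyone in the thread specified):

drive_gb = 300
drives   = 14

raid10_usable = (drives // 2) * drive_gb   # mirrored pairs: 7 * 300 = 2100 GB
raid5_usable  = (drives - 1) * drive_gb    # one parity drive: 13 * 300 = 3900 GB
raid5_spare   = (drives - 2) * drive_gb    # RAID 5 plus one hot spare: 3600 GB

print("RAID 10:            ", raid10_usable, "GB")
print("RAID 5:             ", raid5_usable, "GB")
print("RAID 5 + hot spare: ", raid5_spare, "GB")

RAID 10 tops out around 2.1TB in that chassis, so the 3.5TB requirement
simply cannot be met with it; RAID 5, even with a hot spare, clears it.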


>Must be a hard sell if you've bought decent enough hardware that
>your benchmarks can't demonstrate a difference between a RAID 5 and
>a RAID 10 configuration on that chassis except in degraded mode (and
>the customer doesn't want to pay double for degraded mode performance)

I have =never= had this situation.  RAID 10 latency is better than
RAID 5 latency.  RAID 10 write speed under heavy enough load, of any
type, is faster than RAID 5 write speed under the same
circumstances.  RAID 10 robustness is better as well.

The problem is that sometimes budget limits, or the number of HDs
needed, mean you can't use RAID 10.


>>In reality I have yet to benchmark a system where RAID 5 on the
>>same number of drives with 8 drives or less in a single array beat
>>a RAID 10 with the same number of drives.
>
>Well, those are frankly little arrays, probably on lousy controllers...
Nah.  Regardless of controller, I can take any RAID 5 and any RAID 10
built on the same HW under the same OS running the same DBMS and
=guarantee= there is an IO load above which it can be shown that the
RAID 10 will do writes faster than the RAID 5.  The only exception in
my career thus far has been the aforementioned Mylex controller.
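To see why such a crossover load always exists, a simplified model is
enough.  The per-spindle IOPS figure and drive count below are
assumptions for illustration, not benchmark results:

# Sketch only: sustained random-write ceilings once write-back cache
# can no longer absorb the load.  Inputs are assumptions.

drive_iops = 150    # small random I/Os per second per spindle
drives     = 14

# Each logical random write costs ~2 physical I/Os on RAID 10 (mirror
# pair) and ~4 on RAID 5 (read data, read parity, write data, write parity).
raid10_ceiling = drives * drive_iops / 2
raid5_ceiling  = drives * drive_iops / 4

print("RAID 10 sustained random-write ceiling:", int(raid10_ceiling), "IOPS")
print("RAID 5 sustained random-write ceiling: ", int(raid5_ceiling), "IOPS")

Below both ceilings a good write-back cache can make the two arrays
look identical; push the offered load past the RAID 5 ceiling and its
cache fills, at which point RAID 5 latency climbs first.  That is the
kind of IO load I'm referring to.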

OTOH, sometimes you have no choice but to "take the hit" and use RAID 5.


cheers,
Ron


