Re: What's the best hardver for PostgreSQL 8.1? - Mailing list pgsql-performance

From Bruce Momjian
Subject Re: What's the best hardver for PostgreSQL 8.1?
Msg-id 200512271751.jBRHpw307927@candle.pha.pa.us
In response to Re: What's the best hardver for PostgreSQL 8.1?  (Ron <rjpeace@earthlink.net>)
Responses Re: What's the best hardver for PostgreSQL 8.1?  ("Luke Lonergan" <llonergan@greenplum.com>)
List pgsql-performance

Historically, I have heard that RAID5 is only faster than RAID10 if
there are six or more drives.

---------------------------------------------------------------------------

Ron wrote:
> At 08:35 AM 12/27/2005, Michael Stone wrote:
> >On Mon, Dec 26, 2005 at 10:11:00AM -0800, David Lang wrote:
> >>what slows down raid 5 is that to modify a block you have to read
> >>blocks from all your drives to re-calculate the parity. this
> >>interleaving of reads and writes when all you are logically doing is
> >>writes can really hurt. (this is why I asked the question that got
> >>us off on this tangent, when doing new writes to an array you don't
> >>have to read the blocks as they are blank, assuming your caching
> >>is enough so that you can write blocksize*n before the system
> >>starts actually writing the data)
> >
> >Correct; there's no reason for the controller to read anything back
> >if your write will fill a complete stripe. That's why I said that
> >there isn't a "RAID 5 penalty" assuming you've got a reasonably fast
> >controller and you're doing large sequential writes (or have enough
> >cache that random writes can be batched as large sequential writes).
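
The parity arithmetic behind the two quoted paragraphs above can be sketched in a few lines. This is a minimal illustration, assuming single-parity RAID 5 with XOR parity; the function names are hypothetical and no real controller API is implied. A small write is shown as the usual read-modify-write (read old data and old parity, write new data and new parity); a controller may instead read the other data blocks in the stripe, which is the case David describes.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(data_blocks: list[bytes]) -> bytes:
    """Full-stripe write: parity comes from the new data alone, so nothing is read back."""
    return xor_blocks(*data_blocks)


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: new parity = old parity XOR old data XOR new data.

    Costs two reads (old data, old parity) and two writes (new data, new parity).
    """
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical 3-data-block stripe, 2 bytes per block.
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_stripe_parity(stripe)

# Overwrite only the middle block and update parity incrementally.
new_block = b"\xaa\xbb"
updated_parity = small_write_parity(stripe[1], parity, new_block)
```

Both paths arrive at the same parity; the difference is that the small write needs data that is already on disk, while the full-stripe write does not.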
>
> Sorry.  A decade+ of real-world experience in production with RAID 5,
> using controllers as bad as Adaptec and as good as Mylex, Chaparral,
> LSI Logic (including their Engenio stuff), and Xyratex, under 5
> different OS's (Sun, Linux, M$, DEC, HP) on each of Oracle, SQL Server,
> DB2, mySQL, and pg, shows that RAID 5 writes are slower than RAID 5 reads.
>
> With the one notable exception of the Mylex controller that was so
> good IBM bought Mylex to put them out of business.
>
> Enough IO load, random or sequential, will cause the effect no matter
> how much cache you have or how fast the controller is.
>
> The even bigger problem that everyone is ignoring here is that large
> RAID 5's spend increasingly larger percentages of their time with 1
> failed HD in them.  The math of having that many HDs operating
> simultaneously 24x7 makes it inevitable.
>
> This means you are operating in degraded mode an increasingly larger
> percentage of the time under exactly the circumstances you least want
> to be.  In addition, you are =one= HD failure from data loss on that
> array an increasingly larger percentage of the time under exactly the
> circumstances you least want to be.
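
Ron's point about large arrays can be put in rough numbers. The following is a minimal sketch, assuming drives fail independently at a constant annual rate; the 3% failure rate and 24-hour rebuild time are purely illustrative assumptions, not figures from this thread, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760.0


def p_any_failure(n_drives: int, annual_failure_rate: float) -> float:
    """Probability that at least one of n independent drives fails within a year."""
    return 1.0 - (1.0 - annual_failure_rate) ** n_drives


def approx_fraction_degraded(n_drives: int, annual_failure_rate: float,
                             rebuild_hours: float) -> float:
    """Rough share of the year spent degraded.

    Expected failures per year (n * AFR) times the hours each one leaves the
    array degraded, divided by hours in a year. Only a first-order estimate,
    reasonable while the result is small.
    """
    return n_drives * annual_failure_rate * rebuild_hours / HOURS_PER_YEAR


# Illustrative assumptions only: 3% AFR, 24 h to replace and rebuild.
for n in (6, 14, 160):
    print(n, p_any_failure(n, 0.03), approx_fraction_degraded(n, 0.03, 24.0))
```

Under these assumptions both quantities grow with the drive count, which is the trend Ron describes; the absolute values depend entirely on the failure rate and rebuild time chosen.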
>
> RAID 5 is not a silver bullet.
>
>
> >  On Mon, Dec 26, 2005 at 06:04:40PM -0500, Alex Turner wrote:
> >>Yes, but those blocks in RAID 10 are largely irrelevant as they are
> >>to independent disks.  In RAID 5 you have to write parity to an
> >>'active' drive that is part of the stripe.
> >
> >Once again, this doesn't make any sense. Can you explain which parts of
> >a RAID 10 array are inactive?
> >
> >>I agree totally that the read+parity-calc+write in the worst case
> >>is totally bad, which is why I always recommend people should _never
> >>ever_ use RAID 5.   In this day and age of large capacity chassis,
> >>and large capacity SATA drives, RAID 5 is totally inappropriate IMHO
> >>for _any_ application least of all databases.
> I vote with Michael here.  This is an extreme position to take that
> can't be followed under many circumstances ITRW.
>
>
> >So I've got a 14 drive chassis full of 300G SATA disks and need at
> >least 3.5TB of data storage. In your mind the only possible solution
> >is to buy another 14 drive chassis? Must be nice to never have a budget.
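
The capacity arithmetic in Michael's example is easy to check. A minimal sketch, assuming a single array with no hot spares, decimal units (3.5TB taken as 3500 GB), single-parity RAID 5, and two-way mirrors for RAID 10; the helper names are hypothetical.

```python
import math


def raid5_usable_gb(n_drives: int, drive_gb: int) -> int:
    """RAID 5 gives up one drive's worth of space to parity."""
    return (n_drives - 1) * drive_gb


def raid10_usable_gb(n_drives: int, drive_gb: int) -> int:
    """RAID 10 with two-way mirrors gives up half the drives."""
    return (n_drives // 2) * drive_gb


def raid10_drives_needed(required_gb: int, drive_gb: int) -> int:
    """Smallest even drive count whose mirrored capacity meets the requirement."""
    return 2 * math.ceil(required_gb / drive_gb)


DRIVES, DRIVE_GB, REQUIRED_GB = 14, 300, 3500

raid5_gb = raid5_usable_gb(DRIVES, DRIVE_GB)               # 13 * 300
raid10_gb = raid10_usable_gb(DRIVES, DRIVE_GB)             # 7 * 300
raid10_min = raid10_drives_needed(REQUIRED_GB, DRIVE_GB)   # 2 * ceil(3500 / 300)
```

With these assumptions the 14-drive chassis yields 3900 GB as RAID 5 and 2100 GB as RAID 10, and RAID 10 would need 24 drives to reach 3500 GB, which is why a second chassis comes up.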
>
> I think you mean an infinite budget.  That's even assuming it's
> possible to get the HD's you need.  I've had arrays that used all the
> space I could give them in 160 HD cabinets.  Two 160 HD cabinets was
> neither within the budget nor going to perform well.  I =had= to use
> RAID 5.  RAID 10 was just not usage efficient enough.
>
>
> >Must be a hard sell if you've bought decent enough hardware that
> >your benchmarks can't demonstrate a difference between a RAID 5 and
> >a RAID 10 configuration on that chassis except in degraded mode (and
> >the customer doesn't want to pay double for degraded mode performance)
>
> I have =never= had this situation.  RAID 10 latency is better than
> RAID 5 latency.  RAID 10 write speed under heavy enough load, of any
> type, is faster than RAID 5 write speed under the same
> circumstances.  RAID 10 robustness is better as well.
>
> Problem is that sometimes budget limits or number of HDs needed
> limits mean you can't use RAID 10.
>
>
> >>In reality I have yet to benchmark a system where RAID 5 on the
> >>same number of drives with 8 drives or less in a single array beat
> >>a RAID 10 with the same number of drives.
> >
> >Well, those are frankly little arrays, probably on lousy controllers...
> Nah.  Regardless of controller I can take any RAID 5 and any RAID 10
> built on the same HW under the same OS running the same DBMS and
> =guarantee= there is an IO load above which it can be shown that the
> RAID 10 will do writes faster than the RAID 5.  The only exception in
> my career thus far has been the aforementioned Mylex controller.
>
> OTOH, sometimes you have no choice but to "take the hit" and use RAID 5.
>
>
> cheers,
> Ron
>
>
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
