Re: What exactly is postgres doing during INSERT/UPDATE ? - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: What exactly is postgres doing during INSERT/UPDATE ?
Msg-id b42b73150908300840o354b345fje4adf973f94a1ec2@mail.gmail.com
In response to Re: What exactly is postgres doing during INSERT/UPDATE ?  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On Sat, Aug 29, 2009 at 9:59 AM, Scott Marlowe<scott.marlowe@gmail.com> wrote:
> On Sat, Aug 29, 2009 at 2:46 AM, Greg Stark<gsstark@mit.edu> wrote:
>> On Sat, Aug 29, 2009 at 5:20 AM, Luke Koops<luke.koops@entrust.com> wrote:
>>> Joseph S Wrote
>>>> If I have 14 drives in a RAID 10 to split between data tables
>>>> and indexes what would be the best way to allocate the drives
>>>> for performance?
>>>
>>> RAID-5 can be much faster than RAID-10 for random reads and writes.  It is much slower than RAID-10 for sequential
>>> writes, but about the same for sequential reads.  For typical access patterns, I would put the data and indexes on
>>> RAID-5 unless you expect there to be lots of sequential scans.
>>
>> That's pretty much exactly backwards. RAID-5 will at best be slightly
>> slower than RAID-0 or RAID-10 for sequential reads or random reads.
>> For writes it performs *terribly*, especially for random
>> writes. The only write pattern where it performs ok sometimes is
>> sequential writes of large chunks.
>
> Note that while RAID-10 is theoretically always better than RAID-5,
> I've run into quite a few cheapie controllers that were heavily
> optimised for RAID-5 and de-optimised for RAID-10.  However, if it's
> got battery backed cache and can run in JBOD mode, linux software
> RAID-10 or hybrid RAID-1 in hardware RAID-0 in software will almost
> always beat hardware RAID-5 on the same controller.


RAID 5 can outperform RAID 10 on sequential writes in theory.  If you
are writing 100 MB of actual data to, say, an 8-drive array, the RAID 10
system has to write 200 MB and the RAID 5 system has to write 100
* (8/7), or about 114 MB.  Of course, the RAID 5 system also has to
compute parity, etc.
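
To make that arithmetic concrete, here is a rough Python sketch.  It
assumes ideal full-stripe sequential writes and ignores controller
cache and metadata; the function name physical_write_mb is made up for
illustration.

```python
def physical_write_mb(logical_mb: float, drives: int, level: str) -> float:
    """Approximate MB physically written for a full-stripe sequential write.

    Hypothetical helper: assumes ideal striping and no cache effects.
    """
    if level == "raid10":
        # Every block is written twice, once to each side of a mirror.
        return logical_mb * 2
    if level == "raid5":
        # Each stripe carries one parity chunk per (drives - 1) data chunks.
        return logical_mb * drives / (drives - 1)
    raise ValueError(f"unsupported level: {level}")


for level in ("raid10", "raid5"):
    print(level, round(physical_write_mb(100, 8, level), 1))
```

The gap narrows as the array gets smaller: with only 3 drives the same
formula gives 150 MB for RAID 5.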

For random writes, RAID 5 has to write to a minimum of two drives: the
data being written and the parity.  RAID 10 also has to write to two drives
minimum.  A lot of people think parity is a big deal in terms of the RAID
5 performance penalty, but I don't.  Relative to what's going on
in the drive, XOR calculation costs (one of the fastest operations in
computing) are basically zero, and are offloaded if you have a hardware
RAID controller.
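
As an illustration of how cheap the parity math itself is, here is a
small Python sketch of a parity update for a single-block write.  It
assumes the usual approach of folding the old data and old parity
into the new parity (which means those two blocks have to be read
first); the stripe layout and block contents are invented for the
example.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


# Hypothetical 4-drive stripe: three data blocks plus one parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(*data)

# Small random write to block 1: combine old parity, old data, new data.
new_block = b"\xff\x00"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# The shortcut matches recomputing parity over the whole stripe.
assert new_parity == xor_blocks(*data)
```

The XOR is trivial; the expensive part is the extra drive I/O around
it, not the calculation.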

I bet part of the problem with RAID 5 is actually contention, since
your write to one stripe can conflict with other writes to a different
stripe.  The other problem with RAID 5 that I see is that you don't
get very much extra protection.  It's pretty scary doing a rebuild,
even with a hot spare (and then you should probably be doing RAID 6).
On read performance RAID 10 wins all day long because more drives can
be involved.

merlin
