Re: Raid 5 vs Raid 10 Benchmarks Using bonnie++ - Mailing list pgsql-performance

From Aidan Van Dyk
Subject Re: Raid 5 vs Raid 10 Benchmarks Using bonnie++
Msg-id CAC_2qU_g63xWkgN8yX5ZsLaNESPkJ8Gau6FYfHZnyyXVi2zN+w@mail.gmail.com
In response to Re: Raid 5 vs Raid 10 Benchmarks Using bonnie++  (david@lang.hm)
List pgsql-performance
On Mon, Sep 12, 2011 at 8:47 PM,  <david@lang.hm> wrote:

>> XFS FAQ  goes over much of it, starting at Q24:
>>
>>  http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
>>
>> So, for pure performance, on a battery-backed controller, nobarrier is
>> the recommended *performance* setting.
>>
>> But, to throw a wrench into the plan, what happens when during normal
>> battery tests, your raid controller decides the battery is failing...
>> of course, it's going to start screaming and send all your monitoring
>> alarms off (you're monitoring that, right?), but have you thought to
>> make sure that your FS is remounted with barriers at the first sign of
>> battery trouble?
>
> yep.
>
> on a good raid card with battery backed cache, the performance difference
> between barriers being on and barriers being off should be minimal. If it's
> not, I think that you have something else going on.

The performance boost you get from nobarrier is that you avoid the
temporary stall in parallelism that barriers impose.  With barriers,
even if the controller cache doesn't really flush, you still have the
"can't send more writes to the device until the barrier'ed write is
done" rule, so at each of those points you have only a single write
command in flight.  The performance penalty of barriers on good cards
comes from how barriers work: they exist to keep the device from
reordering the persistence of writes, and they do that by waiting for
a write to be "persistent" before allowing more to be queued to the
device.

With nobarrier, you operate under the assumption that block device
writes are persisted in the order the commands are issued to the
device, so you never have to "drain the queue" as you do in the normal
barrier implementation, and you can (in theory) always have more
requests queued that the raid card can be processing, reordering, and
dispatching to platters for the maximum theoretical throughput...
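
To make the queue-drain cost concrete, here is a toy model in Python.
It is only a sketch: the function name, the "one time unit per write"
cost, and the idea that the card overlaps up to queue_depth writes are
all assumptions for illustration, not measurements of any real
controller.

```python
import math

def time_units(n_writes, barrier_every, queue_depth, barriers=True):
    """Toy model: each write costs 1 time unit and the device can
    overlap up to queue_depth writes (hypothetical numbers)."""
    if not barriers or barrier_every <= 0:
        # nobarrier: the queue is never drained, so it stays full
        return math.ceil(n_writes / queue_depth)
    total = 0
    pending = 0
    for i in range(1, n_writes + 1):
        if i % barrier_every == 0:
            # drain everything queued before the barrier'ed write
            total += math.ceil(pending / queue_depth)
            # the barrier'ed write is the only command in flight
            total += 1
            pending = 0
        else:
            pending += 1
    # whatever is still queued after the last barrier
    total += math.ceil(pending / queue_depth)
    return total

print(time_units(1000, 10, 32, barriers=False))
print(time_units(1000, 10, 32, barriers=True))
```

In this model the gap comes entirely from the points where only one
command is in flight, not from any cache flush, which is the effect
described above.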

Of course, Linux has completely rewritten the sync/barrier/flush
methods over the past few years, and there is no guarantee the
implementation details won't keep changing in the future, so keep up
on the filesystem details of whatever you're using...
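
One cheap way to keep an eye on this is to check what a filesystem is
actually mounted with.  A minimal sketch in Python, assuming the
/proc/mounts line layout (device, mountpoint, type, options, ...) and
assuming the option shows up as "nobarrier", "barrier=0", "barrier" or
"barrier=1"; the exact strings depend on the filesystem and kernel
version, so treat them as placeholders to verify on your own system.
The function name is made up for this example.

```python
def barrier_setting(mounts_text, mountpoint):
    """Return 'off', 'on', or 'default' for mountpoint, or None if it
    is not listed.  Option names here are assumptions; check them
    against your filesystem's documentation."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mountpoint:
            continue
        options = fields[3].split(",")
        if "nobarrier" in options or "barrier=0" in options:
            result = "off"
        elif "barrier" in options or "barrier=1" in options:
            result = "on"
        else:
            result = "default"
    # the last matching line wins, since a later mount shadows an earlier one
    return result

# Usage on a Linux box (the mountpoint is hypothetical):
#   with open("/proc/mounts") as f:
#       print(barrier_setting(f.read(), "/var/lib/pgsql"))
```

A check like this could feed the same monitoring that watches the
battery, so a "nobarrier" mount on a controller with a failing battery
gets flagged.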

So keep doing burn-ins, with real pull-the-cord tests... They can't
"prove" it's 100% safe, but they can quickly prove when it's not ;-)
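
For the pull-the-cord part, the usual shape of such a test is: write
sequenced records, fsync each one, record which sequence numbers were
acknowledged somewhere that survives the power cut, then check after
reboot that every acknowledged record is still on disk.  A rough
Python sketch of that idea follows; the record format and function
names are invented for this example, and in a real test the report
callback would send the number to another machine rather than print it.

```python
import os

RECORD = 16  # bytes per record: 15-digit sequence number plus newline

def write_acked(path, count, report=print):
    """Append `count` sequenced records, fsync after each one, and only
    then report that sequence number as acknowledged."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for seq in range(count):
            os.write(fd, b"%015d\n" % seq)
            os.fsync(fd)   # returns once the OS claims the data is durable
            report(seq)    # acknowledged: must survive a power cut
    finally:
        os.close(fd)

def last_intact(path):
    """Return the highest sequence number that is present, complete and
    in order from the start of the file, or -1 if there is none."""
    last = -1
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD)
            if len(rec) < RECORD or rec != b"%015d\n" % (last + 1):
                return last
            last += 1
```

After the cord is pulled and the box is back up, last_intact() on the
file should be at least the last number reported.  If it is lower,
something in the stack acknowledged a write it had not persisted.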

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
