Re: best use of an EMC SAN - Mailing list pgsql-performance

From Chris Browne
Subject Re: best use of an EMC SAN
Msg-id 60zm22x2jo.fsf@dba2.int.libertyrms.com
In response to best use of an EMC SAN  (Dave Cramer <pg@fastcrypt.com>)
Responses Re: best use of an EMC SAN  (Jim Nasby <decibel@decibel.org>)
Re: best use of an EMC SAN  (Andrew Sullivan <ajs@crankycanuck.ca>)
List pgsql-performance
pg@fastcrypt.com (Dave Cramer) writes:
> On 11-Jul-07, at 10:05 AM, Gregory Stark wrote:
>
>> "Dave Cramer" <pg@fastcrypt.com> writes:
>>
>>> Assuming we have 24 73G drives, is it better to make one big
>>> metalun, carve it up, and let the SAN manage where everything is,
>>> or is it better to specify which spindles are where?
>>
>> This is quite a controversial question with proponents of both
>> strategies.
>>
>> I would suggest having one RAID-1 array for the WAL and throw the
>> rest of the
>
> This is quite unexpected. Since the WAL is primarily all writes,
> isn't RAID 1 the slowest of all for writing?

The thing is, the disk array caches this LIKE CRAZY.  I'm not quite
sure how many batteries are in there to back things up; there seem to
be multiple levels of battery backup, which means that as far as
fsync() is concerned, the data is committed very quickly even if it
takes a while to physically hit disk.
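
One way to see this effect on a given volume is to time fsync()
directly.  The following is a minimal sketch, not a benchmark: the
function name, the 8 kB block size (chosen to match PostgreSQL's
default page size), and the number of writes are all assumptions made
for illustration.

```python
import os
import tempfile
import time


def fsync_latencies(directory, writes=50, block_size=8192):
    """Append blocks to a scratch file in `directory`, calling fsync()
    after each write, and return the per-call fsync latencies in seconds."""
    fd, path = tempfile.mkstemp(dir=directory)
    block = b"\0" * block_size
    latencies = []
    try:
        for _ in range(writes):
            os.write(fd, block)
            start = time.perf_counter()
            os.fsync(fd)  # returns once the storage stack reports the data durable
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Point this at a directory on the volume under test; the system
    # temp directory is only a placeholder.
    samples = sorted(fsync_latencies(tempfile.gettempdir()))
    print("median fsync: %.3f ms" % (samples[len(samples) // 2] * 1000))
```

On an array with battery-backed write cache, the median should sit
well below the rotational latency of a single spindle, since fsync()
returns once the cache has accepted the data.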

One piece of the controversy is that the disks used for WAL are
certain to be written to heavily and continuously whenever your load
is heavy.  A consequence is that those disks are likely to be worked
harder than the disks used for storing "plain old data," with the
result that if you devote disks to WAL, you'll likely burn through
replacement drives faster there than you do for the "POD" disks.

It is not certain whether it is more desirable to:
a) Spread that wear and tear across the whole array, or
b) Target certain disks for that wear and tear, and expect to need to
   replace them somewhat more frequently.

At some point, I'd like to run a test on a decent disk array where we
compare multiple configurations.  Assuming 24 drives:

 - Use all 24 to make "one big filesystem" as the base case
 - Split off a set (6?) for WAL
 - Split off a set (6?  9?) to have a second tablespace, and shift
   indices there
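
The configurations above can be written down as a small test matrix.
This is only a sketch: the Layout class and its field names are
hypothetical, and it assumes each split is tested on its own against
the base case rather than combined, which the list does not specify.

```python
from dataclasses import dataclass

TOTAL_DRIVES = 24


@dataclass(frozen=True)
class Layout:
    name: str
    wal: int = 0    # drives set aside for WAL
    index: int = 0  # drives set aside for a second tablespace holding indices

    @property
    def data(self):
        # Whatever is not split off stays in the main filesystem.
        return TOTAL_DRIVES - self.wal - self.index


LAYOUTS = [
    Layout("one big filesystem"),
    Layout("separate WAL", wal=6),
    Layout("separate index tablespace", index=6),
    Layout("separate index tablespace (larger)", index=9),
]

for layout in LAYOUTS:
    print("%-36s data=%2d wal=%d index=%d"
          % (layout.name, layout.data, layout.wal, layout.index))
```

Running the same workload against each row, with everything else held
constant, is what would make the comparison meaningful.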

My suspicion is that the "use all 24 for one big filesystem" scenario
is likely to be fastest by some small margin, and that the other cases
will lose only a little in comparison.  Andrew Sullivan had a
somewhat similar finding a few years ago on some old Solaris hardware
that unfortunately isn't at all relevant today.  He basically found
that moving WAL off to separate disks didn't affect performance
materially.

What's quite regrettable is that it is almost sure to be difficult to
construct a test that, on a well-appointed modern disk array, won't
basically stay in cache.
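
As a rough illustration of the sizing problem, here is a sketch that
estimates how much of a uniformly random read workload would be served
from cache.  The uniform-access model, the function name, and the 8 GiB
cache size are all assumptions; real arrays and real workloads will
behave differently, and write-back caching is not modelled at all.

```python
GIB = 1024 ** 3


def expected_cache_hit_ratio(dataset_bytes, cache_bytes):
    """Fraction of uniformly random reads served from cache, assuming
    the cache holds an arbitrary cache-sized subset of the dataset."""
    if dataset_bytes <= 0:
        raise ValueError("dataset_bytes must be positive")
    return min(1.0, cache_bytes / dataset_bytes)


if __name__ == "__main__":
    cache = 8 * GIB  # hypothetical array cache size
    for dataset_gib in (4, 16, 64, 256):
        ratio = expected_cache_hit_ratio(dataset_gib * GIB, cache)
        print("%4d GiB dataset: %5.1f%% of reads from cache"
              % (dataset_gib, ratio * 100))
```

Under this model, the test dataset has to be many times the size of
the cache before the spindles, rather than the cache, dominate the
result.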
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/nonrdbms.html
16-inch Rotary Debugger: A highly effective tool for locating problems
in  computer   software.   Available   for  delivery  in   most  major
metropolitan areas.  Anchovies contribute to poor coding style.
