Re: Anyone using a SAN? - Mailing list pgsql-performance

From Greg Smith
Subject Re: Anyone using a SAN?
Date Wed, 13 Feb 2008
Msg-id Pine.GSO.4.64.0802131740410.24079@westnet.com
In response to Re: Anyone using a SAN?  (Tobias Brox <tobias@nordicbet.com>)
Responses Re: Anyone using a SAN?  ("Scott Marlowe" <scott.marlowe@gmail.com>)
List pgsql-performance
On Wed, 13 Feb 2008, Tobias Brox wrote:

> What I'm told is that the state-of-the-art SAN allows for an "insane
> amount" of hard disks to be installed, much more than what would fit
> into any decent database server.

You can attach a surprisingly large number of drives to a server nowadays,
but in general it's easier to manage larger numbers of them on a SAN.
Also, there are significant redundancy improvements using a SAN that are
worth quite a bit in some enterprise environments.  Being able to connect
all the drives, no matter how many, to two or more machines at once is
typically much easier to set up on a SAN than with more direct storage.

Basically the performance breaks down like this:

1) Going through the SAN interface (Fibre Channel etc.) introduces some
latency and a potential write bottleneck compared with direct storage,
everything else being equal.  This can really be a problem if you've got a
poor SAN vendor or interface issues you can't sort out.

2) It can be easier to manage a large number of disks in the SAN, so for
situations where aggregate disk throughput is the limiting factor the SAN
solution might make sense.

3) At the high-end, you can get SANs with more cache than any direct
controller I'm aware of, which for some applications can give them a
measurable lead over direct storage.  It's easy (albeit
expensive) to get an EMC array with 16GB worth of memory for caching on it
for example (and with 480 drives).  And since they've got a more robust
power setup than a typical server, you can even enable all the individual
drive caches usefully (that's 16-32MB each nowadays, so at say 100 disks
you've potentially got another 1.6GB of cache right there).  If you've got
a typical server, you can end up needing to turn off the individual
direct-attached drive write caches, because they may not survive a power
cycle even with a UPS, and just rely on the controller write cache
instead.

There's no universal advantage on either side here, just a different set
of trade-offs.  Certainly you'll never come close to the performance/$
that direct storage gets you if you buy it in SAN form instead, but at
higher budgets or with more demanding feature requirements a SAN may make
sense anyway.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
