Dell Hardware Recommendations

From: Joe Uhl
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now.  This represents data
specific to 1 US state.  We are in the process of planning a deployment
that will service all 50 US states.

If 30 GB is an accurate number per state that means the database size is
about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
data that is heavy-read but only updated or inserted in batch.  It is
also largely isolated to a single table partitioned on state.  This
portion of the data will grow very slowly after the initial loading.

The remaining 500 GB has frequent individual writes performed against
it.  500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.

I am trying to figure out an appropriate hardware configuration for such
a database.  Currently I am considering the following:

PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.

The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
12 250GB SATA drives to cover the 1.5TB we would be able to add another
250GB of usable space for future growth before needing to get a bigger
set of disks.  500GB drives would leave a lot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the meantime.  I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
difference makes SATA awfully appealing.
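
For a sanity check on the sizing, the arithmetic above can be sketched out (a rough model only; the per-state size and drive capacities are the figures from this post, and RAID 10 overhead is the usual half):

```python
# Back-of-envelope sizing from the figures in this post: 30 GB per
# state, 50 states, and RAID 10 halving raw capacity.  Arithmetic only,
# not a recommendation.

def raid10_usable_gb(drives: int, size_gb: int) -> int:
    """Usable RAID 10 capacity: half the drives hold mirror copies."""
    return (drives // 2) * size_gb

projected_db_gb = 30 * 50            # 1500 GB projected database size
print(projected_db_gb)               # 1500

print(raid10_usable_gb(12, 250))     # 1500 GB: covers the data exactly
print(raid10_usable_gb(14, 250))     # 1750 GB: ~250 GB growth room
print(raid10_usable_gb(12, 500))     # 3000 GB: far more headroom
```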

We plan to involve outside help in getting this database tuned and
configured, but want to get some hardware ballparks in order to get
quotes and potentially request a trial unit.

Any thoughts or recommendations?  We are running openSUSE 10.2 with
kernel 2.6.18.2-34.

Regards,

Joe Uhl
joeuhl@gmail.com


Re: Dell Hardware Recommendations

From: Decibel!
On Thu, Aug 09, 2007 at 03:47:09PM -0400, Joe Uhl wrote:
> We have a 30 GB database (according to pg_database_size) running nicely
> on a single Dell PowerEdge 2850 right now.  This represents data
> specific to 1 US state.  We are in the process of planning a deployment
> that will service all 50 US states.
>
> If 30 GB is an accurate number per state that means the database size is
> about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> data that is heavy-read but only updated or inserted in batch.  It is
> also largely isolated to a single table partitioned on state.  This
> portion of the data will grow very slowly after the initial loading.
>
> The remaining 500 GB has frequent individual writes performed against
> it.  500 GB is a high estimate and it will probably start out closer to
> 100 GB and grow steadily up to and past 500 GB.

What kind of transaction rate are you looking at?

> I am trying to figure out an appropriate hardware configuration for such
> a database.  Currently I am considering the following:
>
> PowerEdge 1950 paired with a PowerVault MD1000
> 2 x Quad Core Xeon E5310
> 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)

16GB for 500GB of active data is probably a bit light.

> PERC 5/E Raid Adapter
> 2 x 146 GB SAS in Raid 1 for OS + logs.
> A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
>
> The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> 250GB of usable space for future growth before needing to get a bigger
> set of disks.  500GB drives would leave a lot more room and could allow
> us to run the MD1000 in split mode and use its remaining disks for other
> purposes in the meantime.  I would greatly appreciate any feedback with
> respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> difference makes SATA awfully appealing.

Well, how does this compare with what you have right now? And do you
expect your query rate to be 50x what it is now, or higher?

> We plan to involve outside help in getting this database tuned and
> configured, but want to get some hardware ballparks in order to get
> quotes and potentially request a trial unit.

You're doing a very wise thing by asking for information before
purchasing (unfortunately, many people put that cart before the horse).
This list is a great resource for information, but there's no real
substitute for working directly with someone and being able to discuss
your actual system in detail, so I'd suggest getting outside help
involved before actually purchasing or even evaluating hardware. There's
a lot to think about beyond just drives and memory with the kind of
expansion you're looking at. For example, what ability do you have to
scale past one machine? Do you have a way to control your growth rate?
How well will the existing design scale out? (Oftentimes a design that
works well for a smaller set of data is sub-optimal for a large set of
data.)

Something else that might be worth looking at is having your existing
workload modeled; that allows building a pretty accurate estimate of
what kind of hardware would be required to hit a different workload.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


Re: Dell Hardware Recommendations

From: "Merlin Moncure"
On 8/9/07, Joe Uhl <joeuhl@gmail.com> wrote:
> We have a 30 GB database (according to pg_database_size) running nicely
> on a single Dell PowerEdge 2850 right now.  This represents data
> specific to 1 US state.  We are in the process of planning a deployment
> that will service all 50 US states.
>
> If 30 GB is an accurate number per state that means the database size is
> about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> data that is heavy-read but only updated or inserted in batch.  It is
> also largely isolated to a single table partitioned on state.  This
> portion of the data will grow very slowly after the initial loading.
>
> The remaining 500 GB has frequent individual writes performed against
> it.  500 GB is a high estimate and it will probably start out closer to
> 100 GB and grow steadily up to and past 500 GB.
>
> I am trying to figure out an appropriate hardware configuration for such
> a database.  Currently I am considering the following:
>
> PowerEdge 1950 paired with a PowerVault MD1000
> 2 x Quad Core Xeon E5310
> 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> PERC 5/E Raid Adapter
> 2 x 146 GB SAS in Raid 1 for OS + logs.
> A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
>
> The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> 250GB of usable space for future growth before needing to get a bigger
> set of disks.  500GB drives would leave a lot more room and could allow
> us to run the MD1000 in split mode and use its remaining disks for other
> purposes in the meantime.  I would greatly appreciate any feedback with
> respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> difference makes SATA awfully appealing.

I'm getting a MD1000 tomorrow to play with for just this type of
analysis as it happens.  First of all, move the o/s drives to the
backplane and get the cheapest available.

I might consider picking up an extra PERC 5/E, since the MD1000 is
active/active, and do either raid 10 or 05 with one of the raid levels
in software.  For example, two raid 5 volumes (hardware raid 5)
striped in software as raid 0.  A 15k SAS drive is worth at least two
SATA drives (unless they are raptors) for OLTP performance loads.

Where the extra controller especially pays off is if you have to
expand to a second tray.  It's easy to add trays but installing
controllers on a production server is scary.

Raid 10 is usually better for databases but in my experience it's a
roll of the dice.  If you factor cost into the matrix a SAS raid 05
might outperform a SATA raid 10 because you are getting better storage
utilization out of the drives (n - 2 vs. n / 2).  Then again, you
might not.
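
Merlin's n - 2 vs. n / 2 utilization point, made concrete (a toy sketch; 14 data drives assumed, as in the MD1000 discussion above):

```python
# Usable-capacity comparison for the same drive count: "raid 05" here is
# Merlin's setup of two hardware RAID 5 volumes striped in software as
# RAID 0, losing one drive per RAID 5 set to parity.

def usable_drives_raid05(n: int) -> int:
    """n drives split into two RAID 5 sets: 2 drives' worth of parity."""
    return n - 2

def usable_drives_raid10(n: int) -> int:
    """Half the drives in a RAID 10 hold mirror copies."""
    return n // 2

n = 14  # data drives in one MD1000, hot spare excluded
print(usable_drives_raid05(n))   # 12 drives' worth of usable capacity
print(usable_drives_raid10(n))   # 7 drives' worth
```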

merlin

Re: Dell Hardware Recommendations

From: Decibel!
On Thu, Aug 09, 2007 at 05:50:10PM -0400, Merlin Moncure wrote:
> Raid 10 is usually better for databases but in my experience it's a
> roll of the dice.  If you factor cost into the matrix a SAS raid 05
> might outperform a SATA raid 10 because you are getting better storage
> utilization out of the drives (n - 2 vs. n / 2).  Then again, you
> might not.

It's going to depend heavily on the controller and the workload.
Theoretically, if most of your writes are to stripes that the controller
already has cached then you could actually out-perform RAID10. But
that's a really, really big IF, because if the stripe isn't in cache you
have to read the entire thing in before you can do the write... and that
costs *a lot*.

Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.

Speaking of failures, keep in mind that a normal RAID5 puts you only 2
drive failures away from data loss, while with RAID10 you can
potentially lose half the array without losing any data. If you do RAID5
with multiple parity copies that does change things; I'm not sure which
is better at that point (I suspect it matters how many drives are
involved).

The comment about the extra controller isn't a bad idea, although I
would hope that you'll have some kind of backup server available, which
makes an extra controller much less useful.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


Re: Dell Hardware Recommendations

From: Arjen van der Meijden
On 9-8-2007 23:50 Merlin Moncure wrote:
> Where the extra controller especially pays off is if you have to
> expand to a second tray.  It's easy to add trays but installing
> controllers on a production server is scary.

For connectivity's sake that's not a necessity. You can either connect
(two?) extra MD1000's to your first MD1000, or you can use the second
external SAS port on your controller. Obviously it depends on the
controller whether it's good enough to just add the disks to it rather
than adding another controller for the second tray. Whether the PERC 5/E
is good enough for that, I don't know; we've only equipped ours with a
single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
scaled pretty well going from a few to all 14 disks (+1 hot spare).

Best regards,

Arjen

Re: Dell Hardware Recommendations

From: "Joe Uhl"
Thanks for the input.  Thus far we have used Dell but I would certainly
be willing to explore other options.

I found a "Reference Guide" for the MD1000 from April 2006 that
includes info on the PERC 5/E at:

http://www.dell.com/downloads/global/products/pvaul/en/pvaul_md1000_solutions_guide.pdf

To answer the questions below:

> How many users do you expect to hit the db at the same time?
There are 2 types of users.  For roughly every 5000 active accounts, 10
or fewer of those will have additional privileges.  Only those more
privileged users interact substantially with the OLAP portion of the
database.  For 1 state 10 concurrent connections was about the max, so
if that holds for 50 states we are looking at 500 concurrent users as a
top end, with a very small fraction of those users interacting with the
OLAP portion.

> How big of a dataset will each one be grabbing at the same time?
For the OLTP data it is mostly single object reads and writes and
generally touches only a few tables at a time.

> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
According to the above link, it has a 256 MB cache that is battery
backed.

> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
Right now this system isn't in production so we can explore any option.
We are looking into splitting the OLAP and OLTP portions right now and I
imagine physically splitting the partitions on the big OLAP table is an
option as well.

Really appreciate all of the advice.  Before we pull the trigger on
hardware we probably will get some external advice from someone but I
knew this list would provide some excellent ideas and feedback to get us
started.

Joe Uhl
joeuhl@gmail.com

On Thu, 9 Aug 2007 16:02:49 -0500, "Scott Marlowe"
<scott.marlowe@gmail.com> said:
> On 8/9/07, Joe Uhl <joeuhl@gmail.com> wrote:
> > We have a 30 GB database (according to pg_database_size) running nicely
> > on a single Dell PowerEdge 2850 right now.  This represents data
> > specific to 1 US state.  We are in the process of planning a deployment
> > that will service all 50 US states.
> >
> > If 30 GB is an accurate number per state that means the database size is
> > about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> > data that is heavy-read but only updated or inserted in batch.  It is
> > also largely isolated to a single table partitioned on state.  This
> > portion of the data will grow very slowly after the initial loading.
> >
> > The remaining 500 GB has frequent individual writes performed against
> > it.  500 GB is a high estimate and it will probably start out closer to
> > 100 GB and grow steadily up to and past 500 GB.
> >
> > I am trying to figure out an appropriate hardware configuration for such
> > a database.  Currently I am considering the following:
> >
> > PowerEdge 1950 paired with a PowerVault MD1000
> > 2 x Quad Core Xeon E5310
> > 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> > PERC 5/E Raid Adapter
> > 2 x 146 GB SAS in Raid 1 for OS + logs.
> > A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
> >
> > The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> > 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> > 250GB of usable space for future growth before needing to get a bigger
> > set of disks.  500GB drives would leave a lot more room and could allow
> > us to run the MD1000 in split mode and use its remaining disks for other
> > purposes in the meantime.  I would greatly appreciate any feedback with
> > respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> > difference makes SATA awfully appealing.
> >
> > We plan to involve outside help in getting this database tuned and
> > configured, but want to get some hardware ballparks in order to get
> > quotes and potentially request a trial unit.
> >
> > Any thoughts or recommendations?  We are running openSUSE 10.2 with
> > kernel 2.6.18.2-34.
>
> Some questions:
>
> How many users do you expect to hit the db at the same time?
> How big of a dataset will each one be grabbing at the same time?
> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
>
> A few comments:
>
> I'd go with the bigger drives.  Just as many, so you have spare
> storage as you need it.  You never know when you'll need to migrate
> your whole data set from one pg db to another for testing etc...
> extra space comes in REAL handy when things aren't quite going right.
> With 10krpm 500 and 750 Gig drives you can use smaller partitions on
> the bigger drives to short-stroke them and often outrun supposedly
> faster drives.
>
> The difference between SAS and SATA drives is MUCH less important than
> the difference between one RAID controller and the next.  It's not
> likely the Dell is gonna come with the fastest RAID controllers
> around, as they seem to still be selling Adaptec (buggy and
> unreliable, avoid like the plague) and LSI (stable, moderately fast).
>
> I.e. I'd rather have 24 SATA disks plugged into a couple of big Areca
> or 3ware (now escalade I think?) controllers than 8 SAS drives plugged
> into any Adaptec controller.

Re: Dell Hardware Recommendations

From: "Scott Marlowe"
Oops, that went to the wrong list...  now the right one.

On 8/9/07, Decibel! <decibel@decibel.org> wrote:
> You forgot the list. :)
>
> On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
> > On 8/9/07, Decibel! <decibel@decibel.org> wrote:
> >
> > > Also, a good RAID controller can spread reads out across both drives in
> > > each mirror on a RAID10. Though, there is an argument for not doing
> > > that... it makes it much less likely that both drives in a mirror will
> > > fail close enough to each other that you'd lose that chunk of data.
> >
> > I'd think that kind of failure mode is pretty uncommon, unless you're
> > in an environment where physical shocks are common.  which is not a
> > typical database environment.  (tell that to the guys writing a db for
> > a modern tank fire control system though :) )
> >
> > > Speaking of failures, keep in mind that a normal RAID5 puts you only 2
> > > drive failures away from data loss,
> >
> > Not only that, but the first drive failure puts you way down the list
> > in terms of performance, where a single failed drive in a large
> > RAID-10 only marginally affects performance.
> >
> > > while with RAID10 you can
> > > potentially lose half the array without losing any data.
> >
> > Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
> >
> > > If you do RAID5
> > > with multiple parity copies that does change things; I'm not sure which
> > > is better at that point (I suspect it matters how many drives are
> > > involved).
> >
> > That's RAID6.  The primary advantages of RAID6 over RAID10 or RAID5
> > are twofold:
> >
> > 1:  A single drive failure has no negative effect on performance, so
> > the array is still pretty fast, especially for reads, which just suck
> > under RAID 5 with a missing drive.
> > 2:  No two drive failures can cause loss of data.  Admittedly, by the
> > time the second drive fails, you're now running on the equivalent of a
> > degraded RAID5, unless you've configured >2 drives for parity.
> >
> > On very large arrays (100s of drives), RAID6 with 2, 3, or 4 drives
> > for parity makes some sense, since having that many extra drives means
> > the RAID controller (SW or HW) can now have elections to decide which
> > drive might be lying if you get data corruption.
> >
> > Note that you can also look into RAID10 with 3 or more drives per
> > mirror.  I.e. build 3 RAID-1 sets of 3 drives each, then you can lose
> > any two drives and still stay up.  Plus, on a mostly read database,
> > where users might be reading the same drives but in different places,
> > multi-disk RAID-1 makes sense under RAID-10.
> >
> > While I agree with Merlin that for OLTP a faster drive is a must, for
> > OLAP, more drives is often the real key.  The high aggregate bandwidth
> > of a large array of SATA drives is an amazing thing to watch when
> > running a reporting server with otherwise unimpressive specs.
> >
>
> --
> Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
> Give your computer some brain candy! www.distributed.net Team #1828
>
>

Re: Dell Hardware Recommendations

From: Decibel!
On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:
> > On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
> > > On 8/9/07, Decibel! <decibel@decibel.org> wrote:
> > >
> > > > Also, a good RAID controller can spread reads out across both drives in
> > > > each mirror on a RAID10. Though, there is an argument for not doing
> > > > that... it makes it much less likely that both drives in a mirror will
> > > > fail close enough to each other that you'd lose that chunk of data.
> > >
> > > I'd think that kind of failure mode is pretty uncommon, unless you're
> > > in an environment where physical shocks are common.  which is not a
> > > typical database environment.  (tell that to the guys writing a db for
> > > a modern tank fire control system though :) )

You'd be surprised. I've seen more than one case of a bunch of drives
failing within a month, because they were all bought at the same time.

> > > > while with RAID10 you can
> > > > potentially lose half the array without losing any data.
> > >
> > > Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.

Sure, but the odds of that with RAID5 are 100%, while they're much less
in a RAID10.
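
A toy model of that point (assuming independent failures, single-parity RAID 5, and two-way mirrors; real-world failures are often correlated, as noted elsewhere in this thread):

```python
# Toy model: given one drive has already failed, what fraction of
# possible second failures lose data?  Assumes independent failures,
# single-parity RAID 5, and two-way mirrors for RAID 10.

def raid5_second_failure_fatal(n: int) -> float:
    """Any second failure in a degraded single-parity RAID 5 loses data."""
    return 1.0

def raid10_second_failure_fatal(n: int) -> float:
    """Only the dead drive's mirror partner is fatal: 1 of n - 1 drives."""
    return 1 / (n - 1)

print(raid10_second_failure_fatal(14))   # ~0.077, i.e. about 1 in 13
```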

> > > While I agree with Merlin that for OLTP a faster drive is a must, for
> > > OLAP, more drives is often the real key.  The high aggregate bandwidth
> > > of a large array of SATA drives is an amazing thing to watch when
> > > running a reporting server with otherwise unimpressive specs.

True. In this case, the OP will probably want to have one array for the
OLTP stuff and one for the OLAP stuff.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


Re: Dell Hardware Recommendations

From: Greg Smith
On Thu, 9 Aug 2007, Joe Uhl wrote:

> The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> 250GB of usable space for future growth before needing to get a bigger
> set of disks.  500GB drives would leave a lot more room and could allow
> us to run the MD1000 in split mode and use its remaining disks for other
> purposes in the meantime.  I would greatly appreciate any feedback with
> respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> difference makes SATA awfully appealing.

The SATA II drives in the MD1000 all run at 7200 RPM, and are around
$0.80/GB (just grabbed a random quote from the configurator on their site
for all these) for each of the 250GB, 500GB, and 750GB capacities.  If you
couldn't afford to fill the whole array with 500GB models, then it might
make sense to get the 250GB ones instead just to spread the load out over
more spindles; if you're filling it regardless, surely the reduced
pressure on capacity with the 500GB models makes more sense.  Also,
using the 500 GB models would make it much easier to only ever use 12
active drives and have 3 hot spares, with less pressure to convert spares
into active storage; drives die in surprisingly correlated batches far too
often to only have 1 spare IMHO.

The two SAS options that you could use are both 300GB, and you can have
10K RPM for $2.3/GB or 15K RPM for $3.0/GB.  So relative to the SATA
options, you're paying about 3X as much to get a 40% faster spin rate, or
around 4X as much to get an over 100% faster spin.  There are certainly
other things that factor into performance than just that, but just staring
at the RPM gives you a gross idea of how much higher a raw transaction rate
the drives can support.
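
Those ratios, recomputed from the per-GB prices quoted above (illustrative arithmetic only):

```python
# The price/performance ratios above, recomputed from the quoted per-GB
# prices (SATA 7200 RPM ~$0.80/GB; 300GB SAS at $2.3/GB for 10K RPM or
# $3.0/GB for 15K RPM).

sata_7200_per_gb = 0.80
sas_10k_per_gb = 2.3
sas_15k_per_gb = 3.0

print(sas_10k_per_gb / sata_7200_per_gb)   # ~2.9x the cost per GB
print(sas_15k_per_gb / sata_7200_per_gb)   # ~3.8x the cost per GB
print(10_000 / 7_200)                      # ~1.39x: ~40% faster spin
print(15_000 / 7_200)                      # ~2.08x: just over 100% faster
```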

The question you have to ask yourself is how much actual I/O you're
dealing with.  The tiny 256MB cache on the PERC 5/E isn't going to help
much with buffering writes in particular, so the raw disk performance may
be critical for your update intensive workload.  If the combination of
transaction rate and total bandwidth are low enough that the 7200 RPM
drives can keep up with your load, by all means save yourself a lot of
cash and get the SATA drives.

In your situation, I'd be spending a lot of my time measuring the
transaction and I/O bandwidth rates on the active system very carefully to
figure out which way to go here.  You're in a better position than most
people buying new hardware to estimate what you need, with the existing
system in place; take advantage of that by drilling into the exact numbers
for what you're pushing through your disks now.  Every dollar spent on
work to quantify that early will easily pay for itself in helping guide
your purchase and future plans; that's what I'd be bringing people in to
do right now if I were you, if that's not something you're already
familiar with measuring.
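
As one concrete way to get those numbers, the transaction rate can be estimated by sampling pg_stat_database.xact_commit twice; the sample values in this sketch are hypothetical:

```python
# Estimating the current transaction rate from two snapshots of
# pg_stat_database.xact_commit (e.g. via
# "SELECT sum(xact_commit) FROM pg_stat_database;"), taken some
# interval apart.  The sample numbers below are made up.

def tps(commits_t0: int, commits_t1: int, seconds: float) -> float:
    """Committed transactions per second over the sampling interval."""
    return (commits_t1 - commits_t0) / seconds

# Two hypothetical readings taken 60 seconds apart:
print(tps(1_000_000, 1_018_000, 60.0))   # 300.0 transactions/second
```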

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Dell Hardware Recommendations

From: david@lang.hm
On Thu, 9 Aug 2007, Decibel! wrote:

> On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:
>>> On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
>>>> On 8/9/07, Decibel! <decibel@decibel.org> wrote:
>>>>
>>>>> Also, a good RAID controller can spread reads out across both drives in
>>>>> each mirror on a RAID10. Though, there is an argument for not doing
>>>>> that... it makes it much less likely that both drives in a mirror will
>>>>> fail close enough to each other that you'd lose that chunk of data.
>>>>
>>>> I'd think that kind of failure mode is pretty uncommon, unless you're
>>>> in an environment where physical shocks are common.  which is not a
>>>> typical database environment.  (tell that to the guys writing a db for
>>>> a modern tank fire control system though :) )
>
> You'd be surprised. I've seen more than one case of a bunch of drives
> failing within a month, because they were all bought at the same time.
>
>>>>> while with RAID10 you can
>>>>> potentially lose half the array without losing any data.
>>>>
>>>> Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
>
> Sure, but the odds of that with RAID5 are 100%, while they're much less
> in a RAID10.

So you go with RAID6, not RAID5.

>>>> While I agree with Merlin that for OLTP a faster drive is a must, for
>>>> OLAP, more drives is often the real key.  The high aggregate bandwidth
>>>> of a large array of SATA drives is an amazing thing to watch when
>>>> running a reporting server with otherwise unimpressive specs.
>
> True. In this case, the OP will probably want to have one array for the
> OLTP stuff and one for the OLAP stuff.

One thing that's interesting is that the I/O throughput on the large SATA
drives can actually be higher than the faster but smaller SCSI drives.
The SCSI drives can win on seeking, but how much seeking you need to do
depends on how large the OLTP database ends up being.
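
A back-of-envelope illustration of that trade-off (the throughput and IOPS figures here are rough, era-typical assumptions, not measurements of any particular drive):

```python
# Sequential scans are bandwidth-bound; OLTP is seek-bound.  All the
# drive figures below are rough assumptions, not benchmarks.

def seconds_sequential(nbytes: float, mb_per_s: float) -> float:
    """Time to stream nbytes at a given sequential rate."""
    return nbytes / (mb_per_s * 1024**2)

def seconds_random(ops: int, iops: float) -> float:
    """Time to service a number of random reads at a given IOPS rate."""
    return ops / iops

GB = 1024**3

# Scanning 10 GB of OLAP data: a big 7200 RPM SATA drive (~75 MB/s
# assumed) can edge out a smaller 15k SCSI drive (~70 MB/s assumed).
print(seconds_sequential(10 * GB, 75))   # ~137 s
print(seconds_sequential(10 * GB, 70))   # ~146 s

# 10,000 random reads: the 15k drive's seek advantage dominates.
print(seconds_random(10_000, 90))        # ~111 s at ~90 IOPS (SATA)
print(seconds_random(10_000, 180))       # ~56 s at ~180 IOPS (15k SCSI)
```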

David Lang

Re: Dell Hardware Recommendations

From: "Merlin Moncure"
On 8/10/07, Arjen van der Meijden <acmmailing@tweakers.net> wrote:
> On 9-8-2007 23:50 Merlin Moncure wrote:
> > Where the extra controller especially pays off is if you have to
> > expand to a second tray.  It's easy to add trays but installing
> > controllers on a production server is scary.
>
> For connectivity-sake that's not a necessity. You can either connect
> (two?) extra MD1000's to your first MD1000 or you can use the second
> external SAS-port on your controller. Obviously it depends on the
> controller whether it's good enough to just add the disks to it, rather
> than adding another controller for the second tray. Whether the perc5/e
> is good enough for that, I don't know, we've only equipped ours with a
> single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
> scaled pretty well going from a few to all 14 disks (+1 hotspare).

Completely correct...I was suggesting this in performance
terms.  I've never done it with the PERC 5/E, but have done it with some
active/active SANs and it works really well.

merlin

Re: Dell Hardware Recommendations

From: "Merlin Moncure"
On 8/10/07, Decibel! <decibel@decibel.org> wrote:
> On Thu, Aug 09, 2007 at 05:50:10PM -0400, Merlin Moncure wrote:
> > Raid 10 is usually better for databases but in my experience it's a
> > roll of the dice.  If you factor cost into the matrix a SAS raid 05
> > might outperform a SATA raid 10 because you are getting better storage
> > utilization out of the drives (n - 2 vs. n / 2).  Then again, you
> > might not.
>
> It's going to depend heavily on the controller and the workload.
> Theoretically, if most of your writes are to stripes that the controller
> already has cached then you could actually out-perform RAID10. But
> that's a really, really big IF, because if the strip isn't in cache you
> have to read the entire thing in before you can do the write... and that
> costs *a lot*.
>
> Also, a good RAID controller can spread reads out across both drives in
> each mirror on a RAID10. Though, there is an argument for not doing
> that... it makes it much less likely that both drives in a mirror will
> fail close enough to each other that you'd lose that chunk of data.
>
> Speaking of failures, keep in mind that a normal RAID5 puts you only 2
> drive failures away from data loss, while with RAID10 you can
> potentially lose half the array without losing any data. If you do RAID5
> with multiple parity copies that does change things; I'm not sure which
> is better at that point (I suspect it matters how many drives are
> involved).

When making hardware recommendations I always suggest buying two
servers and rigging PITR with a warm standby.  This allows you to adjust
the system a little bit for performance over fault tolerance.

Regarding RAID controllers, I've found performance to be quite
variable as stated, especially with regard to RAID 5.  I've also
unfortunately found bonnie++ to not be very reflective of actual
performance in high-stress environments.  We have an IBM DS4200 that
bangs out some pretty impressive numbers with our app using SATA, while
the bonnie++ numbers fairly suck.

merlin

Re: Dell Hardware Recommendations

From: "Merlin Moncure"
On 8/9/07, Arjen van der Meijden <acmmailing@tweakers.net> wrote:
> On 9-8-2007 23:50 Merlin Moncure wrote:
> > Where the extra controller especially pays off is if you have to
> > expand to a second tray.  It's easy to add trays but installing
> > controllers on a production server is scary.
>
> For connectivity-sake that's not a necessity. You can either connect
> (two?) extra MD1000's to your first MD1000 or you can use the second
> external SAS-port on your controller. Obviously it depends on the
> controller whether it's good enough to just add the disks to it, rather
> than adding another controller for the second tray. Whether the perc5/e
> is good enough for that, I don't know, we've only equipped ours with a
> single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
> scaled pretty well going from a few to all 14 disks (+1 hotspare).

As it happens I will have an opportunity to test the dual controller
theory.   In about a week we are picking up another md1000 and will
attach it in an active/active configuration with various
hardware/software RAID configurations, and run a battery of database
centric tests.  Results will follow.

By the way, the recent Dell servers I have seen are well built in my
opinion...better and cheaper than comparable IBM servers.  I've also
tested the IBM EXP3000, and the MD1000 is cheaper and comes standard
with a second ESM.  In my opinion, the Dell 1U 1950 is extremely well
organized in terms of layout and cooling...dual power supplies, dual
PCI-E (one low profile), plus a third custom slot for the optional
PERC 5/i which drives the backplane.

merlin

Re: Dell Hardware Recommendations

From: Vivek Khera
On Aug 9, 2007, at 3:47 PM, Joe Uhl wrote:

> PowerEdge 1950 paired with a PowerVault MD1000
> 2 x Quad Core Xeon E5310
> 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> PERC 5/E Raid Adapter
> 2 x 146 GB SAS in Raid 1 for OS + logs.
> A bunch of disks in the MD1000 configured in Raid 10 for Postgres
> data.

I'd avoid Dell disk systems if at all possible.  I know, I've been
through the pain. You really want someone else providing your RAID
card and disk array, especially if the 5/E card is based on the
Adaptec devices.


Re: Dell Hardware Recommendations

From
"Merlin Moncure"
Date:
On 8/10/07, Vivek Khera <vivek@khera.org> wrote:
>
> On Aug 9, 2007, at 3:47 PM, Joe Uhl wrote:
>
> > PowerEdge 1950 paired with a PowerVault MD1000
> > 2 x Quad Core Xeon E5310
> > 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> > PERC 5/E Raid Adapter
> > 2 x 146 GB SAS in Raid 1 for OS + logs.
> > A bunch of disks in the MD1000 configured in Raid 10 for Postgres
> > data.
>
> I'd avoid Dell disk systems if at all possible.  I know, I've been
> through the pain. You really want someone else providing your RAID
> card and disk array, especially if the 5/E card is based on the
> Adaptec devices.

I'm not so sure I agree.  They are using LSI firmware now (and so is
everyone else).  The servers are well built (highly subjective, I
admit) and configurable.  I have had some bad experiences with IBM
gear (with an Adaptec controller, though), and with white-box parts,
3ware, etc.  I can tell you that Dell got us the storage and the
server in record time.

I do agree on Adaptec, however.

merlin

Re: Dell Hardware Recommendations

From
"Joel Fradkin"
Date:
I know we bought the 4-proc Opteron unit with the SAS JBOD from Dell and it
has been excellent in terms of performance.

It was about 3 times faster than our old 4-proc Dell, which had Xeon
processors.

The newer one has had a few issues.  (I am running Red Hat AS4, since Dell
supports it.)  I have had one kernel failure, but the machine has been up
for about a year.  Other than that, no issues; a reboot fixed whatever
caused the failure, I have not seen it happen again, and it has been a few
months.

I am definitely going Dell for any other server needs.  Their pricing is so
competitive now, and the machines I bought, both the 1U 2-proc and the
larger 4-proc, have been very good.

Joel Fradkin



Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305



jfradkin@wazagua.com
www.wazagua.com



Re: Dell Hardware Recommendations

From
Vivek Khera
Date:
On Aug 10, 2007, at 4:36 PM, Merlin Moncure wrote:

> I'm not so sure I agree.  They are using LSI firmware now (and so is
> everyone else).  The servers are well built (highly subjective, I
> admit) and configurable.  I have had some bad experiences with IBM
> gear (adaptec controller though), and white box parts 3ware, etc.  I
> can tell you that dell got us the storage and the server in record
> time
>
> do agree on adaptec however

Ok, perhaps you got luckier... I have two PowerVault 220 rack mounts
with U320 SCSI drives in them.  With an LSI 320-2X controller, it
*refuses* to recognize some of the drives (channel 1 on either
array).  Dell blames LSI; LSI blames Dell's backplane.  This is
consistent across multiple controllers we tried and two different
Dell disk arrays.  Dropping the SCSI speed to 160 is the only way to
make them work.  I tend to believe LSI here.

The Adaptec 2230SLP controller recognizes the arrays fine, but tends
to "drop" devices at inopportune moments.  Re-seating dropped devices
starts a rebuild, but the speed is recognized as "1" and the rebuild
takes two lifetimes to complete unless you insert a reboot of the
system in there.  Totally unacceptable.  Again, dropping the SCSI
rate to 160 seems to make it more stable.


Re: Dell Hardware Recommendations

From
Dave Cramer
Date:
On 13-Aug-07, at 9:50 AM, Vivek Khera wrote:

>
> On Aug 10, 2007, at 4:36 PM, Merlin Moncure wrote:
>
>> I'm not so sure I agree.  They are using LSI firmware now (and so is
>> everyone else).  The servers are well built (highly subjective, I
>> admit) and configurable.  I have had some bad experiences with IBM
>> gear (adaptec controller though), and white box parts 3ware, etc.  I
>> can tell you that dell got us the storage and the server in record
>> time
>>
>> do agree on adaptec however
>
> Ok, perhaps you got luckier... I have two PowerVault 220 rack
> mounts with U320 SCSI drives in them. With an LSI 320-2X
> controller, it *refuses* to recognize some of the drives (channel 1
> on either array).  Dell blames LSI, LSI blames dell's backplane.
> This is consistent across multiple controllers we tried, and two
> different Dell disk arrays.  Dropping the SCSI speed to 160 is the
> only way to make them work.  I tend to believe LSI here.
>
This is the crux of the argument here: Perc/5 is a Dell trademark.
They can ship any hardware they want and call it a Perc/5.

Dave
> The Adaptec 2230SLP controller recognizes the arrays fine, but
> tends to "drop" devices at inopportune moments.  Re-seating dropped
> devices starts a rebuild, but the speed is recognized as "1" and
> the rebuild takes two lifetimes to complete unless you insert a
> reboot of the system in there.  Totally unacceptable.  Again,
> dropping the scsi rate to 160 seems to make it more stable.
>
>

> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>               http://archives.postgresql.org