Thread: SSD Drives

SSD Drives

From
Bret Stern
Date:
Any opinions/comments on using SSD drives with postgresql?



Re: SSD Drives

From
Shaun Thomas
Date:
On 04/02/2014 02:37 PM, Bret Stern wrote:

> Any opinions/comments on using SSD drives with postgresql?

Using SSDs with PostgreSQL is fine, provided they have an onboard
capacitor to ensure data integrity. The main concern with SSD drives is
that they essentially lie about their sync status: there is an inherent
race condition between the time data reaches the drive and the time the
write balancing and NVRAM commit overhead actually completes.

Most common drives only have a volatile RAM chip that acts as a buffer
space while writes are synced to the physical drive. Without a capacitor
backing, the state of this buffer is erased on power loss, resulting in
a corrupt database.

There are upcoming technologies which may solve this (see ReRAM) but for
now, it's a requirement for any sane system.
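
If you want to sanity-check a given drive before trusting it, PostgreSQL
ships with pg_test_fsync, which makes sync-lying fairly easy to spot (a
rough sketch; the mount point is just an example):

  # A drive reporting implausibly high fsync ops/sec -- many thousands
  # on rotating metal, or on an SSD without a capacitor-backed cache --
  # is almost certainly not really flushing to stable storage.
  pg_test_fsync -f /mnt/ssd/fsync_test.out -s 5

For a harder test, diskchecker.pl (write from a second host, pull the
plug mid-run, then verify) will catch drives that acknowledge writes
they later lose.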

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: SSD Drives

From
Brent Wood
Date:

have you seen this?

http://it-blog.5amsolutions.com/2010/08/performance-of-postgresql-ssd-vs.html


Brent Wood

Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz
NIWA
________________________________________
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of Bret Stern [bret_stern@machinemanagement.com]
Sent: Thursday, April 3, 2014 8:37 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] SSD Drives

Any opinions/comments on using SSD drives with postgresql?




Re: SSD Drives

From
Shaun Thomas
Date:
On 04/02/2014 02:50 PM, Brent Wood wrote:

> http://it-blog.5amsolutions.com/2010/08/performance-of-postgresql-ssd-vs.html

While interesting, these results are extremely out of date compared to
current drives. Current chips and firmware regularly put out 2-10 times
better performance than even the best graphs on this page, depending on
what you buy.

We moved all of our performance-critical servers to NVRAM-based storage
years ago. For us, it was well worth the added expense.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: SSD Drives

From
Bret Stern
Date:
Care to share the SSD hardware you're using?

I've used none to date, and have some critical data I would like
to put on a development server to test with.

Regards,

Bret Stern

On Wed, 2014-04-02 at 15:31 -0500, Shaun Thomas wrote:
> On 04/02/2014 02:50 PM, Brent Wood wrote:
>
> > http://it-blog.5amsolutions.com/2010/08/performance-of-postgresql-ssd-vs.html
>
> While interesting, these results are extremely out of date compared to
> current drives. Current chips and firmware regularly put out 2-10 times
> better performance than even the best graphs on this page, depending on
> what you buy.
>
> We moved all of our performance-critical servers to NVRAM-based storage
> years ago. For us, it was well worth the added expense.
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
> 312-676-8870
> sthomas@optionshouse.com




Re: SSD Drives

From
Shaun Thomas
Date:
On 04/02/2014 04:55 PM, Bret Stern wrote:

> Care to share the SSD hardware you're using?

We use these:

http://www.fusionio.com/products/iodrive2/

The older versions of these cards can read faster than a RAID-10 of
80x15k RPM SAS drives, based on our tests from a couple years ago. Writes
aren't *quite* as fast, but still much better than even a large RAID array.

They ain't cheap, though. You can expect to pay around $15k USD per TB,
I believe. There are other similar products from other vendors which may
have different cost/performance ratios, but I can only vouch for stuff
I've personally tested.

Our adventure with these cards was a presentation at Postgres Open in
2011. Slides are here:

https://wiki.postgresql.org/images/c/c5/Nvram_fun_profit.pdf

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com



Re: SSD Drives

From
Scott Marlowe
Date:
On Wed, Apr 2, 2014 at 4:09 PM, Shaun Thomas <sthomas@optionshouse.com> wrote:
> On 04/02/2014 04:55 PM, Bret Stern wrote:
>
>> Care to share the SSD hardware you're using?
>
>
> We use these:
>
> http://www.fusionio.com/products/iodrive2/
>
> The older versions of these cards can read faster than a RAID-10 of 80x15k
> RPM SAS drives, based on our tests from a couple years ago. Writes aren't
> *quite* as fast, but still much better than even a large RAID array.
>
> They ain't cheap, though. You can expect to pay around $15k USD per TB, I
> believe. There are other similar products from other vendors which may have
> different cost/performance ratios, but I can only vouch for stuff I've
> personally tested.
>
> Our adventure with these cards was a presentation at Postgres Open in 2011.
> Slides are here:
>
> https://wiki.postgresql.org/images/c/c5/Nvram_fun_profit.pdf
>

Where I work we use the MLC based FusionIO cards and they are quite
fast. It's actually hard to push them to their max with only 24 or 32
cores in a fast machine. My favorite thing about them is their
fantastic support.


Re: SSD Drives

From
Guy Rouillier
Date:
We used 4x OCZ Deneva 2 in a RAID configuration.  Worked well for us for
over 2 years with no hardware issues.  We switched to SSD because we had
a very write-intensive application (30 million rows/day) that spinning
disks just couldn't keep up with.

On 4/2/2014 6:09 PM, Shaun Thomas wrote:
> On 04/02/2014 04:55 PM, Bret Stern wrote:
>
>> Care to share the SSD hardware you're using?
>
> We use these:
>
> http://www.fusionio.com/products/iodrive2/
>
> The older versions of these cards can read faster than a RAID-10 of
> 80x15k RPM SAS drives, based on our tests from a couple years ago. Writes
> aren't *quite* as fast, but still much better than even a large RAID array.
>
> They ain't cheap, though. You can expect to pay around $15k USD per TB,
> I believe. There are other similar products from other vendors which may
> have different cost/performance ratios, but I can only vouch for stuff
> I've personally tested.
>
> Our adventure with these cards was a presentation at Postgres Open in
> 2011. Slides are here:
>
> https://wiki.postgresql.org/images/c/c5/Nvram_fun_profit.pdf
>


--
Guy Rouillier




Re: SSD Drives

From
David Boreham
Date:
While I have two friends who work at FusionIO, and have great confidence
in their products, we prefer to deploy more conventional SATA SSDs in
our servers at present. We have been running various versions of Intel's
enterprise and data center SSDs in production for several years now and
couldn't be happier with their performance. The oldest in service at
present are 710 series drives that have been subjected to a ~500 wtps PG
load 24x7 for the past 28 months. They still show zero wearout indication
in the SMART stats.

As others have mentioned, power-fail protection (supercap) is the thing
to look for, and also some sort of concrete specification for drive
write endurance unless you have made a deliberate decision to trade off
endurance vs. cost in the context of your deployment.
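
The wearout indication comes straight from SMART; with smartmontools
installed, something like this shows it (the device name is an example,
and attribute numbering varies by vendor -- 233/Media_Wearout_Indicator
and 241/Total_LBAs_Written are what Intel uses):

  # dump the vendor-specific attribute table
  smartctl -A /dev/sda

  # or narrow it down to the interesting counters
  smartctl -A /dev/sda | grep -i -e wearout -e lbas_written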






Re: SSD Drives

From
Joe Van Dyk
Date:
On Wed, Apr 2, 2014 at 12:37 PM, Bret Stern <bret_stern@machinemanagement.com> wrote:
> Any opinions/comments on using SSD drives with postgresql?

Related, anyone have any thoughts on using postgresql on Amazon's EC2 SSDs?  Been looking at http://aws.amazon.com/about-aws/whats-new/2013/12/19/announcing-the-next-generation-of-amazon-ec2-high-i/o-instance


Re: SSD Drives

From
John R Pierce
Date:
On 4/3/2014 9:26 AM, Joe Van Dyk wrote:
> Related, anyone have any thoughts on using postgresql on Amazon's EC2
> SSDs?  Been looking at
> http://aws.amazon.com/about-aws/whats-new/2013/12/19/announcing-the-next-generation-of-amazon-ec2-high-i/o-instance
>

if your data isn't very important, by all means, keep it on someone
else's virtualized infrastructure with no performance or reliability
guarantees.


--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: SSD Drives

From
Ben Chobot
Date:

On Apr 3, 2014, at 12:47 PM, John R Pierce <pierce@hogranch.com> wrote:

> On 4/3/2014 9:26 AM, Joe Van Dyk wrote:
>> Related, anyone have any thoughts on using postgresql on Amazon's EC2 SSDs?  Been looking at http://aws.amazon.com/about-aws/whats-new/2013/12/19/announcing-the-next-generation-of-amazon-ec2-high-i/o-instance
>
> if your data isn't very important, by all means, keep it on someone else's virtualized infrastructure with no performance or reliability guarantees.

Well that’s not quite fair. AWS guarantees performance for those instances (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/i2-instances.html#i2-instances-diskperf). They also guarantee their instances will fail sooner or later, with or without warning (at which point you will lose all your data unless you’ve been putting copies onto a different system).

Re: SSD Drives

From
Merlin Moncure
Date:
On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern
<bret_stern@machinemanagement.com> wrote:
> Any opinions/comments on using SSD drives with postgresql?

Here's a single S3700 smoking an array of 16 15k drives (poster didn't
realize that; was too focused on synthetic numbers):
http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd

merlin


Re: SSD Drives

From
David Rees
Date:
On Thu, Apr 3, 2014 at 12:13 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern
> <bret_stern@machinemanagement.com> wrote:
>> Any opinions/comments on using SSD drives with postgresql?
>
> Here's a single S3700 smoking an array of 16 15k drives (poster didn't
> realize that; was too focused on synthetic numbers):
> http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd

I just ran a quick test earlier this week on an old Dell 2970 (2
Opteron 2387, 16GB RAM) comparing a 6-disk RAID10 with 10k 147GB SAS
disks to a 2-disk RAID1 with 480GB Intel S3500 SSDs and found the SSDs
are about 4-6x faster using pgbench and a scaling factor of 1100. Some
sort of MegaRAID controller, according to lspci, with a BBU. TPS
numbers below are approximate.

RAID10 disk array:
8 clients: 350 tps
16 clients: 530 tps
32 clients: 800 tps

RAID1 SSD array:
8 clients: 2100 tps
16 clients: 2500 tps
32 clients: 3100 tps
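
For the curious, the runs were essentially stock pgbench along these
lines (flags approximate and the database name is an example; scale 1100
works out to roughly 16-17GB of data, about the size of RAM on this box):

  # one-time initialization at scaling factor 1100
  pgbench -i -s 1100 bench

  # mixed read/write run at each client count
  pgbench -c 8 -T 300 bench
  pgbench -c 16 -T 300 bench
  pgbench -c 32 -T 300 bench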

So yeah, even the slower, cheaper S3500 SSDs are way fast. If your
write workload isn't too high, the S3500 can work well. We'll see how
the SMART drive lifetime numbers do once we get into production, but
right now we estimate they should last at least 5 years; from what
we've seen, SSDs seem to wear much better than expected.
If not, we'll pony up and go for the S3700 or perhaps move the xlog
back on to spinning disks.

-Dave


Re: SSD Drives

From
David Rees
Date:
On Thu, Apr 3, 2014 at 12:44 PM, Brent Wood <Brent.Wood@niwa.co.nz> wrote:
> Does the RAID 1 array give any performance benefits over a single drive? I'd guess
> that writes may be slower, reads may be faster (if balanced) but data security is improved.

Unfortunately I didn't test a single drive as that's not a
configuration we would run our systems in. I expect that it would
reduce read performance and thus pgbench results some, but I can't
tell you how much in this case.

-Dave


Re: SSD Drives

From
Scott Marlowe
Date:
On Thu, Apr 3, 2014 at 1:32 PM, David Rees <drees76@gmail.com> wrote:
> On Thu, Apr 3, 2014 at 12:13 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern
>> <bret_stern@machinemanagement.com> wrote:
>>> Any opinions/comments on using SSD drives with postgresql?
>>
>> Here's a single S3700 smoking an array of 16 15k drives (poster didn't
>> realize that; was too focused on synthetic numbers):
>> http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd
>
> I just ran a quick test earlier this week on an old Dell 2970 (2
> Opteron 2387, 16GB RAM) comparing a 6-disk RAID10 with 10k 147GB SAS
> disks to a 2-disk RAID1 with 480GB Intel S3500 SSDs and found the SSDs
> are about 4-6x faster using pgbench and a scaling factor of 1100. Some
> sort of MegaRAID controller, according to lspci, with a BBU. TPS
> numbers below are approximate.
>
> RAID10 disk array:
> 8 clients: 350 tps
> 16 clients: 530 tps
> 32 clients: 800 tps
>
> RAID1 SSD array:
> 8 clients: 2100 tps
> 16 clients: 2500 tps
> 32 clients: 3100 tps
>
> So yeah, even the slower, cheaper S3500 SSDs are way fast. If your
> write workload isn't too high, the S3500 can work well. We'll see how
> the SMART drive lifetime numbers do once we get into production, but
> right now we estimate they should last at least 5 years; from what
> we've seen, SSDs seem to wear much better than expected.
> If not, we'll pony up and go for the S3700 or perhaps move the xlog
> back on to spinning disks.

On a machine with 16 cores with HT (appears as 32 cores) and 8 of the
3700 series Intel SSDs in a RAID-10 under an LSI MegaRAID with BBU, I
was able to get 6300 to 7500 tps on a decent sized pgbench db
(-s1000).


Re: SSD Drives

From
Bret Stern
Date:
On Thu, 2014-04-03 at 12:32 -0700, David Rees wrote:
> On Thu, Apr 3, 2014 at 12:13 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> > On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern
> > <bret_stern@machinemanagement.com> wrote:
> >> Any opinions/comments on using SSD drives with postgresql?
> >
> > Here's a single S3700 smoking an array of 16 15k drives (poster didn't
> > realize that; was too focused on synthetic numbers):
> > http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd
>
> I just ran a quick test earlier this week on an old Dell 2970 (2
> Opteron 2387, 16GB RAM) comparing a 6-disk RAID10 with 10k 147GB SAS
> disks to a 2-disk RAID1 with 480GB Intel S3500 SSDs and found the SSDs
> are about 4-6x faster using pgbench and a scaling factor of 1100. Some
> sort of MegaRAID controller, according to lspci, with a BBU. TPS
> numbers below are approximate.
>
> RAID10 disk array:
> 8 clients: 350 tps
> 16 clients: 530 tps
> 32 clients: 800 tps
>
> RAID1 SSD array:
> 8 clients: 2100 tps
> 16 clients: 2500 tps
> 32 clients: 3100 tps
>
> So yeah, even the slower, cheaper S3500 SSDs are way fast. If your
> write workload isn't too high, the S3500 can work well.

Is a write cycle anywhere on the drive different from a re-write?

Or is a write a write!

The feedback/comments are awesome. I'm shopping...


> We'll see how
> the SMART drive lifetime numbers do once we get into production, but
> right now we estimate they should last at least 5 years; from what
> we've seen, SSDs seem to wear much better than expected.
> If not, we'll pony up and go for the S3700 or perhaps move the xlog
> back on to spinning disks.
>
> -Dave




Re: SSD Drives

From
John R Pierce
Date:
On 4/3/2014 12:32 PM, David Rees wrote:
> So yeah, even the slower, cheaper S3500 SSDs are way fast. If your
> write workload isn't too high, the S3500 can work well. We'll see how
> the SMART drive lifetime numbers do once we get into production, but
> right now we estimate they should last at least 5 years; from what
> we've seen, SSDs seem to wear much better than expected.
> If not, we'll pony up and go for the S3700 or perhaps move the xlog
> back on to spinning disks.

an important thing in getting decent wear leveling life with SSDs is to
keep them under about 70% full.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: SSD Drives

From
David Boreham
Date:
On 4/3/2014 2:00 PM, John R Pierce wrote:
>
> an important thing in getting decent wear leveling life with SSDs is
> to keep them under about 70% full.
>

This depends on the drive: drives with higher specified write endurance
already have significant overprovisioning before the user sees the space.






Re: SSD Drives

From
Merlin Moncure
Date:
On Thu, Apr 3, 2014 at 2:53 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On a machine with 16 cores with HT (appears as 32 cores) and 8 of the
> 3700 series Intel SSDs in a RAID-10 under an LSI MegaRAID with BBU, I
> was able to get 6300 to 7500 tps on a decent sized pgbench db
> (-s1000).

Did you happen to grab any 'select only' numbers?

merlin


Re: SSD Drives

From
Brent Wood
Date:

Hi David,

Does the RAID 1 array give any performance benefits over a single drive? I'd guess that writes may be slower, reads may be faster (if balanced) but data security is improved.

Brent Wood

Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
+64-4-386-0529 | 301 Evans Bay Parade, Greta Point, Wellington | www.niwa.co.nz
NIWA
________________________________________
From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.org] on behalf of David Rees [drees76@gmail.com]
Sent: Friday, April 4, 2014 8:32 AM
To: Merlin Moncure
Cc: bret_stern@machinemanagement.com; PostgreSQL General
Subject: Re: [GENERAL] SSD Drives

On Thu, Apr 3, 2014 at 12:13 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Apr 2, 2014 at 2:37 PM, Bret Stern
> <bret_stern@machinemanagement.com> wrote:
>> Any opinions/comments on using SSD drives with postgresql?
>
> Here's a single S3700 smoking an array of 16 15k drives (poster didn't
> realize that; was too focused on synthetic numbers):
> http://dba.stackexchange.com/questions/45224/postgres-write-performance-on-intel-s3700-ssd

I just ran a quick test earlier this week on an old Dell 2970 (2
Opteron 2387, 16GB RAM) comparing a 6-disk RAID10 with 10k 147GB SAS
disks to a 2-disk RAID1 with 480GB Intel S3500 SSDs and found the SSDs
are about 4-6x faster using pgbench and a scaling factor of 1100. Some
sort of MegaRAID controller, according to lspci, with a BBU. TPS
numbers below are approximate.

RAID10 disk array:
8 clients: 350 tps
16 clients: 530 tps
32 clients: 800 tps

RAID1 SSD array:
8 clients: 2100 tps
16 clients: 2500 tps
32 clients: 3100 tps

So yeah, even the slower, cheaper S3500 SSDs are way fast. If your
write workload isn't too high, the S3500 can work well. We'll see how
the SMART drive lifetime numbers do once we get into production, but
right now we estimate they should last at least 5 years; from what
we've seen, SSDs seem to wear much better than expected.
If not, we'll pony up and go for the S3700 or perhaps move the xlog
back on to spinning disks.

-Dave



Re: SSD Drives

From
Scott Marlowe
Date:
On Thu, Apr 3, 2014 at 3:28 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Apr 3, 2014 at 2:53 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On a machine with 16 cores with HT (appears as 32 cores) and 8 of the
>> 3700 series Intel SSDs in a RAID-10 under an LSI MegaRAID with BBU, I
>> was able to get 6300 to 7500 tps on a decent sized pgbench db
>> (-s1000).
>
> Did you happen to grab any 'select only' numbers?

Darnit. Nope. I'll try to grab some on a spare box if I get one again.
Now they're all in production so running pgbench is kind of frowned
upon.
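
For reference, the select-only run is just pgbench's -S flag, so
something like this against the same scale-1000 database (database name
is an example):

  pgbench -S -c 32 -T 300 bench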


Re: SSD Drives

From
Scott Marlowe
Date:
On Thu, Apr 3, 2014 at 1:44 PM, Brent Wood <Brent.Wood@niwa.co.nz> wrote:
>
> Hi David,
>
> Does the RAID 1 array give any performance benefits over a single drive? I'd guess that writes may be slower, reads
> may be faster (if balanced) but data security is improved.

I did some testing on machines with 3x MLC FusionIO Drive2s at 1.2TB.
Comparing 1 drive and 2 drives in RAID-1, the difference in performance
was minimal. However, a 3-drive mirror was noticeably slower. This was
all with Ubuntu 12.04 using the latest 3.8 kernel and software RAID.
RAID-0 was by far the fastest, about 30% faster than either a single
drive or a pair of drives in RAID-1.


Re: SSD Drives

From
Steve Crawford
Date:
On 04/03/2014 12:44 PM, Brent Wood wrote:
> Hi David,
>
> Does the RAID 1 array give any performance benefits over a single drive? I'd guess that writes may be slower, reads may be faster (if balanced) but data security is improved.

I've been looking into upgrading to SSD and wondering about RAID and where to apply $$$ as well. In particular I'm curious about any real-world PostgreSQL-oriented performance and data-protection advice in the following areas:

1. With SSDs being orders of magnitude faster than spinning media, when does the RAID controller rather than the storage become the bottleneck?

2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one? Which one? I'm suspecting capacitor on the SSD and write-through on the RAID.

3. Current thoughts on hardware vs. software RAID - especially since many of the current SSD solutions plug straight into the bus.

4. Potential issues or conflicts with SSD-specific requirements like TRIM.

5. Manufacturers, models or technologies to seek out or avoid.

6. At what point do we consider the RAID controller an additional SPOF that decreases instead of increases reliability?

7. Thoughts on "best bang for the buck?" For example, am I better off dropping the RAID cards and additional drives and instead adding another standby server?

Cheers,
Steve

Re: SSD Drives

From
David Boreham
Date:
It would be useful to know more details -- how much storage space you
need, for example.

FWIW, I considered all of these issues when we first deployed SSDs and
decided to not use RAID controllers. There have not been any reasons to
re-think that decision since. However, it depends on your specific
needs, I think. We prefer to think in terms of a single machine as the
unit of service failure -- a machine is either working, or not working,
and we ensure state is replicated to several machines for durability.
Therefore a storage solution on each machine that is more reliable than
the machine itself is not useful.

In our deployments we can't max out even one SSD, so there isn't
anything a RAID controller can add in terms of performance, but your
case could be different.

You might also want to consider the power dissipated by the RAID
controller: I was quite surprised by how much heat they generate, but
this was a couple of years ago. Possibly there are lower-power
controllers available now.

You need the capacitor on the SSD -- a RAID controller with BBU will not
fix a non-power-fail-safe SSD.

On 4/4/2014 10:04 AM, Steve Crawford wrote:
>
> I've been looking into upgrading to SSD and wondering about RAID and
> where to apply $$$ as well. In particular I'm curious about any
> real-world PostgreSQL-oriented performance and data-protection advice
> in the following areas:
>
> 1. With SSDs being orders of magnitude faster than spinning media,
> when does the RAID controller rather than the storage become the
> bottleneck?
>
> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just
> on one? Which one? I'm suspecting capacitor on the SSD and
> write-through on the RAID.
>
> 3. Current thoughts on hardware vs. software RAID - especially since
> many of the current SSD solutions plug straight into the bus.
>
> 4. Potential issues or conflicts with SSD-specific requirements like TRIM.
>
> 5. Manufacturers, models or technologies to seek out or avoid.
>
> 6. At what point do we consider the RAID controller an additional SPOF
> that decreases instead of increases reliability?
>
> 7. Thoughts on "best bang for the buck?" For example, am I better off
> dropping the RAID cards and additional drives and instead adding
> another standby server?



Re: SSD Drives

From
Merlin Moncure
Date:
On Fri, Apr 4, 2014 at 11:04 AM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
> On 04/03/2014 12:44 PM, Brent Wood wrote:
>
> Hi David,

My take:

> Does the RAID 1 array give any performance benefits over a single drive? I'd
> guess that writes may be slower, reads may be faster (if balanced) but data
> security is improved.

Probably not so much for SSD drives. Read and write performance are
very unbalanced in SSDs, and RAID1 doesn't help with writes.

> I've been looking into upgrading to SSD and wondering about RAID and where
> to apply $$$ as well. In particular I'm curious about any real-world
> PostgreSQL-oriented performance and data-protection advice in the following
> areas:
>
> 1. With SSDs being orders of magnitude faster than spinning media, when does
> the RAID controller rather than the storage become the bottleneck?

SSDs (at least the good ones) are maybe an order of magnitude faster on
writes.  It can be less or more depending on the application's write
particulars.  SSDs are 2-3 orders of magnitude faster for reads.

> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
> Which one? I'm suspecting capacitor on the SSD and write-through on the
> RAID.

You need both. The capacitor protects the drive, the BBU protects the
raid controller.

> 3. Current thoughts on hardware vs. software RAID - especially since many of
> the current SSD solutions plug straight into the bus.

IMNSHO, software RAID is a better bet.  The advantages are compelling:
cost, TRIM support, etc., and the SSD drives do not benefit as much
from the write cache.  But hardware controllers offer very fast burst
write performance, which is nice.
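
For example, a basic md mirror is about as simple as it gets (a sketch;
device names and mount point are examples):

  # mirror two SSDs and put a filesystem on top
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
  mkfs.ext4 /dev/md0
  mount -o noatime /dev/md0 /var/lib/pgsql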

> 4. Potential issues or conflicts with SSD-specific requirements like TRIM.

TRIM is not essential but does help.  Pretty much no hardware RAID
controllers support TRIM.  I've been waiting for a controller
that manages TRIM and other SSD housekeeping (like consolidated wear
leveling) across an entire array, but so far nothing has really
materialized.  If it does happen, it will probably come from Intel.

> 5. Manufacturers, models or technologies to seek out or avoid.

Avoid consumer grade/enthusiast stuff, and anything that does not have
a capacitor.  Intel offerings tend to be the benchmark.

> 6. At what point do we consider the RAID controller an additional SPOF that
> decreases instead of increases reliability?
>
> 7. Thoughts on "best bang for the buck?" For example, am I better off
> dropping the RAID cards and additional drives and instead adding another
> standby server?

This is going to depend a lot on write patterns.  If you don't do much
writing, you can gear up accordingly.  For all-around performance, the
S3700 ($2.50/GB) IMO held the crown for most of 2013 and I think is
still the one to buy.  The S3500 ($1.25/GB) came out and also looks
like a pretty good deal, and there are some decent competitors (the 600
Pro for example).  If you're willing to spend more, there are a lot of
other options.  I don't think it's reasonable to spend less for a
write-heavy application.

merlin


Re: SSD Drives

From
John R Pierce
Date:
On 4/4/2014 10:15 AM, Merlin Moncure wrote:
>> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
>> Which one? I'm suspecting capacitor on the SSD and write-through on the
>> RAID.
>
> You need both. The capacitor protects the drive, the BBU protects the
> raid controller.

note BBUs on raid cards are being replaced by 'flash-back', which is a supercap and flash memory backup for the raid card's write-back cache.



-- 
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: SSD Drives

From
Scott Marlowe
Date:
On Fri, Apr 4, 2014 at 11:15 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, Apr 4, 2014 at 11:04 AM, Steve Crawford
> <scrawford@pinpointresearch.com> wrote:
>> On 04/03/2014 12:44 PM, Brent Wood wrote:

>> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
>> Which one? I'm suspecting capacitor on the SSD and write-through on the
>> RAID.
>
> You need both. The capacitor protects the drive, the BBU protects the
> raid controller.

You don't technically need the BBU / flashback memory IF the
controller is in write through. My experience has been that the BBU
helps a lot on write heavy applications or to get maximum performance
for your money. On most cards, it's < $100 so unless you can
definitively show no real performance loss without one, get one. OTOH
it's worth testing to be sure. But the BBU does a lot to reorder
writes and such and flattens out bursty write performance very well.
It also speeds up checkpointing if / when it has to occur.
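
If you want to test both ways, the cache policy is switchable at runtime
on most controllers. On an LSI it's along these lines (a sketch; exact
flag spellings vary between MegaCli versions, so check your own):

  # show the current logical-drive cache settings
  MegaCli64 -LDGetProp -Cache -LAll -aAll

  # flip all logical drives to write-through or write-back
  MegaCli64 -LDSetProp WT -LAll -aAll
  MegaCli64 -LDSetProp WB -LAll -aAll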


Re: SSD Drives

From
John R Pierce
Date:
On 4/4/2014 12:08 PM, Scott Marlowe wrote:
> You don't technically need the BBU / flashback memory IF the
> controller is in write through.

if you HAVE the BBU/flash why would you put the controller in write
through??  the whole POINT of bbu/flashback is that you can safely
enable writeback caching.

my testing with postgresql OLTP benchmarks on Linux, I've found
virtually identical performance using mdraid vs hardware raid in the
same caching mode.  its the writeback cache that gives raid cards like
the LSI Megaraid SAS2 series, or HP P420, or whatever, their big
advantage vs a straight JBOD configuration.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: SSD Drives

From
David Rees
Date:
On Fri, Apr 4, 2014 at 10:15 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> For all around performance, the
> S3700 (2.5$/gb) IMO held the crown for most of 2013 and I think is
> still the one to buy.  The s3500 (1.25$/gb) came out and also looks
> like a pretty good deal

The S3500 can be had for $1.00/GB these days. If you don't need
the write durability or the all-out write performance of the S3700,
it's a great deal.

I do have to wonder if hardware RAID with a BBU can help with write
amplification when on SSDs. Though since RHEL/CentOS 6.5 supports trim
in software raid, that could be a bigger win.

-Dave


Re: SSD Drives

From
Scott Marlowe
Date:
On Fri, Apr 4, 2014 at 1:18 PM, John R Pierce <pierce@hogranch.com> wrote:
> On 4/4/2014 12:08 PM, Scott Marlowe wrote:
>>
>> You don't technically need the BBU / flashback memory IF the
>> controller is in write through.
>
>
> if you HAVE the BBU/flash why would you put the controller in write
> through??  the whole POINT of bbu/flashback is that you can safely enable
> writeback caching.
>
> my testing with postgresql OLTP benchmarks on Linux, I've found virtually
> identical performance using mdraid vs hardware raid in the same caching
> mode.  its the writeback cache that gives raid cards like the LSI Megaraid
> SAS2 series, or HP P420, or whatever, their big advantage vs a straight JBOD
> configuration.

I'm not sure you read / got the whole conversation. The OP was asking
if he COULD use a RAID controller with no BBU in write through with
SSDs. It's a valid question. My main point was in answer to this
response:

On Fri, Apr 4, 2014 at 11:15 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, Apr 4, 2014 at 11:04 AM, Steve Crawford
>> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
>> Which one? I'm suspecting capacitor on the SSD and write-through on the
>> RAID.
>
> You need both. The capacitor protects the drive, the BBU protects the
> raid controller.

Context is king here. You do not have to have a BBU as long as you are
in write through as the OP mentioned. With no BBU, in write-through,
with supercaps, you should be safe. It's not a sensible configuration
for most applications. OTOH, most HW RAIDs have auto spare promotion
and easy swap out of dead drives with auto-rebuild. So if you're
building 1000 units for the government that just plug in and work, you
want the poor guy on the other end to just unplug bad drives and
replace them. The cost of a service call could be way more than a HW
RAID card.

So, there are plenty of reasons you might want to test or even run
without a BBU. That wasn't my point. My point was you're SAFE (or
should be) with a HW RAID no BBU and supercapped SSDs.


Re: SSD Drives

From
Merlin Moncure
Date:


On Friday, April 4, 2014, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Apr 4, 2014 at 1:18 PM, John R Pierce <pierce@hogranch.com> wrote:
>> On 4/4/2014 12:08 PM, Scott Marlowe wrote:
>>>
>>> You don't technically need the BBU / flashback memory IF the
>>> controller is in write through.
>>
>>
>> if you HAVE the BBU/flash why would you put the controller in write
>> through??  the whole POINT of bbu/flashback is that you can safely enable
>> writeback caching.
>>
>> my testing with postgresql OLTP benchmarks on Linux, I've found virtually
>> identical performance using mdraid vs hardware raid in the same caching
>> mode.  its the writeback cache that gives raid cards like the LSI Megaraid
>> SAS2 series, or HP P420, or whatever, their big advantage vs a straight JBOD
>> configuration.
>
> I'm not sure you read / got the whole conversation. The OP was asking
> if he COULD use a RAID controller with no BBU in write through with
> SSDs. It's a valid question. My main point was in answer to this
> response:
>
> On Fri, Apr 4, 2014 at 11:15 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Fri, Apr 4, 2014 at 11:04 AM, Steve Crawford
>>> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
>>> Which one? I'm suspecting capacitor on the SSD and write-through on the
>>> RAID.
>>
>> You need both. The capacitor protects the drive, the BBU protects the
>> raid controller.
>
> Context is king here. You do not have to have a BBU as long as you are
> in write through as the OP mentioned. With no BBU, in write-through,
> with supercaps, you should be safe. It's not a sensible configuration
> for most applications. OTOH, most HW RAIDs have auto spare promotion
> and easy swap out of dead drives with auto-rebuild. So if you're
> building 1000 units for the government that just plug in and work, you
> want the poor guy on the other end to just unplug bad drives and
> replace them. The cost of a service call could be way more than a HW
> RAID card.
>
> So, there are plenty of reasons you might want to test or even run
> without a BBU. That wasn't my point. My point was you're SAFE (or
> should be) with a HW RAID no BBU and supercapped SSDs.

Agreed on all points. At the end of the day, though, HW RAID is of debatable value with SSDs.

I like mdadm more than most utilities, too.

merlin

 

Re: SSD Drives

From
Steve Crawford
Date:
On 04/04/2014 10:15 AM, Merlin Moncure wrote:
>> 2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
>> Which one? I'm suspecting capacitor on the SSD and write-through on the
>> RAID.
> You need both. The capacitor protects the drive, the BBU protects the
> raid controller.
?? In write-through, the controller shouldn't return success until it
gets it from the drive, so no BBU should be required. One LSI slide deck
recommends write-back as the optimum policy for SSDs. But I could be
wrong, which is why I ask.
>> 3. Current thoughts on hardware vs. software RAID - especially since many of
>> the current SSD solutions plug straight into the bus.
> IMNSHO, software RAID is a better bet.  The advantages are compelling:
> cost, TRIM support, etc., and the SSD drives do not benefit as much
> from the write cache.  But hardware controllers offer very fast burst
> write performance, which is nice.
>
>> 7. Thoughts on "best bang for the buck?" For example, am I better off
>> dropping the RAID cards and additional drives and instead adding another
>> standby server?
> This is going to depend a lot on write patterns.  If you don't do much
> writing, you can gear up accordingly.  For all-around performance, the
> S3700 ($2.50/GB) IMO held the crown for most of 2013 and I think is
> still the one to buy.  The S3500 ($1.25/GB) came out and also looks
> like a pretty good deal, and there are some decent competitors (the 600
> Pro for example).  If you're willing to spend more, there are a lot of
> other options.  I don't think it's reasonable to spend less for a
> write-heavy application.

FWIW, the workload is somewhat over 50% writes and currently peaks at
~1,600 queries/second after excluding "set" statements. This is
currently spread across four 15k SATA drives in RAID 10.

Judicious archiving allows us to keep our total OS+data storage
requirements under 100GB. Usually. So we should be able to easily stay
in the $500/drive price range (200GB S3700) and still have plenty of
headroom for wear-leveling.

One option I'm considering is no RAID at all but spend the savings from
the controllers and extra drives toward an additional standby server.

Cheers,
Steve



Re: SSD Drives

From
David Boreham
Date:
On 4/4/2014 3:57 PM, Steve Crawford wrote:
> Judicious archiving allows us to keep our total OS+data storage
> requirements under 100GB. Usually. So we should be able to easily stay
> in the $500/drive price range (200GB S3700) and still have plenty of
> headroom for wear-leveling.
>
> One option I'm considering is no RAID at all but spend the savings
> from the controllers and extra drives toward an additional standby
> server.

This is very similar to our workload. We use a single 200G or 300G Intel
SSD per machine, directly attached to the motherboard SATA controller.
No RAID controller.

We run 7 servers at present in this configuration in a single cluster.
Roughly 120W per box peak (8-core, 64G RAM).




Re: SSD Drives

From
Lists
Date:
On 04/02/2014 02:55 PM, Bret Stern wrote:
> Care to share the SSD hardware you're using?
>
> I've used none to date, and have some critical data I would like
> to put on a development server to test with.
>
> Regards,
>
> Bret Stern

SSDs are ridiculously cheap when you consider the performance
difference. We saw at *least* a 10x improvement in performance going
with SATA SSDs vs. 10k SAS drives in a messy, read/write environment
(most of our tests were 20x or more). It's a no-brainer for us.

It might be tempting to use a consumer-grade SSD due to the significant
cost savings, but the money saved is vapor. They may be OK for a dev
environment, but you *will* pay in downtime in a production environment.
Unlike regular hard drives where the difference between consumer and
enterprise drives is performance and a few features, SSDs are different
animals.

SSDs wear something like a salt-shaker. There's a fairly definite number
of writes that they are good for, and when they are gone, the drive will
fail. Like a salt shaker, when the salt is gone, you won't get salt any
more no matter how you shake it.

So, spend the money and get the enterprise class SSDs. They have come
down considerably in price over the last year or so. Although on paper
the Intel Enterprise SSDs tend to trail the performance numbers of the
leading consumer drives, they have wear characteristics that mean you
can trust them as much as you can any other drive for years, and they
still leave spinning rust far, far behind.

Our production servers are 4x 1U rackmounts with 32 cores, 128 GB of ECC
RAM, and SW RAID1 400 GB SSDs in each. We back up all our databases
hourly, with peak volume around 200-300 QPS/server, a write ratio of
perhaps 40%, and an iostat disk utilization of about 10-20% over
5-second intervals.
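
The utilization number is the %util column from sysstat's extended
iostat output, sampled over 5-second intervals:

  iostat -x 5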

-Ben


Re: SSD Drives

From
James Harper
Date:
>
> It might be tempting to use a consumer-grade SSD due to the significant
> cost savings, but the money saved is vapor. They may be OK for a dev
> environment, but you *will* pay in downtime in a production environment.
> Unlike regular hard drives where the difference between consumer and
> enterprise drives is performance and a few features, SSDs are different
> animals.
>
> SSDs wear something like a salt-shaker. There's a fairly definite number
> of writes that they are good for, and when they are gone, the drive will
> fail. Like a salt shaker, when the salt is gone, you won't get salt any
> more no matter how you shake it.
>

In theory, SMART is supposed to be a reliable indicator of impending
"salt exhaustion". Have you had any drives "run out of salt" where SMART
did not let you know in advance? If SMART does actually perform as
expected there should be no downtime, just swap of the drive in the
array and wait for the rebuild. I'd expect the cheapest consumer drives
to fail suddenly and without warning, but I've never had cause to find
out so far...

James


Re: SSD Drives

From
Scott Marlowe
Date:
On Fri, Apr 4, 2014 at 5:29 PM, Lists <lists@benjamindsmith.com> wrote:
> On 04/02/2014 02:55 PM, Bret Stern wrote:
>>
>> Care to share the SSD hardware you're using?
>>
>> I've used none to date, and have some critical data I would like
>> to put on a development server to test with.
>>
>> Regards,
>>
>> Bret Stern
>
>
> SSDs are ridiculously cheap when you consider the performance difference. We
> saw at *least* a 10x improvement in performance going with SATA SSDs vs. 10k
> SAS drives in a messy, read/write environment (most of our tests were 20x
> or more). It's a no-brainer for us.
>
> It might be tempting to use a consumer-grade SSD due to the significant cost
> savings, but the money saved is vapor. They may be OK for a dev environment,
> but you *will* pay in downtime in a production environment. Unlike regular
> hard drives where the difference between consumer and enterprise drives is
> performance and a few features, SSDs are different animals.
>
> SSDs wear something like a salt-shaker. There's a fairly definite number of
> writes that they are good for, and when they are gone, the drive will fail.
> Like a salt shaker, when the salt is gone, you won't get salt any more no
> matter how you shake it.

The real danger with consumer drives is they don't have supercaps and
can and will therefore corrupt your data on power failure. The actual
write cycles aren't a big deal for many uses, as now even consumer
drives have very long write cycle lives.


Re: SSD Drives

From
David Rees
Date:
On Fri, Apr 4, 2014 at 5:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> The real danger with consumer drives is they don't have supercaps and
> can and will therefore corrupt your data on power failure. The actual
> write cycles aren't a big deal for many uses, as now even consumer
> drives have very long write cycle lives.

Don't forget about the Crucial M500, M550 and Samsung 840 Pro - those
all have power loss protection, though they have other drawbacks. The
Crucial drives in particular don't expose any sort of wear status
through SMART.

-Dave


Re: SSD Drives

From
David Boreham
Date:
On 4/4/2014 5:29 PM, Lists wrote:
> So, spend the money and get the enterprise class SSDs. They have come
> down considerably in price over the last year or so. Although on paper
> the Intel Enterprise SSDs tend to trail the performance numbers of the
> leading consumer drives, they have wear characteristics that mean you
> can trust them as much as you can any other drive for years, and they
> still leave spinning rust far, far behind.

Another issue to bear in mind is that SSD performance may not be
consistent over time. This is because the software on the drive that
manages where data lives in the NAND chips has to perform operations
similar to garbage collection. Drive performance may slowly decrease
over the lifetime of the drive, or worse: consumer drives may be
designed such that this GC-like activity is expected to take place "when
the drive is idle", which it may well be for much of the time, in a
laptop. However, in a server subject to a constant load, there may never
be "idle time". As a result the drive may all of a sudden decide to stop
processing host I/O operations while it reshuffles its blocks.
Enterprise drives are designed to address this problem and are specified
for longevity under a constant high workload. Performance is similarly
specified over worst-case lifetime conditions (which could explain why
consumer drives appear to be faster, at least initially).







Re: SSD Drives

From
Scott Marlowe
Date:
On Sat, Apr 5, 2014 at 9:13 AM, David Boreham <david_list@boreham.org> wrote:
> On 4/4/2014 5:29 PM, Lists wrote:
>>
>> So, spend the money and get the enterprise class SSDs. They have come down
>> considerably in price over the last year or so. Although on paper the Intel
>> Enterprise SSDs tend to trail the performance numbers of the leading
>> consumer drives, they have wear characteristics that mean you can trust them
>> as much as you can any other drive for years, and they still leave spinning
>> rust far, far behind.
>
>
> Another issue to bear in mind is that SSD performance may not be consistent
> over time. This is because the software on the drive that manages where data
> lives in the NAND chips has to perform operations similar to garbage
> collection. Drive performance may slowly decrease over the lifetime of the
> drive, or worse: consumer drives may be designed such that this GC-like
> activity is expected to take place "when the drive is idle", which it may
> well be for much of the time, in a laptop. However, in a server subject to a
> constant load, there may never be "idle time". As a result the drive may all
> of a sudden decide to stop processing host I/O operations while it
> reshuffles its blocks. Enterprise drives are designed to address this
> problem and are specified for longevity under a constant high workload.
> Performance is similarly specified over worst-case lifetime conditions
> (which could explain why consumer drives appear to be faster, at least
> initially).

Good points as well. This brings us to the area of trim support. Trim
support is fairly common on most modern-ish linux kernels. There were
some nasty data corruption bugs if you added discard to your mount
options in older kernels (the 2.6 series etc.), and one or two have been
found and squashed since then. But the real issue is that mdraid doesn't pass
down the trim commands from discard until kernel version 3.8. If
you're running on an older kernel you get no trim support with SATA
SSDs and mdraid arrays. ext3 doesn't support trim, and there are also
some known bugs for filesystems converted from ext3 to ext4.

On top of that most RAID controllers don't support any form of trim.
All of these things need to be considered when implementing SSD
storage. FusionIO drives btw, DO support / pass trim when mounted with
the discard option and running a fs that supports it like ext4.
Overprovisioning regular SSDs on either a RAID controller or older
kernels with mdraid is usually enough to keep performance up over the
life of the drive, but performance monitoring can let you know if the
drives are slowly getting slower as they're used month after month.
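
Concretely, the two usual ways to get trim on a new enough kernel look
like this (paths and devices are examples):

  # online discard via a mount option, e.g. in /etc/fstab:
  #   /dev/md0  /var/lib/pgsql  ext4  noatime,discard  0 0

  # or periodic batched trim (say, from cron), which avoids the
  # per-unlink latency cost of online discard:
  fstrim -v /var/lib/pgsql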


Re: SSD Drives

From
John R Pierce
Date:
On 4/5/2014 8:13 AM, David Boreham wrote:
> On 4/4/2014 5:29 PM, Lists wrote:
>> So, spend the money and get the enterprise class SSDs. They have come
>> down considerably in price over the last year or so. Although on
>> paper the Intel Enterprise SSDs tend to trail the performance numbers
>> of the leading consumer drives, they have wear characteristics that
>> mean you can trust them as much as you can any other drive for years,
>> and they still leave spinning rust far, far behind.
>
> Another issue to bear in mind is that SSD performance may not be
> consistent over time. This is because the software on the drive that
> manages where data lives in the NAND chips has to perform operations
> similar to garbage collection. Drive performance may slowly decrease
> over the lifetime of the drive, or worse: consumer drives may be
> designed such that this GC-like activity is expected to take place
> "when the drive is idle", which it may well be for much of the time,
> in a laptop. However, in a server subject to a constant load, there
> may never be "idle time". As a result the drive may all of a sudden
> decide to stop processing host I/O operations while it reshuffles its
> blocks. Enterprise drives are designed to address this problem and are
> specified for longevity under a constant high workload. Performance is
> similarly specified over worst-case lifetime conditions (which could
> explain why consumer drives appear to be faster, at least initially).

My experience has been that consumer SSDs used in a high-usage desktop
environment are about twice as slow after a year as they were brand
new.  I note my current desktop system has written 15TB total onto my
250GB drive in about 16 months.  The SMART wear leveling count
suggests the drive has 91% of its useful life left.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: SSD Drives

From
Vick Khera
Date:
On Thu, Apr 3, 2014 at 4:00 PM, John R Pierce <pierce@hogranch.com> wrote:
> an important thing in getting decent wear leveling life with SSDs is to keep
> them under about 70% full.

You have to do that at provisioning time on the drive. I.e., once you
layer a file system on it, the drive doesn't know what's "empty" and
what's not; you have to tell it beforehand to expose only X% to the
system, and keep the rest for wear leveling. I don't know the tools
for doing it, as my vendor takes care of that for me.
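
For what it's worth, one generic way to do it on Linux is hdparm's
host-protected-area support, applied once before the drive is ever
partitioned (a sketch; the device and sector count are made-up examples,
and many vendors ship their own provisioning tools instead):

  # show the current and native max sector counts
  hdparm -N /dev/sdb

  # permanently cap the visible size at roughly 80% of native; the 'p'
  # prefix makes the new limit persist across power cycles
  hdparm -N p625142448 /dev/sdb

Simply leaving part of a freshly secure-erased drive unpartitioned gets
you much the same effect.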