Thread: Solid State Drives with PG (was: in RAM DB)

Solid State Drives with PG (was: in RAM DB)

From: Alan McKay

> Have you considered using one of these:
> http://www.acard.com/english/fb01-product.jsp?idno_no=270&prod_no=ANS-9010&type1_title=Solid State Drive&type1_idno=13

We did some research which suggested that performance may not be so
great with them because the PG engine is not optimized to utilize
those drives.

So, I'll change the subject line to see if anyone has experience using these.


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: Solid State Drives with PG (was: in RAM DB)

From: Merlin Moncure

On Fri, Mar 26, 2010 at 10:32 AM, Alan McKay <alan.mckay@gmail.com> wrote:
>> Have you considered using one of these:
>> http://www.acard.com/english/fb01-product.jsp?idno_no=270&prod_no=ANS-9010&type1_title=Solid State Drive&type1_idno=13
>
> We did some research which suggested that performance may not be so
> great with them because the PG engine is not optimized to utilize
> those drives.
>
> So, I'll change the subject line to see if anyone has experience using these.

postgres works fine with flash SSD, understanding that:
*) postgres disk block is 8k and ssd erase block is much larger (newer
ssd controllers minimize this penalty though)
*) many flash drives cheat and buffer writes to delay full sync, for
performance reasons and to extend the life of the drive
*) if you have a relatively small database, the big 'win' off SSD,
fast random reads, is of little/no use because the o/s will buffer the
database in ram anyway.
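
To put a rough number on the first point: a minimal sketch of the worst
case, where every 8k page write forces the drive to rewrite a whole erase
block. The 512 kB erase block size is an assumption for illustration only;
real drives differ, and newer controllers avoid much of this penalty.

```python
# Worst-case write amplification when a small database page update forces
# the drive to rewrite a whole erase block. Sizes are illustrative assumptions.
PG_BLOCK_BYTES = 8 * 1024        # postgres disk block size, as stated above
ERASE_BLOCK_BYTES = 512 * 1024   # ASSUMED erase block size; varies by drive


def worst_case_write_amplification(page_bytes, erase_block_bytes):
    """Bytes physically rewritten per byte logically written, assuming every
    page write triggers a read-modify-write of one full erase block."""
    if page_bytes <= 0 or erase_block_bytes < page_bytes:
        raise ValueError("expected 0 < page_bytes <= erase_block_bytes")
    return erase_block_bytes / page_bytes


print(worst_case_write_amplification(PG_BLOCK_BYTES, ERASE_BLOCK_BYTES))
```

With these assumed sizes the ratio is 64; it is an upper bound for a naive
controller, not a measurement of any particular drive.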

The ideal candidate for flash SSD from database point of view is one
who is having I/O problems coming from OLTP type activity forcing the
disks  to constantly seek all over the place to write and (especially)
read data.  This happens when your database grows to the point when
its OPERATIONAL (that is, frequently used) data size exceeds ram to a
certain extent and o/s buffering of reads starts to become less
effective.  This can crush database performance.

flash SSD 'fixes' this problem because relative to a disk head seek
the cost of random read i/o on flash is basically zero.  however flash
has some problems writing, such that you get to choose between
volatility of data (irrespective of fsync) or lousy performance.  So
flash isn't yet a general purpose database solution, and won't be until
the write performance problem is fixed in a way that doesn't
compromise on volatility.  If/when that happens, and there isn't a
huge price premium to pay vs flash prices today, all my new servers
will be spec'd with flash :-).
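
For a sense of scale on the seek cost, here is a back-of-envelope sketch of
random read IOPS for a single spinning disk. The seek time and rotational
speed are assumed figures, not measurements; substitute the numbers from a
real data sheet.

```python
def disk_random_iops(avg_seek_ms, rpm):
    """Approximate random IOPS for one spindle: each request pays an average
    seek plus, on average, half a rotation of latency."""
    half_rotation_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + half_rotation_ms)


# ASSUMED figures for a 15,000 rpm drive with a 3.5 ms average seek.
print(round(disk_random_iops(avg_seek_ms=3.5, rpm=15000)))
```

A couple of hundred random reads per second per spindle is the ceiling this
model gives, which is why a working set that outgrows ram hurts so much on
spinning disks.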

merlin

Re: Solid State Drives with PG

From: Greg Smith

Merlin Moncure wrote:
> So flash isn't yet a general purpose database solution, and won't be until
> the write performance problem is fixed in a way that doesn't
> compromise on volatility.

Flash drives that ship with a supercapacitor large enough to ensure
orderly write cache flushing in the event of power loss seem to be the
only solution anyone is making progress on for this right now.  That
would turn them into something even better than the traditional
approach of using regular disk with a battery-backed write caching
controller.  Given the relatively small write cache involved and the
fast write speed, it's certainly feasible to just flush at power loss
every time rather than what the BBWC products do--recover once power
comes back.
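
A rough way to see whether a write cache sits between fsync and the media
is to time a loop of small synchronous writes. The sketch below is only a
hint, not a durability test: the function name and the 8 kB block size are
choices made here for illustration, and a high rate on a drive with no
battery or capacitor protection suggests the cache is absorbing the writes.
Only an actual power-loss test shows whether the data survives.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, count=200):
    """Overwrite one 8 kB block and fsync it `count` times; return the rate."""
    block = b"\0" * 8192
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to flush to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point this at a directory on the drive under test; the system temp
    # directory is only a placeholder and may be a RAM-backed filesystem.
    print(f"{fsyncs_per_second(tempfile.gettempdir()):.0f} fsyncs/sec")
```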

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Solid State Drives with PG

From: Merlin Moncure

On Fri, Mar 26, 2010 at 2:32 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> So flash isn't yet a general purpose database solution, and won't be until
>> the write performance problem is fixed in a way that doesn't
>> compromise on volatility.
>
> Flash drives that ship with a supercapacitor large enough to ensure orderly
> write cache flushing in the event of power loss seem to be the only solution
> anyone is making progress on for this right now.  That would turn them into
> something even better than the traditional approach of using regular
> disk with a battery-backed write caching controller.  Given the relatively
> small write cache involved and the fast write speed, it's certainly feasible
> to just flush at power loss every time rather than what the BBWC products
> do--recover once power comes back.

right -- unfortunately there is likely going to be a fairly high cost
premium on these devices for a good while yet.  right now afaik you
only see this stuff on boutique type devices...yeech.  I have to admit
until your running exposé on this stuff I was led to believe by a few
companies (especially Intel) that flash storage technology was a few
years ahead of where it really was -- it's going to take me a long
time to forgive them for that!

put another way (are you listening intel?): _NO_ drive should be
positioned to the server/enterprise market that does not honor fsync
by default unless it is very clearly documented!  This is forgivable
for a company geared towards the consumer market...but Intel...ugh!

merlin

Re: Solid State Drives with PG

From: Brad Nicholson

On Fri, 2010-03-26 at 15:27 -0400, Merlin Moncure wrote:
> On Fri, Mar 26, 2010 at 2:32 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> > Merlin Moncure wrote:
> >>
> >> So flash isn't yet a general purpose database solution, and won't be until
> >> the write performance problem is fixed in a way that doesn't
> >> compromise on volatility.
> >
> > Flash drives that ship with a supercapacitor large enough to ensure orderly
> > write cache flushing in the event of power loss seem to be the only solution
> > anyone is making progress on for this right now.  That would turn them into
> > something even better than the traditional approach of using regular
> > disk with a battery-backed write caching controller.  Given the relatively
> > small write cache involved and the fast write speed, it's certainly feasible
> > to just flush at power loss every time rather than what the BBWC products
> > do--recover once power comes back.
>
> right -- unfortunately there is likely going to be a fairly high cost
> premium on these devices for a good while yet.  right now afaik you
> only see this stuff on boutique type devices...yeech.

TMS RamSan products have more than adequate capacitor power to handle
failure cases.  They look like a very solid product.  In addition to
this, they have internal RAID across the chips to protect against chip
failure. Wear-leveling is controlled on the board instead of offloaded
to the host.  I haven't gotten my hands on one yet, but should at some
point in the not too distant future.

I'm not sure what the price point is though.  But when you factor in the
cost of the products they are competing against from a performance
perspective, I'd be surprised if they aren't a lot cheaper.  Especially
when figuring in all the other costs that go along with disk arrays -
power, cooling, rack space costs.

Depends on your vantage point I guess.  I'm looking at these as
potential alternatives to some high end, expensive storage products, not
a cheap way to get really fast disk.
--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: Solid State Drives with PG

From: Merlin Moncure

On Fri, Mar 26, 2010 at 3:43 PM, Brad Nicholson
<bnichols@ca.afilias.info> wrote:
> I'm not sure what the price point is though.

here is a _used_ 320gb ramsan for 15k :-).  dram storage is pricey.

merlin

Re: Solid State Drives with PG

From: Vick Khera

On Fri, Mar 26, 2010 at 3:50 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> here is a _used_ 320gb ramsan for 15k :-).  dram storage is pricey.
>

I think using DRAM as the base is way better than flash.  Just use the
flash or a regular disk as the backup with a battery to power the
backup operation.

I have in my storage room a DRAM based SCSI storage device made by
Imperial Technology.  It was totally the bees knees in 2000 when I
bought it (with 1GB of RAM) for almost $30k.  Upgraded a year later to
5GB for another $15k.  It has 4 low-profile/offset SCSI-2 connectors
and full battery backed up UPS internal to it, and writes itself to a
traditional disk drive on power outage, and continually ran self
diagnostics to ensure that everything was just right.

Free.  But it doesn't power up.  Probably needs a cap replaced or
something simple like that.

Re: Solid State Drives with PG

From: Merlin Moncure

On Wed, Apr 7, 2010 at 3:27 PM, Vick Khera <vivek@khera.org> wrote:
> On Fri, Mar 26, 2010 at 3:50 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> here is a _used_ 320gb ramsan for 15k :-).  dram storage is pricey.
>>
>
> I think using DRAM as the base is way better than flash.  Just use the
> flash or a regular disk as the backup with a battery to power the
> backup operation.
>
> I have in my storage room a DRAM based SCSI storage device made by
> Imperial Technology.  It was totally the bees knees in 2000 when I
> bought it (with 1GB of RAM) for almost $30k.  Upgraded a year later to
> 5GB for another $15k.  It has 4 low-profile/offset SCSI-2 connectors
> and full battery backed up UPS internal to it, and writes itself to a
> traditional disk drive on power outage, and continually ran self
> diagnostics to ensure that everything was just right.
>
> Free.  But it doesn't power up.  Probably needs a cap replaced or
> something simple like that.

dram storage makes sense in some cases but is generally so expensive
that it throws off the whole hardware cost/engineering calculus even
with the insane expense of writing software (even to the 0.0001% of IT
managers that understand this).  that's saying something.

the idea behind flash storage though was to provide at least decent
performance at a reasonable cost.  making dram storage fault tolerant
takes a lot of engineering thus the high cost.  as a dba, the idea of
flash being able to be swapped in for sata spinning drives for a
10-20x gain in iops makes me vibrate.

except that the fault tolerance issue isn't worked out yet.  so I
continue to buy bulk fossilized dinosaur plop and waste precious time
figuring out how to make it work with otherwise fairly modern
equipment.   did i mention that i was annoyed with intel?

check out their faq entry on ssd/write back cache:

Does the Intel SSD have a write cache?
Yes. However data caching is limited to the controller for enhanced performance.

huh!?

merlin

Re: Solid State Drives with PG

From: Vick Khera

On Wed, Apr 7, 2010 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> except that the fault tolerance issue isn't worked out yet.

Yep. I do not want to be the guy doing the product testing to see if
they're suitable for a high-write DB load.

Re: Solid State Drives with PG

From: John R Pierce

Vick Khera wrote:
> On Wed, Apr 7, 2010 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> except that the fault tolerance issue isn't worked out yet.
>>
>
> Yep. I do not want to be the guy doing the product testing to see if
> they're suitable for a high-write DB load.
>
>

all the enterprise SAN guys I've talked with say the Intel x25 drives
are consumer junk, about the only thing they will use is STEC Zeus, and
even then they mirror them.   These are SAS or FC, not SATA, so write
barriers are well behaved (assuming your OS doesn't toss them like
<cough>LVM</cough>)





Re: Solid State Drives with PG

From: Gordan Bobic

John R Pierce wrote:

> all the enterprise SAN guys I've talked with say the Intel x25 drives
> are consumer junk, about the only thing they will use is STEC Zeus, and
> even then they mirror them.

A couple of points there.

1) Mirroring flash drives is a bit ill advised since flash has a rather
predictable long-term wear-out failure point. It would make more sense
to mirror with a mechanical disk and use the SSD for reads, with some
clever firmware to buffer up the extra writes to the mechanical disk and
return completed status as soon as the data has been committed to the
faster flash disk.

2) How much of that dislike of Intel is actually justified by something
other than the margins offered / procurement policy (a.k.a. buying from
the vendor that sends you the best present rather than from the vendor
that has the best product)? Intel X25-E drives have write endurance,
performance and power consumption (150mW TDP!) that are at least as good
as other enterprise grade drives. Most enterprise grade drives don't
even have trim support yet. I wouldn't knock Intel drives until you've
tried them. Also bear in mind that Intel X25-E drives have high street
prices similar to similar sized 15,000rpm mechanical drives you might
buy from a SAN vendor (of course, same drives without the re-badge can
be had for a fraction of the price).

Then again, I never did have a very high opinion of big name SAN vendor
hardware - I have always achieved better results at a fraction of the
cost with appliances I've built myself.

Gordan

Re: Solid State Drives with PG

From: Greg Smith

Gordan Bobic wrote:
> How much of that dislike of Intel is actually justified by something
> other than the margins offered / procurement policy (a.k.a. buying
> from the vendor that sends you the best present rather than from the
> vendor that has the best product)? Intel X25-E drives have write
> endurance, performance and power consumption (150mW TDP!) that are at
> least as good as other enterprise grade drives.

Please; there is nobody bashing Intel here who gives a damn about vendor
payola in any direction.  Intel's drives are not suitable for enterprise
database use because their write cache policy both fails testing and
isn't documented properly to figure out how to work around its
limitations (if that's even possible).  That's the end of the story; if
your drive gets corrupted and you lose your database, it doesn't matter
how good any of the other things you mention are.

> Then again, I never did have a very high opinion of big name SAN
> vendor hardware - I have always achieved better results at a fraction
> of the cost with appliances I've built myself.

If you're not testing write cache durability under harsh conditions like
a power plug pull, you're not doing a fair comparison.  SAN hardware
should include good behavior under such situations, it's part of what
you're paying for, while many cheaper solutions do not.  It's
straightforward to beat the performance of a SAN, but what makes people
buy them anyway is their ruggedness under really bad failure conditions
that direct-attached storage can struggle with.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Solid State Drives with PG

From: Yeb Havinga

Gordan Bobic wrote:
> John R Pierce wrote:
>
>> all the enterprise SAN guys I've talked with say the Intel x25 drives
>> are consumer junk, about the only thing they will use is STEC Zeus,
>> and even then they mirror them.
>
> A couple of points there.
>
> 1) Mirroring flash drives is a bit ill advised since flash has a
> rather predictable long-term wear-out failure point. It would make
> more sense to mirror with a mechanical disk and use the SSD for reads,
> with some clever firmware to buffer up the extra writes to the
> mechanical disk and return completed status as soon as the data has
> been committed to the faster flash disk.
Interesting, a few days ago I read something in the mdadm man page about
an option for mirroring over 'slower' links, and was waiting for a proper
use case/excuse to go playing with it ;-) (looking up again)...

       -W, --write-mostly
              subsequent devices listed in a --build, --create, or --add
              command will be flagged as 'write-mostly'.  This is valid for
              RAID1 only and means that the 'md' driver will avoid reading
              from these devices if at all possible.  This can be useful if
              mirroring over a slow link.
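
As a sketch of how that option might be applied to an SSD plus mechanical
disk mirror, the snippet below only builds and prints an mdadm command line;
it does not run it. The device names are hypothetical placeholders. Note
that write-mostly only steers reads toward the fast device: writes still go
to both, so this covers the "use the SSD for reads" half of the idea, not
the early write acknowledgement.

```python
import shlex


def mdadm_mirror_command(md_device, fast_device, slow_device):
    """Build (but do not run) an mdadm command for a two-way RAID1 in which
    the slow device is flagged write-mostly, so reads prefer the fast one."""
    return [
        "mdadm", "--create", md_device,
        "--level=1", "--raid-devices=2",
        fast_device,
        "--write-mostly", slow_device,  # flag applies to devices listed after it
    ]


# HYPOTHETICAL device names; substitute real ones before running anything.
print(shlex.join(mdadm_mirror_command("/dev/md0", "/dev/sdX", "/dev/sdY")))
```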

regards,
Yeb Havinga