Re: Reliability with RAID 10 SSD and Streaming Replication - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Reliability with RAID 10 SSD and Streaming Replication
Date
Msg-id CAHyXU0ymFT7Sqw77yPryGWKEpjV4Wv9Vkc21vGtPKo0GZWGg8A@mail.gmail.com
In response to Re: Reliability with RAID 10 SSD and Streaming Replication  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: Reliability with RAID 10 SSD and Streaming Replication
List pgsql-performance
On Tue, May 21, 2013 at 7:19 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 5/20/13 6:32 PM, Merlin Moncure wrote:
>
>> When it comes to databases, particularly in the open source postgres
>> world, hard drives are completely obsolete.  SSD are a couple of
>> orders of magnitude faster and this (while still slow in computer
>> terms) is fast enough to put storage into the modern era by anyone
>> who is smart enough to connect a sata cable.
>
>
> You're skirting the edge of vendor Kool-Aid here.  I'm working on a very
> detailed benchmark vs. real world piece centered on Intel's 710 models, one
> of the few reliable drives on the market.  (Yes, I have a DC S3700 too, just
> not as much data yet)  While in theory these drives will hit two orders of
> magnitude speed improvement, and I have benchmarks where that's the case, in
> practice I've seen them deliver less than 5X better too.  You get one guess
> which I'd consider more likely to happen on a difficult database server
> workload.
>
> The only really huge gain to be had using SSD is commit rate at a low client
> count.  There you can easily do 5,000/second instead of a spinning disk that
> is closer to 100, for less than what the battery-backed RAID card alone
> costs to speed up mechanical drives.  My test server's 100GB DC S3700 was
> $250.  That's still not two orders of magnitude faster though.

That's most certainly *not* the only gain to be had: random read rates
of large databases (a very important metric for data analysis) can
easily hit 20k tps.  So I'll stand by the figure.  Another point: that
5,000/second commit rate is sustained, whereas a RAID card will degrade
spectacularly once the cache overflows; it's not fair to compare burst
with sustained performance.  To hit a sustained 5,000/second commit rate
along with good random read performance, you'd need a very expensive
storage system.  Right now I'm working (not by choice) with a tier-1
storage system (let's just say it rhymes with 'weefax') and I would
trade it for direct-attached SSD in a heartbeat.
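
To make the burst-versus-sustained point concrete, here is a toy model
in Python, not a measurement.  It treats a write-back cache as a bucket
that accepts commits until it is full and drains at the speed of the
disks behind it.  The 5,000/second and 100/second rates come from the
quoted message above; the cache capacity of 20,000 commits and the
function name are assumptions made up for illustration.

```python
def commits_per_second(seconds, incoming_rate, cache_capacity, drain_rate):
    """Toy model: commits acknowledged in each second by a write-back cache.

    incoming_rate  -- commits/s the clients try to issue
    cache_capacity -- commits the cache can hold (hypothetical figure)
    drain_rate     -- commits/s the backing storage can absorb
    """
    cached = 0
    acked = []
    for _ in range(seconds):
        # The cache flushes to the backing storage, freeing room.
        cached = max(0, cached - drain_rate)
        # New commits are acknowledged only while free cache space remains.
        accepted = min(incoming_rate, cache_capacity - cached)
        cached += accepted
        acked.append(accepted)
    return acked


if __name__ == "__main__":
    # Cache in front of spinning disks: fast at first, then throttled.
    print(commits_per_second(10, 5000, 20000, 100))
    # Storage that can itself absorb the full rate: no drop-off.
    print(commits_per_second(10, 5000, 20000, 5000))
```

In this model the cached setup acknowledges the full incoming rate only
until the cache fills, then falls to the drain rate of the disks behind
it, while storage that can absorb the full rate stays flat.  A short
benchmark run would only ever see the first phase.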

Also, note that 3rd party benchmarking is showing the 3700 completely
smoking the 710 in database workloads (for example, see
http://www.anandtech.com/show/6433/intel-ssd-dc-s3700-200gb-review/6).

Anyways, SSD installation in the post-capacitor era has been 100.0%
correlated in my experience (admittedly, around a dozen or so systems)
with removal of storage as the primary performance bottleneck, and
I'll stand by that.  I'm not claiming to work with extremely high
transaction rate systems but then again neither are most of the people
reading this list.  Disk drives are obsolete for database
installations.

merlin

