Thread: Interesting read on SCM upending software and hardware architecture

From
Jim Nasby
Date:
https://queue.acm.org/detail.cfm?id=2874238 discusses how modern Storage
Class Memory (SCM), such as PCIe SSDs and NVDIMMs, is completely upending
every assumption made about storage. To put this in perspective, you can
now see storage latency that is practically on par with things like lock
acquisition[1].

Presumably the bulk of this difference should be handled by the OS, but
there are probably things we should be considering too:

Tiered storage will become common. That's going to make avoiding things 
like bulk scans even more important. There's a tie-in to partitioning 
and indexes too.

The days of a SAN may be over. With memory, network and storage latency 
approaching parity it's not practical to concentrate any of these 
resources; that creates a bottleneck. This means people will be even 
more resistant to the idea of a single database server.

The cost of temporary data becomes much lower. At some point it probably 
makes sense to just mmap what's needed and move on.
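
A minimal sketch of what "just mmap what's needed" could look like for
scratch data, written in Python for brevity. The function name and the
idea of pointing `directory` at an SCM-backed mount are assumptions made
for illustration; nothing here reflects how PostgreSQL handles temporary
files.

```python
import mmap
import tempfile
from typing import Optional


def scratch_roundtrip(payload: bytes, directory: Optional[str] = None) -> bytes:
    """Store payload in an mmap'd temporary file and read it back.

    `directory` is a hypothetical knob: point it at a mount backed by
    fast non-volatile storage to keep scratch data off slower tiers.
    """
    if not payload:
        # mmap cannot map a zero-length region of an empty file.
        raise ValueError("payload must be non-empty")
    with tempfile.TemporaryFile(dir=directory) as f:
        f.truncate(len(payload))              # size the backing file first
        with mmap.mmap(f.fileno(), len(payload)) as m:
            m[:] = payload                    # plain memory writes, no write() calls
            return bytes(m[:])                # plain memory reads


if __name__ == "__main__":
    print(scratch_roundtrip(b"sorted run 0001"))
```

The temporary file is removed automatically when the `with` block exits,
which matches the "and move on" part of the idea.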

Fortunately, I think our traditional reliance on OS caching has helped 
us... to some degree we've always treated storage as fast because a lot 
of requests would be coming from RAM anyway.

[1] OK, 25x isn't exactly on par, but considering this used to be
25,000x... "At 100K IOPS for a uniform random workload, a CPU has 
approximately 10 microseconds to process an I/O request. Because today's 
SCMs are often considerably faster at processing sequential or read-only 
workloads, this can drop to closer to 2.5 microseconds on commodity 
hardware. Even worse, since these requests usually originate from a 
remote source, network devices have to be serviced at the same rate, 
further reducing the available per-request processing time. To put these 
numbers in context, acquiring a single uncontested lock on today's 
systems takes approximately 20ns, while a non-blocking cache 
invalidation can cost up to 100ns, only 25x less than an I/O operation."
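
The arithmetic behind those figures can be checked with a few lines of
Python. This is only a restatement of the numbers quoted above (100K
IOPS, 2.5 microseconds, 20ns, 100ns); the function and variable names are
made up for the example, and nothing is measured.

```python
def budget_ns(iops: float) -> float:
    """Nanoseconds available per request if the CPU must keep up with `iops`."""
    return 1e9 / iops


# Figures taken from the quoted passage.
uniform_random_ns = budget_ns(100_000)   # 100K IOPS
fast_path_ns = 2_500                     # sequential or read-only workloads
lock_ns = 20                             # uncontested lock acquisition
cache_invalidation_ns = 100              # non-blocking cache invalidation

ratio_vs_invalidation = fast_path_ns / cache_invalidation_ns
ratio_vs_lock = fast_path_ns / lock_ns

if __name__ == "__main__":
    print(f"budget at 100K IOPS: {uniform_random_ns:.0f} ns")
    print(f"I/O vs cache invalidation: {ratio_vs_invalidation:.0f}x")
    print(f"I/O vs uncontested lock: {ratio_vs_lock:.0f}x")
```

So the 25x in the quote compares the 2.5 microsecond budget against a
100ns cache invalidation; against a 20ns lock the same budget is 125x.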
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Interesting read on SCM upending software and hardware architecture

From
Bruce Momjian
Date:
On Thu, Jan  7, 2016 at 02:30:06PM -0600, Jim Nasby wrote:
> https://queue.acm.org/detail.cfm?id=2874238 discusses how modern
> Storage Class Memory (SCM), such as PCIe SSDs and NVDIMMs, is
> completely upending every assumption made about storage. To put this
> in perspective, you can now see storage latency that is practically
> on par with things like lock acquisition[1].

How is this different from Fusion I/O devices, which have been around
for years?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +



Re: Interesting read on SCM upending software and hardware architecture

From
David Fetter
Date:
On Sun, Jan 17, 2016 at 11:13:33PM -0500, Bruce Momjian wrote:
> On Thu, Jan  7, 2016 at 02:30:06PM -0600, Jim Nasby wrote:
> > https://queue.acm.org/detail.cfm?id=2874238 discusses how modern
> > Storage Class Memory (SCM), such as PCIe SSDs and NVDIMMs, is
> > completely upending every assumption made about storage. To put
> > this in perspective, you can now see storage latency that is
> > practically on par with things like lock acquisition[1].
> 
> How is this different from Fusion I/O devices, which have been
> around for years?

Price.

As these things come down in price, it'll start being more and more
reasonable to treat rotating media as exotic.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Interesting read on SCM upending software and hardware architecture

From
Robert Haas
Date:
On Mon, Jan 18, 2016 at 1:44 AM, David Fetter <david@fetter.org> wrote:
> On Sun, Jan 17, 2016 at 11:13:33PM -0500, Bruce Momjian wrote:
>> On Thu, Jan  7, 2016 at 02:30:06PM -0600, Jim Nasby wrote:
>> > https://queue.acm.org/detail.cfm?id=2874238 discusses how modern
>> > Storage Class Memory (SCM), such as PCIe SSDs and NVDIMMs, is
>> > completely upending every assumption made about storage. To put
>> > this in perspective, you can now see storage latency that is
>> > practically on par with things like lock acquisition[1].
>>
>> How is this different from Fusion I/O devices, which have been
>> around for years?
>
> Price.
>
> As these things come down in price, it'll start being more and more
> reasonable to treat rotating media as exotic.

<rant>People keep predicting the death of spinning media, but I think
it's not happening anywhere near as fast as people think.
Yes, I'm writing this on a laptop with an SSD, and my personal laptop
also has an SSD, but their immediate predecessors did not, and these
are fairly expensive laptops.  And most customers I talk to are still
using spinning disks.  Meanwhile, main memory is getting so large that
even pretty significant databases can be entirely RAM-cached.  So I
tend to think that this is a lot less exciting than people who are not
me seem to think.</rant>

Now that having been said, I will not complain if vast quantities of
low-latency, high-bandwidth, non-volatile storage become available at
bargain prices.  And it very well may - eventually.  But I'm not quite
ready to break out the ticker tape just yet.  I think it will be a
while.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Interesting read on SCM upending software and hardware architecture

From
Peter Geoghegan
Date:
On Mon, Jan 18, 2016 at 12:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> <rant>People keep predicting the death of spinning media, but I think
> it's not happening anywhere near as fast as people think.
> Yes, I'm writing this on a laptop with an SSD, and my personal laptop
> also has an SSD, but their immediate predecessors did not, and these
> are fairly expensive laptops.  And most customers I talk to are still
> using spinning disks.  Meanwhile, main memory is getting so large that
> even pretty significant databases can be entirely RAM-cached.  So I
> tend to think that this is a lot less exciting than people who are not
> me seem to think.</rant>

I tend to agree that the case for SSDs as a revolutionary technology
has been significantly overstated. This recent article makes some
interesting points:

http://www.zdnet.com/article/what-we-learned-about-ssds-in-2015/

I think it's much more true that main memory scaling (in particular,
main memory capacity) has had a huge impact, but that trend appears to
now be stalling.

-- 
Peter Geoghegan



Re: Interesting read on SCM upending software and hardware architecture

From
Jim Nasby
Date:
On 1/18/16 2:47 PM, Peter Geoghegan wrote:
> On Mon, Jan 18, 2016 at 12:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> <rant>People keep predicting the death of spinning media, but I think
>> it's not happening anywhere near as fast as people think.
>> Yes, I'm writing this on a laptop with an SSD, and my personal laptop
>> also has an SSD, but their immediate predecessors did not, and these
>> are fairly expensive laptops.  And most customers I talk to are still
>> using spinning disks.  Meanwhile, main memory is getting so large that
>> even pretty significant databases can be entirely RAM-cached.  So I
>> tend to think that this is a lot less exciting than people who are not
>> me seem to think.</rant>
>
> I tend to agree that the case for SSDs as a revolutionary technology
> has been significantly overstated. This recent article makes some
> interesting points:
>
> http://www.zdnet.com/article/what-we-learned-about-ssds-in-2015/
>
> I think it's much more true that main memory scaling (in particular,
> main memory capacity) has had a huge impact, but that trend appears to
> now be stalling.

My original article doesn't talk about SSDs; it's talking about 
non-volatile memory architectures (quoted extract below). Fusion IO is 
an example of this, and if NVDIMMs become available we'll see even 
faster non-volatile performance.

To me, the most interesting point the article makes is that systems now 
need much better support for multiple classes of NV storage. I agree 
with your point that spinning rust is here to stay for a long time, 
simply because it's cheap as heck. So systems need to become much better 
at moving data between different layers of NV storage so that you're 
getting the biggest bang for the buck. That will remain critical as long 
as SCMs remain 25x more expensive than rust.
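
To make the "bang for the buck" point concrete, here is a rough Python
sketch of the blended cost per GB when only a hot fraction of the data
lives on SCM. The two prices are the per-GB figures from the article
($1.50 for SCM, $0.06 for spinning disk); the function names and the 10%
hot fraction are assumptions chosen purely for illustration.

```python
SCM_USD_PER_GB = 1.50    # article's figure for SCM
DISK_USD_PER_GB = 0.06   # article's figure for spinning disk


def price_ratio() -> float:
    """How many times more SCM costs per GB than spinning disk."""
    return SCM_USD_PER_GB / DISK_USD_PER_GB


def blended_cost_per_gb(hot_fraction: float) -> float:
    """Average cost per GB when `hot_fraction` of the data sits on SCM."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return hot_fraction * SCM_USD_PER_GB + (1.0 - hot_fraction) * DISK_USD_PER_GB


if __name__ == "__main__":
    print(f"SCM vs disk: about {price_ratio():.0f}x per GB")
    # Hypothetical split: 10% of the data is hot enough to justify SCM.
    print(f"10% on SCM: ${blended_cost_per_gb(0.10):.3f}/GB")
    print(f"all on SCM: ${blended_cost_per_gb(1.0):.3f}/GB")
```

Under that assumed split, the blended cost is a small fraction of an
all-SCM setup, which is why getting placement right matters so much.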

Quote from article:



Flash-based storage devices are not new: SAS and SATA SSDs have been 
available for at least the past decade, and have brought flash memory 
into computers in the same form factor as spinning disks. SCMs reflect a 
maturing of these flash devices into a new, first-class I/O device: SCMs 
move flash off the slow SAS and SATA buses historically used by disks, 
and onto the significantly faster PCIe bus used by more 
performance-sensitive devices such as network interfaces and GPUs. 
Further, emerging SCMs, such as non-volatile DIMMs (NVDIMMs), interface 
with the CPU as if they were DRAM and offer even higher levels of 
performance for non-volatile storage.

Today's PCIe-based SCMs represent an astounding three-order-of-magnitude 
performance change relative to spinning disks (~100K I/O operations per 
second versus ~100). For computer scientists, it is rare that the 
performance assumptions that we make about an underlying hardware 
component change by 1,000x or more. This change is punctuated by the 
fact that the performance and capacity of non-volatile memories continue 
to outstrip CPUs in year-on-year performance improvements, closing and 
potentially even inverting the I/O gap.

The performance of SCMs means that systems must no longer "hide" them 
via caching and data reduction in order to achieve high throughput. 
Unfortunately, however, this increased performance comes at a high 
price: SCMs cost 25x as much as traditional spinning disks ($1.50/GB 
versus $0.06/GB), with enterprise-class PCIe flash devices costing 
between three and five thousand dollars each. This means that the cost 
of the non-volatile storage can easily outweigh that of the CPUs, DRAM, 
and the rest of the server system that they are installed in. The 
implication of this shift is significant: non-volatile memory is in the 
process of replacing the CPU as the economic center of the datacenter.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX



Re: Interesting read on SCM upending software and hardware architecture

From
Tomasz Rybak
Date:
On Mon, 18.01.2016 at 18:55 -0600, Jim Nasby wrote:
[ cut ]
> 
> My original article doesn't talk about SSDs; it's talking about
> non-volatile memory architectures (quoted extract below). Fusion IO
> is an example of this, and if NVDIMMs become available we'll see even
> faster non-volatile performance.
>
> To me, the most interesting point the article makes is that systems
> now need much better support for multiple classes of NV storage. I
> agree with your point that spinning rust is here to stay for a long
> time, simply because it's cheap as heck. So systems need to become
> much better at moving data between different layers of NV storage so
> that you're getting the biggest bang for the buck. That will remain
> critical as long as SCMs remain 25x more expensive than rust.
>

I guess PostgreSQL is getting ready for such a world.
Parallel sequential scan, while not useful for spinning drives,
should shine with the hardware described in that article.

Add some tuning of effective_io_concurrency and we might
see some gains even without a new storage layer.
Of course, the ability to swap out the storage subsystem might
help with experimentation, but even now (OK, once 9.6 is out)
we might make use of increased I/O concurrency.
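
For readers unfamiliar with what increased I/O concurrency means in
practice: as I understand it, effective_io_concurrency controls how many
prefetch hints are kept in flight ahead of the block currently being
read. The Python sketch below illustrates that general technique with
posix_fadvise; it is not PostgreSQL's code, the function and its `depth`
parameter are hypothetical, and it assumes a Unix-like system.

```python
import os
from typing import Iterable, List


def read_blocks_with_prefetch(path: str, block_numbers: Iterable[int],
                              block_size: int = 8192, depth: int = 4) -> List[bytes]:
    """Read the given blocks in order, hinting up to `depth` blocks ahead.

    `depth` plays the role of an I/O concurrency setting in this sketch.
    """
    blocks = list(block_numbers)
    can_hint = hasattr(os, "posix_fadvise")   # not available on every platform
    out: List[bytes] = []
    fd = os.open(path, os.O_RDONLY)
    try:
        next_hint = 0
        for i, blk in enumerate(blocks):
            next_hint = max(next_hint, i + 1)
            # Tell the kernel about upcoming blocks so a fast device can
            # service several requests at once instead of one at a time.
            while can_hint and next_hint < len(blocks) and next_hint <= i + depth:
                os.posix_fadvise(fd, blocks[next_hint] * block_size,
                                 block_size, os.POSIX_FADV_WILLNEED)
                next_hint += 1
            out.append(os.pread(fd, block_size, blk * block_size))
    finally:
        os.close(fd)
    return out
```

On a spinning disk a deep prefetch queue mostly causes seeks; on a
device that handles many requests in parallel, a larger `depth` is where
the gains would be expected to come from.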

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak