Re: Capacitors, etc., in hard drives and SSD for DBMS machines... - Mailing list pgsql-performance

From Wes Vaske (wvaske)
Subject Re: Capacitors, etc., in hard drives and SSD for DBMS machines...
Date
Msg-id 1467989418640.93960@micron.com
In response to Re: Capacitors, etc., in hard drives and SSD for DBMS machines...  (Levente Birta <blevi.linux@gmail.com>)
List pgsql-performance
> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?

When we're discussing SSD power loss protection, it's not a question of how long the drive can stay up but whether data
at rest or data in flight is going to be lost or corrupted in the event of a power loss.

There are a couple of big reasons for this.

1. NAND write latency is actually somewhat poor.

SSDs are composed of NAND chips, DRAM for cache, and the controller. If the SSD disabled its disk cache, write
latencies under moderate load would move from the sub-100-microsecond range to the 1-10 millisecond range. This is due
to how the SSD writes to NAND. A single write operation takes a fairly large amount of time, but large blocks can be
written as a single operation.


2. Garbage Collection

If you're not familiar with GC, I definitely recommend reading up on it, as it's one of the defining characteristics of
SSDs (and now SMR HDDs). The basic principle is that SSDs don't support modifying a page (8KB) in place. Instead, the
contents would need to be erased and then written. Additionally, the slices of the chip that can be read, written, or
erased are not the same size for each operation. Erase blocks are much bigger than a page (e.g., 2MB vs 8KB). This means
that to modify an 8KB page, the entire 2MB erase block needs to be read into the disk cache, erased, then written back
with the new 8KB page along with the rest of the existing data in the 2MB erase block.
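
As a rough illustration of the read-modify-write cycle described above, here is a small Python sketch that models one
erase block using the example sizes from this message (8KB pages, 2MB erase blocks). The names EraseBlock and
modify_page are hypothetical, and real controllers use far more elaborate mapping schemes; the sketch only shows how
much data has to move for one small update.

```python
PAGE_SIZE = 8 * 1024                 # 8KB page, as in the example above
ERASE_BLOCK_SIZE = 2 * 1024 * 1024   # 2MB erase block, as in the example above
PAGES_PER_BLOCK = ERASE_BLOCK_SIZE // PAGE_SIZE


class EraseBlock:
    """Hypothetical model of a single NAND erase block."""

    def __init__(self):
        # Start with existing data (all zero bytes) already stored in every page.
        self.pages = [bytes(PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.bytes_programmed = 0

    def erase(self):
        # Erase works only on the whole block, never on a single page.
        self.pages = [None] * PAGES_PER_BLOCK

    def program(self, index, data):
        # A page can only be written while it is in the erased state.
        if self.pages[index] is not None:
            raise ValueError("page must be erased before it is programmed")
        self.pages[index] = data
        self.bytes_programmed += len(data)


def modify_page(block, index, new_data):
    """Change one page by rewriting the whole erase block."""
    cache = list(block.pages)      # 1. read the entire block into the cache
    cache[index] = new_data        # 2. apply the 8KB change in the cache
    block.erase()                  # 3. erase the entire block
    for i, data in enumerate(cache):
        block.program(i, data)     # 4. write every page back
    return block.bytes_programmed


block = EraseBlock()
written = modify_page(block, 3, b"\x01" * PAGE_SIZE)
write_amplification = written / PAGE_SIZE
print(PAGES_PER_BLOCK, written, write_amplification)
```

In this simplified model, one 8KB update causes all 256 pages in the block to be rewritten, and between steps 3 and 4
the only copy of the other 255 pages is the one held in cache.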

This operation needs to be power loss protected (it's the operation that the Crucial drives protect). If it's
not, then the data that is read to cache could be lost or corrupted if power is lost during the operation. The data in
the erase block is not necessarily related to the page being modified and could be anywhere else in the filesystem.
*IMPORTANT: This is data at rest that may have been written years prior. It is not just new data that may be lost if a
GC operation cannot complete.*
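
To make the data-at-rest risk concrete, here is a toy Python simulation of power being cut between the erase and the
write-back. Everything in it is hypothetical (the PowerLoss exception, rewrite_block, and the three-page "block"), and
it assumes that a protected drive holds enough reserve energy to finish flushing its cache, which is a simplification
of what real firmware does.

```python
class PowerLoss(Exception):
    """Raised to simulate power being cut in the middle of the operation."""


def rewrite_block(nand, index, new_data, fail_after_erase=False, protected=False):
    """Read-modify-write of one erase block, modelled as a list of pages.

    nand is mutated in place and stands for what is persisted on the chip.
    """
    cache = list(nand)            # volatile copy of the whole block
    cache[index] = new_data       # apply the change in cache
    for i in range(len(nand)):    # erase the whole block on NAND
        nand[i] = None
    if fail_after_erase:
        if protected:
            # Assumption: reserve energy lets the drive flush the cache.
            nand[:] = cache
        raise PowerLoss()
    nand[:] = cache               # normal write-back


def run(protected):
    # Pages A and C are old data at rest; only page B is being modified.
    nand = ["old row A", "old row B", "old row C"]
    try:
        rewrite_block(nand, 1, "new row B",
                      fail_after_erase=True, protected=protected)
    except PowerLoss:
        pass
    return nand


print(run(protected=False))
print(run(protected=True))
```

Without protection the model loses all three pages, including the two that were never being modified. With protection
the block ends up with the new page and the untouched old pages intact.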


TL;DR: Many SSDs will not disable disk cache even if you give the command to do so. Full Power Loss Protection at the
drive level should be a requirement for any Enterprise or Data Center application to ensure no loss or corruption
of data at rest.


This is why there is so much concern about the internals of specific SSDs regarding behavior in a power loss event. It
can have a large impact on the reliability of the entire system.


Wes Vaske | Senior Storage Solutions Engineer
Micron Technology

________________________________________
From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org> on behalf of Levente Birta
<blevi.linux@gmail.com>
Sent: Friday, July 8, 2016 5:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Capacitors, etc., in hard drives and SSD for DBMS machines...

On 08/07/2016 13:23, Jean-David Beyer wrote:
> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?
>
> It seems to me that anyone interested in maintaining an important
> database would have suitable backup power on their entire systems,
> including the disk drives, so they could coast over any power loss.
>
> I do not have any database that important, but my machine has an APC
> Smart-UPS that has 2 1/2 hours of backup time with relatively new
> batteries in it. It is so oversize because my previous computer used
> much more power than this one does. And if my power company has a brown
> out or black out of over 7 seconds, my natural gas fueled backup
> generator picks up the load very quickly.
>
> Am I overlooking something?
>

UPS-es can fail too ... :)

And so many things could happen ... once I plugged out the power cord
from the UPS which powered the database server (which was a production
server) ... I thought powering something else :)
but lucky me ... the controller was flash backed



--
            Levi



