Re: Capacitors, etc., in hard drives and SSD for DBMS machines... - Mailing list pgsql-performance
From | Wes Vaske (wvaske) |
---|---|
Subject | Re: Capacitors, etc., in hard drives and SSD for DBMS machines... |
Date | |
Msg-id | 1467989418640.93960@micron.com |
In response to | Re: Capacitors, etc., in hard drives and SSD for DBMS machines... (Levente Birta <blevi.linux@gmail.com>) |
List | pgsql-performance |
> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?

When we're discussing SSD power loss protection, it's not a question of how long the drive can stay up but whether data at rest or data in flight are going to be lost/corrupted in the event of a power loss.

There are a couple of big reasons for this.

1. NAND write latency is actually somewhat poor.

SSDs are composed of NAND chips, DRAM for cache, and the controller. If the SSD disabled its disk cache, the write latencies under moderate load would move from the sub-100-microsecond range to the 1-10 millisecond range. This is due to how the SSD writes to NAND. A single write operation takes a fairly large amount of time, but large blocks can be written as a single operation.

2. Garbage Collection

If you're not familiar with GC, I definitely recommend reading up, as it's one of the defining characteristics of SSDs (and now SMR HDDs). The basic principle is that SSDs don't support in-place modification of a page (8KB). Instead, the contents would need to be erased then written. Additionally, the slices of the chip that can be read, written, or erased are not the same size for each operation. Erase blocks are much bigger than the page (e.g. 2MB vs 8KB). This means that to modify an 8KB page, the entire 2MB erase block needs to be read to the disk cache, erased, then written with the new 8KB page along with the rest of the existing data in the 2MB erase block.

This operation needs to be power loss protected (it's the operation that the Crucial drives protect against). If it's not, then the data that is read to cache could be lost or corrupted if power is lost during the operation. The data in the erase block is not necessarily related to the page being modified and could be anywhere else in the filesystem.

*IMPORTANT: This is data at rest that may have been written years prior. It is not just new data that may be lost if a GC operation cannot complete.*

TL;DR: Many SSDs will not disable disk cache even if you give the command to do so. Full Power Loss Protection at the drive level should be a requirement for any Enterprise or Data Center application to ensure no data loss or corruption of data at rest.

This is why there is so much concern with the internals of specific SSDs regarding behavior in a power loss event. It can have large impacts on the reliability of the entire system.

Wes Vaske | Senior Storage Solutions Engineer
Micron Technology

________________________________________
From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org> on behalf of Levente Birta <blevi.linux@gmail.com>
Sent: Friday, July 8, 2016 5:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Capacitors, etc., in hard drives and SSD for DBMS machines...

On 08/07/2016 13:23, Jean-David Beyer wrote:
> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?
>
> It seems to me that anyone interested in maintaining an important
> database would have suitable backup power on their entire systems,
> including the disk drives, so they could coast over any power loss.
>
> I do not have any database that important, but my machine has an APC
> Smart-UPS that has 2 1/2 hours of backup time with relatively new
> batteries in it. It is so oversize because my previous computer used
> much more power than this one does. And if my power company has a brown
> out or black out of over 7 seconds, my natural gas fueled backup
> generator picks up the load very quickly.
>
> Am I overlooking something?
>

UPS-es can fail too ... :)

And so many things can happen ... once I unplugged the power cord from the UPS that powered the database server (which was a production server) ... I thought it was powering something else :) but lucky me ... the controller was flash-backed

-- Levi

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
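
For readers who want to see the garbage-collection point from Wes's reply in concrete terms, here is a minimal Python sketch of the read, erase, write-back cycle as that reply describes it. It is a toy model only: the 2MB and 8KB sizes are the example figures from the post, while `modify_page`, `PowerLoss`, and the `power_fails` / `protected` flags are hypothetical names invented for illustration and do not correspond to any real drive firmware or API.

```python
ERASE_BLOCK_BYTES = 2 * 1024 * 1024  # 2MB erase block (example figure from the post)
PAGE_BYTES = 8 * 1024                # 8KB page (example figure from the post)
PAGES_PER_BLOCK = ERASE_BLOCK_BYTES // PAGE_BYTES  # pages rewritten per one-page change


class PowerLoss(Exception):
    """Raised by the toy model when power is cut mid-operation (hypothetical)."""


def modify_page(block, index, new_data, power_fails=False, protected=False):
    """Toy read-modify-write of one page inside an erase block.

    `block` is a list of page contents; None means the page is erased.
    `protected=True` stands in for drive-level power loss protection that
    lets the operation finish even when `power_fails` is set.
    """
    cache = list(block)              # 1. read the whole erase block into volatile cache
    for i in range(len(block)):      # 2. erase the block on NAND
        block[i] = None
    if power_fails and not protected:
        # The cache contents vanish with the power; the block stays erased.
        raise PowerLoss("cache lost before write-back")
    cache[index] = new_data          # 3. apply the single-page change in cache
    block[:] = cache                 # 4. write the full block back to NAND
    return block


def pages_lost(block):
    """Count pages that are erased, i.e. data at rest that no longer exists."""
    return sum(page is None for page in block)
```

In this model, calling `modify_page(block, 3, "new", power_fails=True)` on an unprotected block raises `PowerLoss` and leaves every page in the block erased, including pages unrelated to the one being changed. With `protected=True` the same call completes and only page 3 differs.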