Re: Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD - Mailing list pgsql-performance

From Scott Carey
Subject Re: Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Date
Msg-id 72E29316-5F4C-4A01-8924-5340DF2FAE65@richrelevance.com
In response to Re: Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-performance
On Aug 10, 2010, at 9:21 AM, Greg Smith wrote:

> Scott Carey wrote:
>> Also, the amount of data at risk in a power loss varies between
>> drives.  For Intel's drives, its a small chunk of data ( < 256K).  For
>> some other drives, the cache can be over 30MB of outstanding writes.
>> For some workloads this is acceptable
>
> No, it isn't ever acceptable.  You can expect the type of data loss you
> get when a cache fails to honor write flush calls results in
> catastrophic database corruption.  It's not "I lost the last few
> seconds";

I never said it was.

> it's "the database is corrupted and won't start" after a
> crash.

Which is sometimes acceptable.  There is NO GUARANTEE that you won't lose data, ever.  An increase in the likelihood
is an acceptable tradeoff in some situations, especially when it is small.  On ANY power loss event, with or without
battery-backed caches and such, you should proactively do a consistency check on the system.  With less reliable
hardware, that task becomes much more of a burden, and is much more likely to require restoring data from somewhere.
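One crude but common way to run such a proactive check (host and user names here are placeholders, not from the original post) is to force a full read of every table by dumping the cluster to /dev/null; corruption in heap pages surfaces as dump errors immediately, rather than weeks later:

```shell
# Read every row of every database after an unclean shutdown.
# Any torn or corrupted heap page shows up as an error here.
# Note: this exercises table data only, not indexes.
pg_dumpall --host=db1 --username=postgres > /dev/null
```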

What is the likelihood that your RAID card fails, or that the battery that reported 'good health' lasts only 5 minutes
and you lose data before power is restored?  What is the likelihood of human error?
Not that far off from the likelihood of power failure in a datacenter with redundant power.  One MUST have a DR plan.
Never assume that your perfect hardware won't fail.

> This is why we pound on this topic on this list.  A SSD that
> fails to honor flush requests is completely worthless for anything other
> than toy databases.

Overblown.  Not every DB and use case is a financial application or business-critical app.  Many are not toys at all.
Slave, read-only DBs (or simply subset tablespaces) ...

Indexes. (per application, schema)
Tables. (per application, schema)
System tables / indexes.
WAL.

Each has different reliability requirements and consequences from losing recently written data.  Less than 8K lost can be
fatal to the WAL, or to table data.  Corrupting some tablespaces is not a big deal.  Corrupting others is catastrophic.
The problem with the assertion that this hardware is worthless is that it implies that every user, every use case, is at
the far end of the reliability requirement spectrum.
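As a sketch of the separation described above (the mount point, tablespace, and table names are hypothetical), PostgreSQL lets you put rebuildable objects such as indexes on a less-durable device while the WAL and system catalogs stay on safe storage:

```sql
-- /mnt/ssd/pg is the fast-but-unsafe SSD; the cluster's data
-- directory (WAL, system catalogs) stays on battery-backed RAID.
CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg';

-- Indexes can be rebuilt from the heap, so losing this tablespace
-- after a power failure costs a REINDEX, not data.
CREATE INDEX orders_customer_idx ON orders (customer_id)
    TABLESPACE fast_ssd;
```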

Yes, that can be a critical requirement for many, perhaps most, DBs.  But there are many uses for slightly unsafe
storage systems.

> You can expect significant work to recover any
> portion of your data after the first unexpected power loss under heavy
> write load in this environment, during which you're down.  We do
> database corruption recovery at 2ndQuadrant; while I can't talk about
> the details of some recent incidents, I am not speaking theoretically
> when I warn about this.

I've done the single-user-mode recover-system-tables-by-hand thing myself at 4 AM, on a system with battery-backed RAID
10, redundant power, etc.  RAID cards die, and 10TB recovery times from backup are long.

It's a game of balancing your data loss tolerance against the likelihood of power failure.  Both of these variables are
highly variable, and not just with 'toy' DBs.  If you know what you are doing, you can use 'fast but not completely
safe' storage for many things safely.  The chance of loss is NEVER zero; do not assume that 'good' hardware is flawless.

Imagine a common internet case where synchronous_commit=false is fine.  Recovery from backups is a pain (but a daily
snapshot is taken of the important tables, and weekly for easily recoverable other stuff).  If you expect one power-
related failure every 2 years, it might be perfectly reasonable to use 'unsafe' SSDs in order to support high
transaction load, on the risk that that once-every-2-years downtime is 12 hours long instead of 30 minutes, and includes
losing up to a day's information.  Applications like this exist all over the place.
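The tradeoff mentioned here can also be dialed per transaction rather than cluster-wide; a minimal sketch (the table name is hypothetical, the settings are stock PostgreSQL):

```sql
-- Cluster-wide default stays safe (postgresql.conf):
--   synchronous_commit = on
-- For easily re-derivable writes, relax it for one transaction:
BEGIN;
SET LOCAL synchronous_commit = off;  -- a crash may lose this commit
INSERT INTO pageview_log (url, seen_at) VALUES ('/home', now());
COMMIT;
```

Unlike a drive that lies about flushes, synchronous_commit = off risks only losing the last few transactions, never corruption, which is why it is the sanctioned way to trade durability for throughput.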


> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> greg@2ndQuadrant.com   www.2ndQuadrant.us
>

