Re: New server: SSD/RAID recommendations? - Mailing list pgsql-performance

From Karl Denninger
Subject Re: New server: SSD/RAID recommendations?
Msg-id 559BC865.9080804@denninger.net
In response to Re: New server: SSD/RAID recommendations?  ("Graeme B. Bell" <graeme.bell@nibio.no>)
Responses Re: New server: SSD/RAID recommendations?  ("Graeme B. Bell" <graeme.bell@nibio.no>)
List pgsql-performance
On 7/7/2015 06:52, Graeme B. Bell wrote:
> Hi Karl,
>
> Great post, thanks.
>
> Though I don't think it's against conventional wisdom to aggregate writes into larger blocks rather than rely on 4k performance on SSDs :-)
>
> 128 KB blocks + compression certainly make sense, though I suppose they'd make less sense if you had some incredibly high rate of churn in your rows.
> But for the work we do here, we could use 16 MB blocks for all the difference it would make. (Tip to others: don't do that. 128 KB block performance is already enough to saturate the I/O bus to most SSDs.)
>
> Do you have your WAL log on a compressed ZFS fs?
>
> Graeme Bell
Yes.

Data goes on one mirrored set of vdevs, pg_xlog goes on a second, separate pool, and archived WAL goes on a third pool on RAIDZ2. The WAL archive typically goes on rotating storage, since I use it (together with a basebackup) for disaster recovery (and, for hot-standby applications, as the source for syncing the standbys), and it is a nearly big-block-write-only data stream. Rotating media is fine for that in most applications. I take a new basebackup at reasonable intervals and rotate the WAL logs to keep the archive from growing without bound.
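For anyone wanting to try a similar layout, a rough sketch follows; the device names, pool names, and paths are examples only, not my exact configuration:

# data on one pool of mirrored vdevs, pg_xlog on a second pool,
# archived WAL on a third pool on RAIDZ2 rotating media
zpool create data mirror da0 da1 mirror da2 da3
zpool create xlog mirror da4 da5
zpool create walarch raidz2 ada0 ada1 ada2 ada3

# lz4 compression; 128K is the ZFS default recordsize, shown explicitly
zfs set compression=lz4 data
zfs set recordsize=128K data
zfs set compression=lz4 walarch

# in postgresql.conf, ship completed WAL segments to the archive pool,
# e.g. (the simplified example from the PostgreSQL docs):
#   archive_mode = on
#   archive_command = 'test ! -f /walarch/%f && cp %p /walarch/%f'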

I use LSI host adapters for the drives themselves (no hardware RAID); I'm currently running on FreeBSD 10.1. Be aware that ZFS on FreeBSD has some fairly nasty issues for which I developed (and publish) a patch; without it, some workloads can result in very undesirable behavior where the working set gets paged out in favor of the ZFS ARC. If that happens, your performance will go straight into the toilet.
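If you can't run the patch, a common stopgap (not a substitute for it) is to cap the ARC so it can't crowd out the working set, via a loader tunable. The size below is an arbitrary example; size it to your RAM and workload:

# /boot/loader.conf -- limit the ZFS ARC to 8 GB (example value)
vfs.zfs.arc_max="8589934592"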

Back before FreeBSD 9, when ZFS was simply not stable enough for me, I used ARECA hardware RAID adapters with BBUs and large cache memory installed, driving rotating media with UFS filesystems. Hardware RAID adapters are, however, a net loss in a ZFS environment even when they nominally work well (and they frequently interact very badly with ZFS during certain operations, making them just flat-out unsuitable). All in, I far prefer ZFS on a host adapter to UFS on a RAID adapter, from both a data-integrity and a performance standpoint.

My SSD drives of choice are all Intel: for lower-end requirements the 730s work very well; the S3500 is next; and if your write volume is high enough, the S3700 has much greater endurance (at a correspondingly higher price). All three are properly power-fail protected, and all three are much, much faster than rotating storage. If you can saturate the SATA channels and need still more I/O throughput, NVMe drives are the next quantum up in performance; I'm not there with our application at present.

Incidentally, while there are people who have questioned the 730 series' power-loss protection, I've tested it with plug-pulls, and in addition the drive watchdogs its internal power-loss capacitors -- from the smartctl -a display of one of them on an in-service machine here:

175 Power_Loss_Cap_Test     0x0033   100   100   010    Pre-fail  Always       -       643 (4 6868)
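To check the same attribute on your own drive (the device name is an example; on FreeBSD, drives behind an LSI HBA typically show up as /dev/daN):

# dump SMART attributes and pick out the power-loss capacitor self-test
smartctl -a /dev/da0 | grep Power_Loss_Cap_Test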


--
Karl Denninger
karl@denninger.net
The Market Ticker
[S/MIME encrypted email preferred]