Thread: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Matt Brock
Date:
Hello.

We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
RAID.

My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6 on
a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems unlikely)?
Should we consider alternative configurations as being more effective for getting better use out of the hardware?

The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use of
this hardware and software?

The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 server with P420i RAID controller, and two
MLC SSDs in RAID 1 for the OS, and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to have used
this specific hardware setup in order to have experience with these general issues. The P420i controller appears to be
compatible with recent versions of CentOS, so drivers should not be a concern (hopefully).

Any insights anyone can offer on these issues would be most welcome.

Regards,

Matt.

Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 9:14 AM, Matt Brock <mb@mattbrock.co.uk> wrote:
> Hello.
>
> We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
> RAID.
>
> My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6
> on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems
> unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the
> hardware?
>
> The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use
> of this hardware and software?
>
> The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 server with P420i RAID controller, and two
> MLC SSDs in RAID 1 for the OS, and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to have used
> this specific hardware setup in order to have experience with these general issues. The P420i controller appears to be
> compatible with recent versions of CentOS, so drivers should not be a concern (hopefully).

The specific drive models have a huge impact on SSD performance.
Indeed, the fact that you are using SLC drives suggests you might be using
antiquated (by SSD standards) hardware.   All the latest action is on
MLC now (see here:
http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html).

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Matt Brock
Date:
After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and write
performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that SLCs
are still a better bet for databases than MLC?

The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs isn't
very wide, so cost isn't really an issue. We just want to use whichever is better for the database.

On 10 May 2013, at 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:

> On Fri, May 10, 2013 at 9:14 AM, Matt Brock <mb@mattbrock.co.uk> wrote:
>> Hello.
>>
>> We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
>> RAID.
>>
>> My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6
>> on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems
>> unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the
>> hardware?
>>
>> The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use
>> of this hardware and software?
>>
>> The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 server with P420i RAID controller, and two
>> MLC SSDs in RAID 1 for the OS, and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to have used
>> this specific hardware setup in order to have experience with these general issues. The P420i controller appears to be
>> compatible with recent versions of CentOS, so drivers should not be a concern (hopefully).
>
> The specific drive models have a huge impact on SSD performance.
> Indeed, the fact that you are using SLC drives suggests you might be using
> antiquated (by SSD standards) hardware.   All the latest action is on
> MLC now (see here:
> http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html).
>
> merlin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/10/2013 9:19 AM, Matt Brock wrote:
> After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and
> write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that
> SLCs are still a better bet for databases than MLC?

I haven't looked at SLC drives in the past few years and don't know
anyone who uses them these days.

>
> The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs
> isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database.
>
>

Could you post some specific drive models, please? HP probably doesn't
make the drives, and it really helps to know what devices you're using
since they are not nearly as generic in behavior and features as
magnetic drives.








Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
"Evan D. Hoffman"
Date:
Not sure of your space requirements, but I'd think a RAID 10 of 8x or more Samsung 840 Pro 256/512 GB would be the best value.  Using a simple mirror won't get you the reliability that you want since heavy writing will burn the drives out over time, and if you're writing the exact same content to both drives, they could likely fail at the same time.  Regardless of the underlying hardware you should still follow best practices for provisioning disks, and raid 10 is the way to go.  I don't know what your budget is though.  Anyway, mirrored SSD will probably work fine, but I'd avoid using just two drives for the reasons above.  I'd suggest at least testing RAID 5 or something else to spread the load around.  Personally, I think the ideal configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.  The Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but they are amazing drives.


On Fri, May 10, 2013 at 11:25 AM, David Boreham <david_list@boreham.org> wrote:
> On 5/10/2013 9:19 AM, Matt Brock wrote:
>> After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that SLCs are still a better bet for databases than MLC?
>
> I've never looked at SLC drives in the past few years and don't know anyone who uses them these days.
>
>> The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database.
>
> Could you post some specific drive models please ? HP probably doesn't make the drives, and it really helps to know what devices you're using since they are not nearly as generic in behavior and features as magnetic drives.









Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 10:19 AM, Matt Brock <mb@mattbrock.co.uk> wrote:
> After googling this for a while, it seems that High Endurance MLC is only starting to rival SLC for endurance and
> write performance in the very latest, cutting-edge hardware. In general, though, it seems it would be fair to say that
> SLCs are still a better bet for databases than MLC?
>
> The number and capacity of drives is small in this instance, and the price difference between the two for HP SSDs
> isn't very wide, so cost isn't really an issue. We just want to use whichever is better for the database.

Well, it's more complicated than that.  While SLC drives were indeed
inherently faster and had longer lifespans, all flash drives basically
have to manage writes carefully in order to
get good performance.   Unfortunately, this means that for database
use the drives must have some type of non-volatile cache and/or
sufficient backup juice in a capacitor to write out pending writes in
the event of sudden loss of power.   Many drives, including (famously)
the so-called Intel X25-E "enterprise" lines, did not do this and
were therefore unsuitable for database use.

As it turns out, the list of flash drives that are suitable for database
use is surprisingly small.  The s3700 I noted upthread seems to be
specifically built with databases in mind and is likely the best
choice for new deployments.  The older Intel 320 is also a good
choice.  I think that's pretty much it until you get into expensive
PCIe-based gear.   There might be some non-Intel drives out there
that are suitable, but be very, very careful and triple-verify that the
drive has an on-board capacitor and has gotten real traction in
enterprise database usage.

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman
<evandhoffman@gmail.com> wrote:
> Not sure of your space requirements, but I'd think a RAID 10 of 8x or more
> Samsung 840 Pro 256/512 GB would be the best value.  Using a simple mirror
> won't get you the reliability that you want since heavy writing will burn
> the drives out over time, and if you're writing the exact same content to
> both drives, they could likely fail at the same time.  Regardless of the
> underlying hardware you should still follow best practices for provisioning
> disks, and raid 10 is the way to go.  I don't know what your budget is
> though.  Anyway, mirrored SSD will probably work fine, but I'd avoid using
> just two drives for the reasons above.  I'd suggest at least testing RAID 5
> or something else to spread the load around.  Personally, I think the ideal
> configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.  The
> Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but they
> are amazing drives.

Samsung 840 has no power loss protection and is therefore useless for
database use IMO unless you don't care about data safety and/or are
implementing redundancy via some other method (say, by synchronous
replication).

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
"Evan D. Hoffman"
Date:
I'd expect to use a RAID controller with either BBU or NVRAM cache to handle that, and that the server itself would be on UPS for a production DB.  That said, a standby replica DB on conventional disk is definitely a good idea in any case.


On Fri, May 10, 2013 at 12:25 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman
> <evandhoffman@gmail.com> wrote:
>> Not sure of your space requirements, but I'd think a RAID 10 of 8x or more
>> Samsung 840 Pro 256/512 GB would be the best value.  Using a simple mirror
>> won't get you the reliability that you want since heavy writing will burn
>> the drives out over time, and if you're writing the exact same content to
>> both drives, they could likely fail at the same time.  Regardless of the
>> underlying hardware you should still follow best practices for provisioning
>> disks, and raid 10 is the way to go.  I don't know what your budget is
>> though.  Anyway, mirrored SSD will probably work fine, but I'd avoid using
>> just two drives for the reasons above.  I'd suggest at least testing RAID 5
>> or something else to spread the load around.  Personally, I think the ideal
>> configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.  The
>> Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but they
>> are amazing drives.
>
> Samsung 840 has no power loss protection and is therefore useless for
> database use IMO unless you don't care about data safety and/or are
> implementing redundancy via some other method (say, by synchronous
> replication).
>
> merlin

Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 11:34 AM, Evan D. Hoffman
<evandhoffman@gmail.com> wrote:
> I'd expect to use a RAID controller with either BBU or NVRAM cache to handle
> that, and that the server itself would be on UPS for a production DB.  That
> said, a standby replica DB on conventional disk is definitely a good idea in
> any case.

Sadly, NVRAM cache doesn't help (unless the RAID controller is
managing drive writes down to the flash level, and no such products
exist that I am aware of).  The problem is that, to provide guarantees, the
RAID controller still needs to be able to tell the device to flush
down to physical storage.  While flash drives can be configured to do
that (basically write-through mode), it's pretty silly to do so as it
will ruin performance and quickly destroy the drive.

Trusting UPS is up to you, but if your UPS dies, someone knocks the
power cable, etc., you have data loss.  With an on-drive capacitor you only
get data loss via physical damage or corruption on the drive.

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/10/2013 10:21 AM, Merlin Moncure wrote:
> As it turns out the list of flash drives are suitable for database use
> is surprisingly small. The s3700 I noted upthread seems to be
> specifically built with databases in mind and is likely the best
> choice for new deployments. The older Intel 320 is also a good choice.
> I think that's pretty much it until you get into expensive pci-e based
> gear.

This may have been a typo: did you mean Intel 710 series rather than 320?

While the 320 has the supercap, it isn't specified for high write endurance.
Definitely usable for a database, and a better choice than most of the
alternatives, but I'd have listed the 710 ahead of the 320.




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 12:03 PM, David Boreham <david_list@boreham.org> wrote:
> On 5/10/2013 10:21 AM, Merlin Moncure wrote:
>>
>> As it turns out the list of flash drives are suitable for database use is
>> surprisingly small. The s3700 I noted upthread seems to be specifically
>> built with databases in mind and is likely the best choice for new
>> deployments. The older Intel 320 is also a good choice. I think that's
>> pretty much it until you get into expensive pci-e based gear.
>
>
> This may have been a typo : did you mean Intel 710 series rather than 320 ?
>
> While the 320 has the supercap, it isn't specified for high write endurance.
> Definitely usable for a database, and a better choice than most of the
> alternatives, but I'd have listed the 710 ahead of the 320.

It wasn't a typo.  The 320 though is perfectly fine although it will
wear out faster -- so it fills a niche for low write intensity
applications.  I find the s3700 to be superior to the 710 in just
about every way (although you're right -- it is suitable for database
use).

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Steve Clark
Date:
On 05/10/2013 12:46 PM, Merlin Moncure wrote:
> On Fri, May 10, 2013 at 11:34 AM, Evan D. Hoffman
> <evandhoffman@gmail.com> wrote:
>> I'd expect to use a RAID controller with either BBU or NVRAM cache to handle
>> that, and that the server itself would be on UPS for a production DB.  That
>> said, a standby replica DB on conventional disk is definitely a good idea in
>> any case.
> Sadly, NVRAM cache doesn't help (unless the RAID controller is
> managing drive writes down to the flash level, and no such products
> exist that I am aware of).  The problem is that, to provide guarantees, the
> RAID controller still needs to be able to tell the device to flush
> down to physical storage.  While flash drives can be configured to do
> that (basically write-through mode), it's pretty silly to do so as it
> will ruin performance and quickly destroy the drive.
>
> Trusting UPS is up to you, but if your UPS dies, someone knocks the
> power cable, etc., you have data loss.  With an on-drive capacitor you only
> get data loss via physical damage or corruption on the drive.
>
> merlin
>
Well, we have dual redundant power supplies on separate UPSes. Could something go wrong? Yes, but a tornado could
also come along and destroy the building.


--
Stephen Clark
*NetWolves*
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com

Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Lonni J Friedman
Date:
On Fri, May 10, 2013 at 10:20 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, May 10, 2013 at 12:03 PM, David Boreham <david_list@boreham.org> wrote:
>> On 5/10/2013 10:21 AM, Merlin Moncure wrote:
>>>
>>> As it turns out the list of flash drives are suitable for database use is
>>> surprisingly small. The s3700 I noted upthread seems to be specifically
>>> built with databases in mind and is likely the best choice for new
>>> deployments. The older Intel 320 is also a good choice. I think that's
>>> pretty much it until you get into expensive pci-e based gear.
>>
>>
>> This may have been a typo : did you mean Intel 710 series rather than 320 ?
>>
>> While the 320 has the supercap, it isn't specified for high write endurance.
>> Definitely usable for a database, and a better choice than most of the
>> alternatives, but I'd have listed the 710 ahead of the 320.
>
> It wasn't a typo.  The 320 though is perfectly fine although it will
> wear out faster -- so it fills a niche for low write intensity
> applications.  I find the s3700 to be superior to the 710 in just
> about every way (although you're right -- it is suitable for database
> use).

There's also the 520 series, which has better performance than the 320
series (which is EOL now).


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/10/2013 11:20 AM, Merlin Moncure wrote:
> I find the s3700 to be superior to the 710 in just about every way
> (although you're right -- it is suitable for database use). merlin

The s3700 series replaces the 710 so it should be superior :)




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/10/2013 11:23 AM, Lonni J Friedman wrote:
> There's also the 520 series, which has better performance than the 320
> series (which is EOL now).

I wouldn't use the 520 series for production database storage -- it has
the Sandforce controller and apparently no power failure protection.




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Alvaro Herrera
Date:
Steve Clark escribió:

> Well we have dual redundant power supplies on separate UPS so could something go wrong yes, but a tornado could
> come along and destroy the building also.

.. hence your standby server across the country?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Steven Schlansker
Date:
On May 10, 2013, at 7:14 AM, Matt Brock <mb@mattbrock.co.uk> wrote:

> Hello.
>
> We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
> RAID.
>
> My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6
> on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems
> unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the
> hardware?
>
> The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use
> of this hardware and software?
>

A couple of things I noticed with a similar-ish setup:

* Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism.  In particular, there is a "rotational"
tunable that often does not get set right.  You might end up tweaking read ahead and friends as well.
http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112

* The default Postgres configuration is awful for an SSD-backed database.  You really need to futz with checkpoints to
get acceptable throughput.
The "PostgreSQL 9.0 High Performance" book is fantastic and is what I used to great success.

* The default Linux virtual memory configuration is awful for this configuration.  Briefly, it will accept a ton of
incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk.  We had
multi-second delays all the way through to the application because of this.  We had to change the zone_reclaim_mode and
the dirty buffer limits.
http://www.postgresql.org/message-id/500616CB.3070408@2ndQuadrant.com



I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from tuning
a 4xMLC SSD RAID-10.  I haven't proven that this is optimal, but it was way better than the defaults.  We ended up with
the following list of changes:

* Change IO scheduler to "noop"
* Mount DB volume with nobarrier, noatime
* Turn blockdev readahead to 16MiB
* Turn sdb's "rotational" tuneable to 0
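
Before changing the scheduler or the "rotational" flag, it can help to see what the kernel currently reports. The following is only a minimal sketch, assuming the usual Linux sysfs layout (/sys/block/<device>/queue/scheduler and .../rotational); the function names and the default device "sdb" are illustrative, not part of the setup described above.

```python
from pathlib import Path


def active_scheduler(scheduler_text: str) -> str:
    """Return the scheduler marked with [brackets] in a queue/scheduler file."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some devices expose a single value without brackets (e.g. "none").
    return scheduler_text.strip()


def is_rotational(rotational_text: str) -> bool:
    """The queue/rotational file holds "1" for spinning disks and "0" otherwise."""
    return rotational_text.strip() == "1"


def describe_device(device: str = "sdb", sysfs_root: str = "/sys/block") -> dict:
    """Read the current scheduler and rotational flag for one block device."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        "scheduler": active_scheduler((queue / "scheduler").read_text()),
        "rotational": is_rotational((queue / "rotational").read_text()),
    }
```

On a box where the RAID volume shows up as sdb, describe_device("sdb") returning {"scheduler": "noop", "rotational": False} would match the first and last items in the list above. Which device name the controller exposes is something to check on your own system.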

PostgreSQL configuration changes:
synchronous_commit = off
effective_io_concurrency = 4
checkpoint_segments = 1024
checkpoint_timeout = 10min
checkpoint_warning = 8min
shared_buffers = 32GB
temp_buffers = 128MB
work_mem = 512MB
maintenance_work_mem = 1GB

Linux sysctls:
vm.swappiness = 0
vm.zone_reclaim_mode = 0
vm.dirty_bytes = 134217728
vm.dirty_background_bytes = 1048576

Hope that helps,
Steven
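
For readers who want to sanity-check the raw numbers in the list above, here is a small sketch that converts them into more familiar units. It assumes that blockdev readahead is counted in 512-byte sectors and that the WAL segment size is the default 16 MB; the helper names are made up for illustration.

```python
MIB = 1024 * 1024
SECTOR_BYTES = 512       # blockdev --setra counts 512-byte sectors
WAL_SEGMENT_MIB = 16     # assumption: default PostgreSQL WAL segment size


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count (as used by vm.dirty_bytes) to MiB."""
    return n_bytes / MIB


def readahead_sectors(readahead_mib: int) -> int:
    """Sectors to pass to blockdev --setra for a readahead given in MiB."""
    return readahead_mib * MIB // SECTOR_BYTES


def wal_between_checkpoints_gib(checkpoint_segments: int) -> float:
    """Rough WAL volume that triggers a checkpoint, in GiB."""
    return checkpoint_segments * WAL_SEGMENT_MIB / 1024


print(bytes_to_mib(134217728))            # vm.dirty_bytes            -> 128.0
print(bytes_to_mib(1048576))              # vm.dirty_background_bytes -> 1.0
print(readahead_sectors(16))              # 16 MiB readahead          -> 32768
print(wal_between_checkpoints_gib(1024))  # checkpoint_segments=1024  -> 16.0
```

So the kernel starts background writeback after only 1 MiB of dirty data and blocks writers at 128 MiB, which is what keeps the fsync() stalls short, while checkpoint_segments = 1024 allows roughly 16 GiB of WAL between checkpoints unless the 10 minute timeout fires first.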



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Lonni J Friedman
Date:
On Fri, May 10, 2013 at 11:23 AM, Steven Schlansker <steven@likeness.com> wrote:
>
> On May 10, 2013, at 7:14 AM, Matt Brock <mb@mattbrock.co.uk> wrote:
>
>> Hello.
>>
>> We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
>> RAID.
>>
>> My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6
>> on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems
>> unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the
>> hardware?
>>
>> The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use
>> of this hardware and software?
>>
>
> A couple of things I noticed with a similar-ish setup:
>
> * Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism.  In particular, there is a
> "rotational" tunable that often does not get set right.  You might end up tweaking read ahead and friends as well.
> http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112
>
> * The default Postgres configuration is awful for a SSD backed database.  You really need to futz with checkpoints to
> get acceptable throughput.
> The "PostgreSQL 9.0 High Performance" book is fantastic and is what I used to great success.
>
> * The default Linux virtual memory configuration is awful for this configuration.  Briefly, it will accept a ton of
> incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk.  We had
> multi-second delays all the way through to the application because of this.  We had to change the zone_reclaim_mode and
> the dirty buffer limits.
> http://www.postgresql.org/message-id/500616CB.3070408@2ndQuadrant.com
>
>
>
> I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from
> tuning a 4xMLC SSD RAID-10.  I haven't proven that this is optimal, but it was way better than the defaults.  We ended
> up with the following list of changes:
>
> * Change IO scheduler to "noop"
> * Mount DB volume with nobarrier, noatime
> * Turn blockdev readahead to 16MiB
> * Turn sdb's "rotational" tuneable to 0
>
> PostgreSQL configuration changes:
> synchronous_commit = off
> effective_io_concurrency = 4
> checkpoint_segments = 1024
> checkpoint_timeout = 10min
> checkpoint_warning = 8min
> shared_buffers = 32gb
> temp_buffers = 128mb
> work_mem = 512mb
> maintenance_work_mem = 1gb
>
> Linux sysctls:
> vm.swappiness = 0
> vm.zone_reclaim_mode = 0
> vm.dirty_bytes = 134217728
> vm.dirty_background_bytes = 1048576

Can you provide more details about your setup, including:
* What kind of filesystem are you using?
* Linux distro and/or kernel version
* hardware (RAM, CPU cores etc)
* database usage patterns (% writes, growth, etc)

thanks


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Fri, May 10, 2013 at 1:23 PM, Steven Schlansker <steven@likeness.com> wrote:
>
> On May 10, 2013, at 7:14 AM, Matt Brock <mb@mattbrock.co.uk> wrote:
>
>> Hello.
>>
>> We're intending to deploy PostgreSQL on Linux with SSD drives which would be in a RAID 1 configuration with Hardware
>> RAID.
>>
>> My first question is essentially: are there any issues we need to be aware of when running PostgreSQL 9 on CentOS 6
>> on a server with SSD drives in a Hardware RAID 1 configuration? Will there be any compatibility problems (seems
>> unlikely)? Should we consider alternative configurations as being more effective for getting better use out of the
>> hardware?
>>
>> The second question is: are there any SSD-specific issues to be aware of when tuning PostgreSQL to make the best use
>> of this hardware and software?
>>
>
> A couple of things I noticed with a similar-ish setup:
>
> * Some forms of RAID / LVM break the kernel's automatic disk tuning mechanism.  In particular, there is a
> "rotational" tunable that often does not get set right.  You might end up tweaking read ahead and friends as well.
> http://www.mjmwired.net/kernel/Documentation/block/queue-sysfs.txt#112
>
> * The default Postgres configuration is awful for a SSD backed database.  You really need to futz with checkpoints to
> get acceptable throughput.
> The "PostgreSQL 9.0 High Performance" book is fantastic and is what I used to great success.
>
> * The default Linux virtual memory configuration is awful for this configuration.  Briefly, it will accept a ton of
> incoming data, and then go through an awful stall as soon as it calls fsync() to write all that data to disk.  We had
> multi-second delays all the way through to the application because of this.  We had to change the zone_reclaim_mode and
> the dirty buffer limits.
> http://www.postgresql.org/message-id/500616CB.3070408@2ndQuadrant.com
>
>
>
> I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from
> tuning a 4xMLC SSD RAID-10.  I haven't proven that this is optimal, but it was way better than the defaults.  We ended
> up with the following list of changes:
>
> * Change IO scheduler to "noop"
> * Mount DB volume with nobarrier, noatime
> * Turn blockdev readahead to 16MiB
> * Turn sdb's "rotational" tuneable to 0
>
> PostgreSQL configuration changes:
> synchronous_commit = off
> effective_io_concurrency = 4
> checkpoint_segments = 1024
> checkpoint_timeout = 10min
> checkpoint_warning = 8min
> shared_buffers = 32gb
> temp_buffers = 128mb
> work_mem = 512mb
> maintenance_work_mem = 1gb
>
> Linux sysctls:
> vm.swappiness = 0
> vm.zone_reclaim_mode = 0
> vm.dirty_bytes = 134217728
> vm.dirty_background_bytes = 1048576

That's good info, but it should be noted that synchronous_commit = off
trades a risk of some data loss (but not nearly as much risk as
volatile storage) for a big increase in commit performance.

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Steven Schlansker
Date:
On May 10, 2013, at 11:38 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>> PostgreSQL configuration changes:
>> synchronous_commit = off
>>
>
> that's good info, but it should be noted that synchronous_commit
> trades a risk of some data loss (but not nearly as much risk as
> volatile storage) for a big increase in commit performance.

Yes, that is a choice we consciously made.  If our DB server crashes, losing the last few ms worth of transactions is an
acceptable loss to us.  But that may not be OK for everyone :-)
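
To put a rough number on the size of that loss window: the PostgreSQL documentation describes the risk window for synchronous_commit = off as at most three times wal_writer_delay. The sketch below just does that arithmetic; the 200 ms default for wal_writer_delay is assumed here and was not stated in the thread, and the function name is illustrative.

```python
def max_loss_window_ms(wal_writer_delay_ms: int = 200) -> int:
    """Upper bound, in ms, on recently committed work that a crash can lose
    with synchronous_commit = off (three times wal_writer_delay)."""
    return 3 * wal_writer_delay_ms


print(max_loss_window_ms())    # assumed default wal_writer_delay -> 600
print(max_loss_window_ms(10))  # a more aggressive WAL writer     -> 30
```

In other words, the worst case with default settings is closer to a few hundred milliseconds than a few, so it is worth checking wal_writer_delay on your own system before deciding the trade-off is acceptable.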




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Steven Schlansker
Date:
On May 10, 2013, at 11:35 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>>
>> I am not sure that these numbers will end up being anywhere near what works for you, but these are my notes from
>> tuning a 4xMLC SSD RAID-10.  I haven't proven that this is optimal, but it was way better than the defaults.  We ended
>> up with the following list of changes:
>>
>> * Change IO scheduler to "noop"
>> * Mount DB volume with nobarrier, noatime
>> * Turn blockdev readahead to 16MiB
>> * Turn sdb's "rotational" tuneable to 0
>>
>> PostgreSQL configuration changes:
>> synchronous_commit = off
>> effective_io_concurrency = 4
>> checkpoint_segments = 1024
>> checkpoint_timeout = 10min
>> checkpoint_warning = 8min
>> shared_buffers = 32gb
>> temp_buffers = 128mb
>> work_mem = 512mb
>> maintenance_work_mem = 1gb
>>
>> Linux sysctls:
>> vm.swappiness = 0
>> vm.zone_reclaim_mode = 0
>> vm.dirty_bytes = 134217728
>> vm.dirty_background_bytes = 1048576
>
> Can you provide more details about your setup, including:
> * What kind of filesystem are you using?
> * Linux distro and/or kernel version
> * hardware (RAM, CPU cores etc)
> * database usage patterns (% writes, growth, etc)

Yes, as long as you promise not to just use my configuration without doing proper testing on your own system, even if
it seems similar!

Linux version 2.6.32.225 (gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC) ) #2 SMP Thu Mar 29 16:43:20 EDT 2012
DMI: Supermicro X8DTN/X8DTN, BIOS 2.1c       10/28/2011
CPU0: Intel(R) Xeon(R) CPU           X5670  @ 2.93GHz stepping 02
Total of 24 processors activated (140796.98 BogoMIPS).        (2 socket x 2 hyperthread x 6 cores)
96GB ECC RAM

Filesystem is ext4 on LVM on hardware RAID 1+0 Adaptec 5405

Database is very much read heavy, but there is a base load of writes and bursts of much larger writes.  I don't have
specifics regarding how it breaks down.  The database is about 400GB and is growing moderately, maybe a few GB/day.
More of the write traffic is re-writes than new writes.

Hope that helps,
Steven



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Steve Clark
Date:
On 05/10/2013 12:46 PM, Merlin Moncure wrote:
> On Fri, May 10, 2013 at 11:34 AM, Evan D. Hoffman
> <evandhoffman@gmail.com> wrote:
>> I'd expect to use a RAID controller with either BBU or NVRAM cache to handle
>> that, and that the server itself would be on UPS for a production DB.  That
>> said, a standby replica DB on conventional disk is definitely a good idea in
>> any case.
>
> Sadly, NVRAM cache doesn't help (unless the raid controller is
> managing drive writes down to the flash level and no such products
> exist that I am aware of).  The problem is that to provide guarantees the
> raid controller still needs to be able to tell the device to flush
> down to physical storage.  While flash drives can be configured to do
> that (basically write-through mode), it's pretty silly to do so as it
> will ruin performance and quickly destroy the drive.
>
> Trusting UPS is up to you, but if your UPS dies, someone knocks the
> power cable, etc you have data loss.  With on-drive capacitor you only
> get data loss via physical damage or corruption on the drive.
>
> merlin

Well, we have dual redundant power supplies on separate UPSes. Could something go wrong? Yes, but a tornado could
come along and destroy the building also.


--
Stephen Clark
NetWolves
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com

Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Matt Brock
Date:
On 10 May 2013, at 16:25, David Boreham <david_list@boreham.org> wrote:

> I've never looked at SLC drives in the past few years and don't know anyone who uses them these days.

Because SLCs are still more expensive? Because MLCs are now almost as good as SLCs for performance/endurance?

I should point out that this database will be the backend for a high-transaction gaming site with very heavy database
usage including a lot of writes. Disk IO on the database server has always been our bottleneck so far.

Also, the database is kept comparatively very small - about 25 GB currently, and it will grow to perhaps 50 GB this
year as a result of new content and traffic coming in.

So whilst MLCs might be almost as good as SLCs now, the price difference for us is so insignificant that if we can
still get a small improvement with SLCs then we might as well do so.

> Could you post some specific drive models please ? HP probably doesn't make the drives, and it really helps to know
> what devices you're using since they are not nearly as generic in behavior and features as magnetic drives.

I've asked our HP dealer for this information since unfortunately it doesn't appear to be available on the HP website -
hopefully it will be forthcoming at some point.

Matt.




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
"Joshua D. Drake"
Date:
On 05/10/2013 11:38 AM, Merlin Moncure wrote:

>> PostgreSQL configuration changes:
>> synchronous_commit = off
>> effective_io_concurrency = 4
>> checkpoint_segments = 1024
>> checkpoint_timeout = 10min
>> checkpoint_warning = 8min
>> shared_buffers = 32gb
>> temp_buffers = 128mb
>> work_mem = 512mb
>> maintenance_work_mem = 1gb
>
> that's good info, but it should be noted that synchronous_commit
> trades a risk of some data loss (but not nearly as much risk as
> volatile storage) for a big increase in commit performance.

Yeah but it is an extremely low risk, and probably lower than say... a
bad Apache form submission. Generally the database is the most reliable
hardware in the cluster. It is also not a risk for corruption which a
lot of people confuse it for.

One thing I would note is that work_mem is very high, that might be
alright with an SSD environment because if we go out to tape sort, it is
still going to be fast but it is something to consider.

Another thing is, why such a low checkpoint_timeout? Set it to 60 minutes
and be done with it. The bgwriter should be dealing with these problems.
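
To put a rough number on the work_mem concern: work_mem is a per-operation limit, so several sorts or hashes running at once can each use up to that amount. A back-of-envelope sketch, using the 96GB RAM and 32gb shared_buffers figures given earlier in the thread (the helper name is made up for illustration, and this ignores the OS page cache and everything else that wants memory):

```shell
#!/bin/sh
# Hypothetical helper: how many concurrent work_mem-sized sorts/hashes fit in
# the RAM left over after shared_buffers.
# Arguments: RAM (GB), shared_buffers (GB), work_mem (MB).
max_full_size_sorts() {
    echo $(( ($1 - $2) * 1024 / $3 ))
}

# 96 GB RAM, shared_buffers = 32gb, work_mem = 512mb
max_full_size_sorts 96 32 512    # prints 128
```

So it takes on the order of a hundred simultaneous full-size sorts to exhaust that box, which is why it is "something to consider" rather than an outright problem.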

Sincerely,

JD


>
> merlin
>
>



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/11/2013 3:10 AM, Matt Brock wrote:
> On 10 May 2013, at 16:25, David Boreham <david_list@boreham.org> wrote:
>
>> I've never looked at SLC drives in the past few years and don't know anyone who uses them these days.
> Because SLCs are still more expensive? Because MLCs are now almost as good as SLCs for performance/endurance?

Not quite. More like : a) I don't know where to buy SLC drives in 2013
(all the drives for example for sale on newegg.com are MLC) and b)
today's MLC drives are quite good enough for me (and I'd venture to say
any database-related purpose).

>
> I should point out that this database will be the backend for a high-transaction gaming site with very heavy database
> usage including a lot of writes. Disk IO on the database server has always been our bottleneck so far.

Sure, same here. I wouldn't be replying if all I did was run an SSD
drive in my laptop ;)

>
> Also, the database is kept comparatively very small - about 25 GB currently, and it will grow to perhaps 50 GB this
> year as a result of new content and traffic coming in.

Our clustering partitions the user population between servers such that
each server has a database about the same size as yours. We tend to use
200 or 300G drives on each box, allowing plenty space for the DB, a copy
of another server's DB, and log files.

>
> I've asked our HP dealer for this information since unfortunately it doesn't appear to be available on the HP website
> - hopefully it will be forthcoming at some point.
>

This is a bit of a red flag for me. During the qualification process for
our SSD drives we: read the technical papers from Intel; ran lab tests
where we saturated a drive with writes for weeks, checking the write
endurance SMART data and operation latency; modified smartmontools so it
could read all the useful drive counters, and also reset the wear
estimation counters; performed power cable pull tests; read everything
posted on this list by people who had done serious testing in addition
to the tests we ran in house. I'm not sure I'd want to deploy "Joe
Random SSD du jour" that HP decided to ship me. You might consider
buying boxes sans drives and fitting your own, of a known trusted type.




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
John R Pierce
Date:
On 5/12/2013 6:13 PM, David Boreham wrote:
>
> Not quite. More like : a) I don't know where to buy SLC drives in 2013
> (all the drives for example for sale on newegg.com are MLC) and b)
> today's MLC drives are quite good enough for me (and I'd venture to
> say any database-related purpose).

Newegg wouldn't know 'enterprise' if it bit them.   they just sell mass
market consumer stuff and gamer kit.

the real SLC drives end up OEM branded in large SAN systems, such as
sold by Netapp, EMC, and are made by companies like STEC that have zero
presence in the 'whitebox' resale markets like Newegg.




--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
btw we deploy on CentOS6. The only things we change from the default are:

1. add "relatime,discard" options to the mount (check whether the most
recent CentOS6 does this itself -- it didn't back when we first deployed
on 6.0).
2. Disable swap. This isn't strictly an SSD tweak, since we have enough
physical memory to not need to swap, but it is a useful measure for us
since the default install always creates a swap partition which a) uses
valuable space on the smaller-sized SSDs, and b) if there are ever
writes to the swap partition it would be bad for wear on the entire drive.

We also setup monitoring of the drives' smart wear counter to ensure
early warning of any drive coming close to wear out.
We do not use (and don't like) RAID with SSDs.
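
A rough sketch of the above, for illustration only. The device name, mount point, filesystem type and fstab line are assumptions, and the SMART attribute that reports wear differs between vendors: Media_Wearout_Indicator is what Intel drives report and Wear_Leveling_Count is what Samsung drives report, so check the smartctl -A output for your own drive first.

```shell
#!/bin/sh
# 1. Example /etc/fstab entry with the mount options (device, mount point and
#    filesystem type are placeholders):
#    /dev/sdb1  /var/lib/pgsql  ext4  relatime,discard  0 2

# 2. Turn swap off now; also remove the swap line from /etc/fstab so it
#    stays off after a reboot. Call this by hand as root.
disable_swap() {
    swapoff -a
}

# 3. Read the normalized wear value from "smartctl -A" output on stdin.
#    Normalized values start near 100 and count down as the drive wears.
wear_value() {
    awk '$2 == "Media_Wearout_Indicator" || $2 == "Wear_Leveling_Count" { print $4 }'
}

# Example use (as root): smartctl -A /dev/sdb | wear_value
```

The printed value can then be fed to whatever monitoring system is in use, with an alert threshold chosen well before the drive's rated wear-out point.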






Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/12/2013 7:20 PM, John R Pierce wrote:

>
> the real SLC drives end up OEM branded in large SAN systems, such as
> sold by Netapp, EMC, and are made by companies like STEC that have
> zero presence in the 'whitebox' resale markets like Newegg.
>

Agreed. I don't go near the likes of Simple, HGST, F-IO, SMART, et al.
For me this is SAS and SCSI re-born -- an excuse to charge very high
prices for a product not significantly different from a much cheaper
mainstream alternative, by exploiting unsophisticated purchasers with
tales of enterprise snake oil.

ymmv, but since I am spending my own $$, I'll stick to product I can
order from the likes of Newegg and Amazon.




Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
John R Pierce
Date:
On 5/12/2013 6:41 PM, David Boreham wrote:
> Agreed. I don't go near the likes of Simple, HGST, F-IO, SMART, et al.
> For me this is SAS and SCSI re-born -- an excuse to charge very high
> prices for a product not significantly different from a much cheaper
> mainstream alternative, by exploiting unsophisticated purchasers with
> tales of enterprise snake oil.
>
> ymmv, but since I am spending my own $$, I'll stick to product I can
> order from the likes of Newegg and Amazon.

except, those high end enterprise products ARE write-safe while most
SATA SSD's aren't.   The enterprise SSD's like STEC are engineered for
much higher write cycle life times.   the whole storage infrastructure
has several more 9's on its reliability.   those extra 9's don't come
cheap.   and if [Bigname Vendor] sells you a system with storage
controller X and drives Y, you can generally assume they've been tested
together.    they are also selling their service and support, for better
or worse.

SAS has better IO concurrency than SATA, and SAS drives are dual path,
which means you can have redundant storage channels and cabling, even if
you're doing JBOD..

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Sun, May 12, 2013 at 8:20 PM, John R Pierce <pierce@hogranch.com> wrote:
> On 5/12/2013 6:13 PM, David Boreham wrote:
>>
>>
>> Not quite. More like : a) I don't know where to buy SLC drives in 2013
>> (all the drives for example for sale on newegg.com are MLC) and b) today's
>> MLC drives are quite good enough for me (and I'd venture to say any
>> database-related purpose).
>
>
> Newegg wouldn't know 'enterprise' if it bit them.   they just sell mass
> market consumer stuff and gamer kit.
>
> the real SLC drives end up OEM branded in large SAN systems, such as sold by
> Netapp, EMC, and are made by companies like STEC that have zero presence in
> the 'whitebox' resale markets like Newegg.


The industry decided a while back that MLC was basically the way to go
in terms of cost and engineering trade-offs, at least in cases where
you needed a lot of storage. Yes, you can still get SLC in mid-tier
and up storage but:

*) a lot of these drives are simply re-branded intel etc
*) When it comes to SSD, I have zero confidence in vendor provided
hardware specs (lifetime, iops, etc).  The lack of 3rd party test
coverage and performance benchmarking is a big problem for me.  Ever
bought a SAN and have had it not do what it was supposed to?
*) The faster moving white box market has chosen MLC.  Three years
back, the jury was still out.  This suggests to me that SAN vendors
are still behind the curve in terms of SSD, which is typical of
enterprise storage vendors. But,
*) In many cases, the performance of the latest MLC drives is so fast
that many applications that would have needed to scale up to high end
storage would no longer need to do so.   A software raid of say four
s3700 drives would probably outperform most <100k SANs from a couple
years back.

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Matt Brock
Date:
So a week after asking our HP dealer, they've finally replied to say that they can't tell us what manufacturer and
model the SSDs are because "HP treat this information as company confidential". Not particularly helpful.

They have at least confirmed that the drives have "surprise power loss protection" and "tools to present information on
the percentage of life used and amount of life remaining under the workload-to-date".

Given that these are enterprise class drives, and given that they have the high availability features that we would
need in database servers, and given that the deadline on this project is very tight so I don't really have time to do
any testing on third-party drives, I'm guessing we'll go with the HP drives, even though they most likely are a little
behind the times. Whilst we will perhaps lose a little bit of performance compared to the latest Intel drives, we
will gain in terms of high availability reassurance and simplicity of deployment which is crucial for this project given
its tight deadline. However, after going through all the advice on this thread and having had time to think, I'll
probably go for a four-disk RAID 10 array with SLCs, rather than a two-disk RAID 1 array with MLCs (for the OS) and a
two-disk RAID 1 array with SLCs (for the database).

If I had more time and resources for testing I would likely end up going a different route, however.

Many thanks to all who've contributed their thoughts and opinions - much appreciated.

Matt.

On 13 May 2013, at 14:49, Merlin Moncure <mmoncure@gmail.com> wrote:

> On Sun, May 12, 2013 at 8:20 PM, John R Pierce <pierce@hogranch.com> wrote:
>> On 5/12/2013 6:13 PM, David Boreham wrote:
>>>
>>>
>>> Not quite. More like : a) I don't know where to buy SLC drives in 2013
>>> (all the drives for example for sale on newegg.com are MLC) and b) today's
>>> MLC drives are quite good enough for me (and I'd venture to say any
>>> database-related purpose).
>>
>>
>> Newegg wouldn't know 'enterprise' if it bit them.   they just sell mass
>> market consumer stuff and gamer kit.
>>
>> the real SLC drives end up OEM branded in large SAN systems, such as sold by
>> Netapp, EMC, and are made by companies like STEC that have zero presence in
>> the 'whitebox' resale markets like Newegg.
>
>
> The industry decided a while back that MLC was basically the way to go
> in terms of cost and engineering trade-offs, at least in cases where
> you needed a lot of storage. Yes, you can still get SLC in mid-tier
> and up storage but:
>
> *) a lot of these drives are simply re-branded intel etc
> *) When it comes to SSD, I have zero confidence in vendor provided
> hardware specs (lifetime, iops, etc).  The lack of 3rd party test
> coverage and performance benchmarking is a big problem for me.  Ever
> bought a SAN and have had it not do what it was supposed to?
> *) The faster moving white box market has chosen MLC.  Three years
> back, the jury was still out.  This suggests to me that SAN vendors
> are still behind the curve in terms of SSD, which is typical of
> enterprise storage vendors. But,
> *) In many cases, the performance of the latest MLC drives is so fast
> that many applications that would have needed to scale up to high end
> storage would no longer need to do so.   A software raid of say four
> s3700 drives would probably outperform most <100k SANs from a couple
> years back.
>
> merlin
>



Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Toby Corkindale
Date:
On 11/05/13 02:25, Merlin Moncure wrote:
> On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman
> <evandhoffman@gmail.com> wrote:
>> Not sure of your space requirements, but I'd think a RAID 10 of 8x or more
>> Samsung 840 Pro 256/512 GB would be the best value.  Using a simple mirror
>> won't get you the reliability that you want since heavy writing will burn
>> the drives out over time, and if you're writing the exact same content to
>> both drives, they could likely fail at the same time.  Regardless of the
>> underlying hardware you should still follow best practices for provisioning
>> disks, and raid 10 is the way to go.  I don't know what your budget is
>> though.  Anyway, mirrored SSD will probably work fine, but I'd avoid using
>> just two drives for the reasons above.  I'd suggest at least testing RAID 5
>> or something else to spread the load around.  Personally, I think the ideal
>> configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.  The
>> Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but they
>> are amazing drives.
>
> Samsung 840 has no power loss protection and is therefore useless for
> database use IMO unless you don't care about data safety and/or are
> implementing redundancy via some other method (say, by synchronous
> replication).


I believe the original poster was referring to the "840 Pro" model; that
model does include a "supercap" for power loss protection.

-Toby


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Toby Corkindale
Date:
On 13/05/13 11:23, David Boreham wrote:
> btw we deploy on CentOS6. The only things we change from the default are:
>
> 1. add "relatime,discard" options to the mount (check whether the most
> recent CentOS6 does this itself -- it didn't back when we first deployed
> on 6.0).


While it is important to let the SSD know about space that can be
reclaimed, I gather the operation does not perform well.
I *think* current advice is to leave 'discard' off the mount options,
and instead run a nightly cron job to call 'fstrim' on the mount point
instead. (In really high write situations, you'd be looking at calling
that every hour instead I suppose)

I have to admit to have just gone with the advice, rather than
benchmarking it thoroughly.
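
For what it's worth, the nightly job could be as small as the script below, dropped into /etc/cron.daily. The file name, function name and mount point are made up for the example, and fstrim only helps when the kernel, filesystem and controller all pass TRIM through to the drive.

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/fstrim-pgdata: trim free space once a day
# instead of mounting with 'discard'.

trim_if_mounted() {
    # Only trim when the path really is a mount point, so a missing volume
    # does not turn into a trim of the parent filesystem.
    if mountpoint -q "$1"; then
        fstrim -v "$1"
    else
        echo "skipping $1: not a mount point" >&2
        return 1
    fi
}

# Assumption: the DB volume is mounted at /var/lib/pgsql unless a path is given.
MOUNTPOINT=${1:-/var/lib/pgsql}
trim_if_mounted "$MOUNTPOINT" || true
```

For the hourly variant, the same script could go into /etc/cron.hourly instead.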

tjc


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
David Boreham
Date:
On 5/19/2013 7:19 PM, Toby Corkindale wrote:
> On 13/05/13 11:23, David Boreham wrote:
>> btw we deploy on CentOS6. The only things we change from the default
>> are:
>>
>> 1. add "relatime,discard" options to the mount (check whether the most
>> recent CentOS6 does this itself -- it didn't back when we first deployed
>> on 6.0).
>
>
> While it is important to let the SSD know about space that can be
> reclaimed, I gather the operation does not perform well.
> I *think* current advice is to leave 'discard' off the mount options,
> and instead run a nightly cron job to call 'fstrim' on the mount point
> instead. (In really high write situations, you'd be looking at calling
> that every hour instead I suppose)
>
> I have to admit to have just gone with the advice, rather than
> benchmarking it thoroughly.


The guy who blogged about this a couple of years ago was using a
Sandforce controller drive.
I'm not sure there is a similar issue with other drives. Certainly we've
never noticed a problematic delay in file deletes.
That said, our applications don't delete files too often (log file
purging is probably the only place it happens regularly).

Personally, in the absence of a clear and present issue, I'd prefer to
go the "kernel guys and drive firmware guys will take care of this"
route, and just enable discard on the mount.







Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Merlin Moncure
Date:
On Sun, May 19, 2013 at 8:07 PM, Toby Corkindale
<toby.corkindale@strategicdata.com.au> wrote:
> On 11/05/13 02:25, Merlin Moncure wrote:
>>
>> On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman
>> <evandhoffman@gmail.com> wrote:
>>>
>>> Not sure of your space requirements, but I'd think a RAID 10 of 8x or
>>> more
>>> Samsung 840 Pro 256/512 GB would be the best value.  Using a simple
>>> mirror
>>> won't get you the reliability that you want since heavy writing will burn
>>> the drives out over time, and if you're writing the exact same content to
>>> both drives, they could likely fail at the same time.  Regardless of the
>>> underlying hardware you should still follow best practices for
>>> provisioning
>>> disks, and raid 10 is the way to go.  I don't know what your budget is
>>> though.  Anyway, mirrored SSD will probably work fine, but I'd avoid
>>> using
>>> just two drives for the reasons above.  I'd suggest at least testing RAID
>>> 5
>>> or something else to spread the load around.  Personally, I think the
>>> ideal
>>> configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.
>>> The
>>> Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but
>>> they
>>> are amazing drives.
>>
>>
>> Samsung 840 has no power loss protection and is therefore useless for
>> database use IMO unless you don't care about data safety and/or are
>> implementing redundancy via some other method (say, by synchronous
>> replication).
>
>
>
> I believe the original poster was referring to the "840 Pro" model; that
> model does include a "supercap" for power loss protection.

got a source for that?  I couldn't verify that after some googling.

merlin


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Toby Corkindale
Date:
On 20/05/13 15:12, David Boreham wrote:
> On 5/19/2013 7:19 PM, Toby Corkindale wrote:
>> On 13/05/13 11:23, David Boreham wrote:
>>> btw we deploy on CentOS6. The only things we change from the default
>>> are:
>>>
>>> 1. add "relatime,discard" options to the mount (check whether the most
>>> recent CentOS6 does this itself -- it didn't back when we first deployed
>>> on 6.0).
>>
>>
>> While it is important to let the SSD know about space that can be
>> reclaimed, I gather the operation does not perform well.
>> I *think* current advice is to leave 'discard' off the mount options,
>> and instead run a nightly cron job to call 'fstrim' on the mount point
>> instead. (In really high write situations, you'd be looking at calling
>> that every hour instead I suppose)
>>
>> I have to admit to have just gone with the advice, rather than
>> benchmarking it thoroughly.
>
>
> The guy who blogged about this a couple of years ago was using a
> Sandforce controller drive.
> I'm not sure there is a similar issue with other drives. Certainly we've
> never noticed a problematic delay in file deletes.
> That said, our applications don't delete files too often (log file
> purging is probably the only place it happens regularly).
>
> Personally, in the absence of a clear and present issue, I'd prefer to
> go the "kernel guys and drive firmware guys will take care of this"
> route, and just enable discard on the mount.

This guy posted about a number of SSD drives, and enabling discard
affected most of them quite negatively:
http://people.redhat.com/lczerner/discard/ext4_discard.html
http://people.redhat.com/lczerner/discard/files/Performance_evaluation_of_Linux_DIscard_support_Dev_Con2011_Brno.pdf

That is from 2011 though, so you're right that things may have improved
by now.. Has anyone seen benchmarks supporting that though?

Toby


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
Toby Corkindale
Date:
On 21/05/13 00:16, Merlin Moncure wrote:
> On Sun, May 19, 2013 at 8:07 PM, Toby Corkindale
> <toby.corkindale@strategicdata.com.au> wrote:
>> On 11/05/13 02:25, Merlin Moncure wrote:
>>>
>>> On Fri, May 10, 2013 at 11:11 AM, Evan D. Hoffman
>>> <evandhoffman@gmail.com> wrote:
>>>>
>>>> Not sure of your space requirements, but I'd think a RAID 10 of 8x or
>>>> more
>>>> Samsung 840 Pro 256/512 GB would be the best value.  Using a simple
>>>> mirror
>>>> won't get you the reliability that you want since heavy writing will burn
>>>> the drives out over time, and if you're writing the exact same content to
>>>> both drives, they could likely fail at the same time.  Regardless of the
>>>> underlying hardware you should still follow best practices for
>>>> provisioning
>>>> disks, and raid 10 is the way to go.  I don't know what your budget is
>>>> though.  Anyway, mirrored SSD will probably work fine, but I'd avoid
>>>> using
>>>> just two drives for the reasons above.  I'd suggest at least testing RAID
>>>> 5
>>>> or something else to spread the load around.  Personally, I think the
>>>> ideal
>>>> configuration would be a RAID 10 of at least 8 disks plus 1 hot spare.
>>>> The
>>>> Samsung 840 Pro 256 GB are frequently $200 on sale at Newegg.  YMMV but
>>>> they
>>>> are amazing drives.
>>>
>>>
>>> Samsung 840 has no power loss protection and is therefore useless for
>>> database use IMO unless you don't care about data safety and/or are
>>> implementing redundancy via some other method (say, by synchronous
>>> replication).
>>
>>
>>
>> I believe the original poster was referring to the "840 Pro" model; that
>> model does include a "supercap" for power loss protection.
>
> got a source for that?  I couldn't verify that after some googling.


I'm sorry, I really thought they had made it onto my list of candidates
that included supercaps.. now I'm checking again, I can't find any
evidence to support that claim either. I must have confused them in my
mind with another drive. Sorry about that, and thanks for checking.

-Toby


Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From
"Holger Hoffstaette"
Date:
On Tue, 21 May 2013 11:40:55 +1000, Toby Corkindale wrote:

>>> While it is important to let the SSD know about space that can be
>>> reclaimed, I gather the operation does not perform well. I *think*
>>> current advice is to leave 'discard' off the mount options, and instead
>>> run a nightly cron job to call 'fstrim' on the mount point instead. (In
>>> really high write situations, you'd be looking at calling that every
>>> hour instead I suppose)

This is still a good idea - see below.

>> The guy who blogged about this a couple of years ago was using a
>> Sandforce controller drive.

Btw that doesn't mean anything (neither in terms of performance nor
stability), since "the controller" also needs to be paired with an - often
vendor-dependent - firmware, which is much more relevant. Since LSI
acquired Sandforce this situation has gotten much better (unified
upstream).

>> I'm not sure there is a similar issue with other drives. Certainly we've

There is (now), because..

>> never noticed a problematic delay in file deletes. That said, our
>> applications don't delete files too often (log file purging is probably
>> the only place it happens regularly).
>>
>> Personally, in the absence of a clear and present issue, I'd prefer to
>> go the "kernel guys and drive firmware guys will take care of this"
>> route, and just enable discard on the mount.

Nope, wrong, because.. (..getting there :)

> That is from 2011 though, so you're right that things may have improved by
> now.. Has anyone seen benchmarks supporting that though?

Unfortunately since 3.8 discards are issued as synchronous commands,
effectively disabling any scheduling/merging etc. The result can be seen
easily:

- mount drive without discard using kernel >= 3.8
- unpack kernel source
- time delete of entire tree

- remount with discard
- unpack kernel tree
- start delete of tree
- ...
- check it hasn't crashed
- ...
- go plant a tree or make babies while waiting for it to finish

Online discard has gotten so slow that it's now a good idea to turn it off
for anything but light write workloads. Metadata-heavy writes are
obviously the worst case.

I experienced this on Samsung, Intel and Sandforce-based drives, so "the
controller" is no longer the primary reason for the performance impact.
Extremely enterprisey drives *might* behave slightly better, but I doubt
it; flash erase cycles are what they are.
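
The comparison above is easy to script if anyone wants to repeat it on their own drive. The sketch below swaps the kernel source tree for a synthetic tree of small files so that it is self-contained; the function name, directory layout and file counts are invented for the example. Run it once on a filesystem mounted without discard and once with discard, and compare the two numbers.

```shell
#!/bin/sh
# Hypothetical helper: build a tree of small files under $1, then print how
# many whole seconds it takes to delete it again.
time_tree_delete() {
    target=$1
    dirs=${2:-100}      # number of directories (arbitrary default)
    files=${3:-100}     # files per directory (arbitrary default)

    d=1
    while [ "$d" -le "$dirs" ]; do
        mkdir -p "$target/dir$d"
        f=1
        while [ "$f" -le "$files" ]; do
            echo data > "$target/dir$d/file$f"
            f=$((f + 1))
        done
        d=$((d + 1))
    done
    sync    # make sure the files reach the disk before timing the delete

    start=$(date +%s)
    rm -rf "$target"
    sync
    end=$(date +%s)
    echo $((end - start))
}

# Example: time_tree_delete /mnt/ssd/discard-test 500 200
```

The resolution is only one second, so the tree needs to be large enough for the delete to take a noticeable amount of time.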

-h