Thread: Latest advice on SSD?
One of our four "big iron" (spinning disks) servers went belly up today. (Thanks, Postgres and pgbackrest! Easy recovery.) We're planning to move to a cloud service at the end of the year, so bad timing on this. We didn't want to buy any more hardware, but now it looks like we have to.
I followed the discussions about SSD drives when they were first becoming mainstream; at that time, the Intel devices were king. Can anyone recommend what's a good SSD configuration these days? I don't think we want to buy a new server with spinning disks.
We're replacing:
8 core (Intel)
48GB memory
12-drive 7200 RPM 500GB:
RAID1 (2 disks, OS and WAL)
RAID10 (8 disks, postgres data dir)
2 spares
Ubuntu 16.04
Postgres 9.6
The current system peaks at about 7000 TPS from pgbench.
Our system is a mix of non-transactional searching (customers) and transactional data loading (us).
Thanks!
Craig
--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------
On Tuesday, April 10, 2018 at 04:36:27, Craig James <cjames@emolecules.com> wrote:
[snip]
With what arguments (also initialization)?
--
Andreas Joseph Krogh
On Tue, Apr 10, 2018 at 12:21 AM, Andreas Joseph Krogh <andreas@visena.com> wrote:
[snip]
pgbench -i -s 100 -U test
pgbench -U test -c ... -t ...
 -c     -t    TPS
  5  20000   5202
 10  10000   7916
 20   5000   7924
 30   3333   7270
 40   2500   5020
 50   2000   6417
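Spelled out, each row of that table is one run against the same initialized database; the 10-client row, for example, corresponds to:

pgbench -i -s 100 -U test          # initialize once at scale factor 100
pgbench -U test -c 10 -t 10000     # 10 clients x 10000 transactions each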
---------------------------------
Craig A. James
Chief Technology Officer
---------------------------------
You don't mention the size of your database. Does it fit in memory? If so, your disks aren't going to matter a whole lot beyond potentially being I/O-bound on writes. Otherwise, getting your data onto SSDs can absolutely yield a few multiples of performance improvement. NVMe M.2 drives can really pump out the data. Maybe push your WAL onto those (few motherboards have more than two M.2 connectors) and use regular SSDs for your data if you have high write rates.
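As a sketch of that split-WAL idea (paths, mount point, and service name are illustrative, and on Postgres 9.6 the WAL directory is still called pg_xlog):

sudo systemctl stop postgresql                 # stop the cluster first
sudo mv /var/lib/postgresql/9.6/main/pg_xlog /mnt/nvme/pg_xlog
sudo ln -s /mnt/nvme/pg_xlog /var/lib/postgresql/9.6/main/pg_xlog
sudo chown -h postgres:postgres /var/lib/postgresql/9.6/main/pg_xlog
sudo systemctl start postgresql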
Meanwhile, if you're looking for solid cloud hosting for Postgres but with the speed of physical hardware, feel free to contact me; my company does this for companies that found the I/O limits of regular cloud providers far too slow for their needs.
good luck (and pardon the crass commercial comments!),
-- Ben Scherrey
On Tue, Apr 10, 2018 at 9:36 AM, Craig James <cjames@emolecules.com> wrote:
[snip]
RDBMSs such as pg are beasts that turn random I/O requests, traditionally slow on spinning drives, into sequential ones. WAL is a good example of this.
SSDs are generally slower than spinning disks at sequential I/O and way faster at random.
Expect SSD to help, therefore, if you are random-I/O bound. (Some cloud vendors offer SSD as a way to get dedicated local I/O and bandwidth, so it sometimes helps stabilize performance vs. virtualized shared I/O.)
A reasonable person should assume that an unbenchmarked, unresearched migration from a tuned spinning-disk setup to SSD will slow you down.
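If you do want numbers before migrating, fio can measure both access patterns on a candidate drive (file path, size, and run time are illustrative):

# random 8 kB reads, roughly Postgres-page-shaped
fio --name=randread --filename=/mnt/test/fio.dat --size=4G \
    --rw=randread --bs=8k --ioengine=libaio --iodepth=16 \
    --direct=1 --runtime=60 --time_based
# sequential 1 MB reads, roughly seqscan/WAL-replay-shaped
fio --name=seqread --filename=/mnt/test/fio.dat --size=4G \
    --rw=read --bs=1M --ioengine=libaio --iodepth=4 \
    --direct=1 --runtime=60 --time_based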
/Aaron
> SSDs are generally slower than spinning disks at sequential I/O and way faster at random.
A yet-unreleased Seagate HDD boasts 480 MB/s sequential read speed [1], and no HDD shipping today can achieve even that.
SATA-3 SSDs have been faster than that for years (550 MB/s is quite typical), and NVMe ones easily exceed 1 GB/s, up to 3 GB/s+.
I'm curious where you're drawing these conclusions from?
2018-04-10 22:00 GMT+03:00 Aaron <aaron.werman@gmail.com>:
[snip]
On Apr 10, 2018, at 3:11 PM, Dmitry Shalashov <skaurus@gmail.com> wrote:
[snip]
Yeah, that sequential claim sounds weird.
I'm only chiming in because I just set up one of those SoHo NAS boxes (QNAP) that had both SSDs and HDDs installed. It's used for video editing, so it's almost all sequential reads/writes. Over 10 Gb/s Ethernet, sequential reads off the cached content on the SSDs ran somewhere around 800 MB/s. These were non-enterprise SSDs.
Charles
Well, I can offer a measurement from my home PC, a Linux box running Ubuntu 17.10 with a Samsung 960 EVO 512GB NVMe disk hosting Postgres 10. Using your pgbench init I got, for example:
pgbench -c 10 -t 10000 test
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average = 0.679 ms
tps = 14730.402329 (including connections establishing)
tps = 14733.000950 (excluding connections establishing)
I will try to run a test on our production system which has a pair of Intel DC P4600 2TB in RAID0 tomorrow.
On Tue, Apr 10, 2018 at 9:58 PM Charles Sprickman <css@morefoo.com> wrote:
[snip]
We have been using the Intel S3710 (or minor model variations thereof). They have been great: consistent performance, power-loss safe, and good expected lifetime. Also, 2 of them in RAID1 easily outperform a reasonably large number of 10K spinners in RAID10.
Now you *can* still buy the S37xx series, but eventually I guess we'll have to look at something more modern like the S45xx series. I'm not so keen on those, though: they use TLC NAND, which may give less consistent performance, and they appear to have a slightly lower expected lifetime. I think there was a thread on this list a year or more ago about this very issue that might be worth searching for.
The TLC NAND seems like a big deal: most modern SSDs are built with it, and they solve its high-latency problem with SLC caches. So you get brilliant performance until the cache is full, then it drops off a cliff. Bigger/more expensive drives have bigger caches, so it is well worth finding in-depth reviews of the exact models you wish to evaluate!
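If you want to check wear and endurance on candidate (or existing) drives yourself, smartmontools reports the relevant counters (device names are illustrative):

sudo smartctl -A /dev/sda     # SATA SSD: see Wear_Leveling_Count / Media_Wearout_Indicator
sudo smartctl -a /dev/nvme0   # NVMe SSD: see "Percentage Used" and "Data Units Written"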
regards
Mark
The most critical bit of advice I've found is setting the preference described here:
https://amplitude.engineering/how-a-single-postgresql-config-change-improved-slow-query-performance-by-50x-85593b8991b0
I'm using 4 x 512GB Samsung 850 EVOs in a hardware RAID 10 on a 1U server with about 144 GB RAM and 8 Xeon cores. I now burn up CPU more than disks or RAM, whereas on magnetic storage I had horrible I/O-wait percentages, so it seems to be performing quite well so far.
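For reference, the SSD-oriented settings usually discussed in this context look something like the following; the values are illustrative rules of thumb, not taken from the linked article:

# tell the planner random reads are nearly as cheap as sequential reads on SSD
psql -c "ALTER SYSTEM SET random_page_cost = 1.1;"
# allow more concurrent prefetch requests during bitmap heap scans
psql -c "ALTER SYSTEM SET effective_io_concurrency = 200;"
psql -c "SELECT pg_reload_conf();"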
Matthew Hall
The 512 GB model is big enough that the SLC cache and performance are going to be OK. What would worry me is the lifetime: individual 512 GB 850 EVOs are rated at 150 TB written over 5 years. Compare that to the Intel S3710: the 400 GB model is rated at 8 PB over 5 years. These drives are fast enough that you *might* write more than 4 x 150 = 600 TB over 5 years...
In addition, Samsung are real cagey about the power-loss reliability of these drives. I suspect that if you do lose power unexpectedly, data corruption will result (no capacitors to keep the RAM cache in sync).
regards
Mark
On 11/04/18 13:39, Matthew Hall wrote:
[snip]
On Thu, Apr 12, 2018 at 8:11 AM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
[snip]
I have done a lot of pull-the-plug testing on Samsung 850 M.2 drives as a side effect of an HA demo setup. I haven't kept any numbers, but on a tiny database with a smallish 100 tps workload I am seeing data corruption in about 1% of cases: things like empty pg_control files, and sections of WAL replaced with zeroes and/or old data. OS-level write-cache tuning is not enough to get rid of it (see the pg_test_fsync sketch below for a way to check a drive without pulling plugs).
Based on that, and on the fact that interrupting SSD garbage collection might also cause data loss, my recommendation is to avoid consumer drives for important databases. Or, if you are adventurous, have multiple replicas in different power domains and operational procedures in place to reimage hosts on power loss.
--
Ants Aasma
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
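The pg_test_fsync sketch mentioned above (the utility ships with PostgreSQL; file path and device name are illustrative):

pg_test_fsync -f /mnt/ssd/pgtest.out   # baseline fsync rates on the filesystem under test
sudo hdparm -W 0 /dev/sda              # disable the drive's volatile write cache
pg_test_fsync -f /mnt/ssd/pgtest.out   # a huge drop here suggests the drive was
                                       # acknowledging writes from volatile cache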
On Tuesday, April 10, 2018 at 19:41:59, Craig James <cjames@emolecules.com> wrote:
[snip]
FWIW, we're testing this: https://www.supermicro.nl/products/system/1U/1029/SYS-1029U-TN10RT.cfm
with 4 x Micron 9200 PRO NVMe 3.84TB U.2 drives in RAID-10:
$ pgbench -s 100 -c 64 -t 10000 pgbench
scale option ignored, using count from pgbench_branches table (100)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 640000/640000
latency average = 2.867 ms
tps = 22320.942063 (including connections establishing)
tps = 22326.370955 (excluding connections establishing)
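For anyone reproducing this kind of setup with Linux md software RAID, assembling the array looks roughly like this (device names are illustrative, and our exact RAID implementation may differ):

sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
sudo mkfs.ext4 /dev/md0
sudo mount -o noatime /dev/md0 /var/lib/postgresql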
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On Wednesday, May 9, 2018 at 22:00:16, Andreas Joseph Krogh <andreas@visena.com> wrote:
[snip]
Sorry, wrong disks; this is correct:
48 clients:
pgbench -s 100 -c 48 -t 10000 pgbench
scale option ignored, using count from pgbench_branches table (100)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 48
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 480000/480000
latency average = 1.608 ms
tps = 29846.511054 (including connections establishing)
tps = 29859.483666 (excluding connections establishing)
64 clients:
pgbench -s 100 -c 64 -t 10000 pgbench
scale option ignored, using count from pgbench_branches table (100)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 640000/640000
latency average = 2.279 ms
tps = 28077.261708 (including connections establishing)
tps = 28085.730160 (excluding connections establishing)
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On 11/05/18 23:23, Andreas Joseph Krogh wrote:
[snip]
If I'm doing the math properly, these runs are very short (about 20 s). It would be interesting to specify a time limit (e.g. -T600 or similar) so we see the effect of at least one checkpoint, i.e. the disks are actually forced to write and sync the transaction data.
These Micron disks look interesting (pretty good IOPS and lifetime numbers). However (as usual with Micron, sadly) there's no data about power-off safety. Do you know if the circuit board has capacitors?
regards
Mark
On Friday, May 11, 2018 at 14:11:39, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
[snip]
$ pgbench -s 100 -c 64 -T600 pgbench
scale option ignored, using count from pgbench_branches table (100)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 1
duration: 600 s
number of transactions actually processed: 16979208
latency average = 2.262 ms
tps = 28298.582988 (including connections establishing)
tps = 28298.926331 (excluding connections establishing)
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On Friday, May 11, 2018 at 14:11:39, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
[snip]
These Micron disks look interesting (pretty good IOPS and lifetime
numbers). However (as usual with Micron, sadly) no data about power-off
safety. Do you know if the circuit board has capacitors?
Don't know, sorry...
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On May 11, 2018, at 15:11, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
> These Micron disks look interesting (pretty good IOPS and lifetime numbers). However (as usual with Micron, sadly) no data about power-off safety. Do you know if the circuit board has capacitors?
According to https://www.micron.com/~/media/documents/products/data-sheet/ssd/9200_u_2_pcie_ssd.pdf :
The SSD supports an unexpected power loss with a power-backed write cache. No user data is lost during an unexpected power loss. When power is subsequently restored, the SSD returns to a ready state within a maximum of 60 seconds.
On 12/05/18 02:48, Evgeniy Shishkin wrote:
[snip]
Excellent, and thanks for finding the details. Note that the document explicitly states that they have capacitor-backed power-loss protection. So they look like a viable alternative to Intel's S4500, S4600, P4500, P4600 range.
One point to note: we've been here before with Micron claiming power-loss protection and having to retract it later (the Crucial M550 range... I have 2 of these, BTW). But to be fair to Micron, the Crucial range is purely consumer, and this Micron 9200 is obviously an enterprise-targeted product. Some power-loss testing might still be advised!
Cheers
Mark