Thread: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
Hi,

Please pardon me if this question is already answered in the documentation, the wiki, or the mailing list archive. The problem is that I don't know the exact term to search for - I've tried searching for "linear scalability" and "concurrency vs performance" but didn't find what I was looking for.

## MAIN QUESTION

pgbench -c 1 achieves approx 80 TPS
pgbench -c 6 should achieve approx 480 TPS, but only achieves 360 TPS
pgbench -c 12 should achieve approx 960 TPS, but only achieves 610 TPS

If pgbench is being run on a 4c/8t machine and pg-server is being run on a 6c/12t machine with 32GB RAM [1], and the two servers are connected with a 1 Gbit/s connection, I don't think either pgbench or pg-server is being constrained by hardware, right?

In that case, why is it not possible to achieve linear scalability, at least up to 12 concurrent connections (i.e. the thread count of pg-server)? What is an easy way to identify the limiting factor? Is it network connectivity? Disk IOPS? CPU load? Some config parameter?

## SECONDARY QUESTION

At what level of concurrent connections should settings like shared_buffers, effective_cache_size, and max_wal_size start making a difference? With my hardware [1], I'm seeing a difference only after 48 concurrent connections, and even then it's only a 15-30% improvement over the default settings that ship with the Ubuntu 18.04 package. Is this expected? Isn't this allocating too many resources for too little gain?

## CONTEXT

I am currently trying to benchmark PG 11 (via pgbench) to figure out the configuration parameters that deliver optimum performance for my hardware [1] and workload [2].

Based on https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server I've made the following relevant changes to the default PG config on Ubuntu 18.04:

max_connections=400
work_mem=4MB
maintenance_work_mem=64MB
shared_buffers=12288MB
temp_buffers=8MB
effective_cache_size=16GB
wal_buffers=-1
wal_sync_method=fsync
max_wal_size=5GB
autovacuum=off # NOTE: Only for benchmarking

[1] 32 GB RAM - 6 core/12 thread - 2x SSD in RAID1
[2] SaaS webapp - a mixed workload which looks a lot like TPC-B

Thanks,
Saurabh.
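For reference, a concurrency sweep like the one above can be scripted roughly as follows (a sketch; the host name pg-server, the database name bench, the scale factor, and the run duration are placeholders, not values from this thread):

# one-time initialisation of the pgbench tables at a placeholder scale factor
pgbench -h pg-server -U postgres -i -s 100 bench

# run each concurrency level for a fixed duration, printing progress every 10s;
# -j gives pgbench roughly one worker thread per 2 clients (clients must be a
# multiple of threads, which holds for every value in this list)
for c in 1 6 12 24 48 96 192 384; do
    pgbench -h pg-server -U postgres -c "$c" -j "$(( (c + 1) / 2 ))" -T 300 -P 10 bench
done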
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
wal_sync_method=fsync
Why this change?
PGOPTIONS="-c synchronous_commit=off" pgbench -T 3600 -P 10 ....
You could also try pg_test_fsync to get low-level information, to supplement the high-level view you get from pgbench.
The effects of max_wal_size are going to depend on how you have IO configured, for example does pg_wal share the same devices and controllers as the base data? It is mostly about controlling disk usage and crash-recovery performance, neither of which is of primary importance to pgbench performance.
Not all SSD are created equal, so the details here matter, both for the underlying drives and the raid controller.
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
PGOPTIONS="-c synchronous_commit=off" pgbench -T 3600 -P 10 ....I am currently running all my benchmarks with synchronous_commit=off and will get back with my findings.
+--------+---------------------+------------------------+
| client | Mostly defaults [1] | Optimised settings [2] |
+--------+---------------------+------------------------+
| 1 | 80-86 | 169-180 |
+--------+---------------------+------------------------+
| 6 | 350-376 | 1265-1397 |
+--------+---------------------+------------------------+
| 12 | 603-619 | 1746-2352 |
+--------+---------------------+------------------------+
| 24 | 947-1015 | 1869-2518 |
+--------+---------------------+------------------------+
| 48 | 1435-1512 | 1912-2818 |
+--------+---------------------+------------------------+
| 96 | 1769-1811 | 1546-1753 |
+--------+---------------------+------------------------+
| 192 | 1857-1992 | 1332-1508 |
+--------+---------------------+------------------------+
| 384 | 1667-1793 | 1356-1450 |
+--------+---------------------+------------------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
On Sun, Jan 27, 2019 at 01:09:16PM +0530, Saurabh Nanda wrote:
> It seems that PGOPTIONS="-c synchronous_commit=off" has a significant
> impact. However, I still can not understand why the TPS for the optimised
> case is LOWER than the default for higher concurrency levels!

Do you know which of the settings is causing lower TPS? I suggest checking shared_buffers.

If you haven't done it, disabling THP and KSM can resolve performance issues, especially with large RAM like shared_buffers, at least with older kernels.

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

Justin
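For reference, disabling THP and KSM until the next reboot looks roughly like this (a sketch; the sysfs paths are standard on modern Linux kernels, but making the change survive a reboot is distro-specific):

# stop transparent huge pages from being allocated or defragmented
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag

# stop kernel samepage merging
echo 0 | sudo tee /sys/kernel/mm/ksm/run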
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
Do you know which of the settings is causing lower TPS? I suggest checking shared_buffers.

If you haven't done it, disabling THP and KSM can resolve performance issues, especially with large RAM like shared_buffers, at least with older kernels.

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
> You could also try pg_test_fsync to get low-level information, to supplement the high-level view you get from pgbench.

Thanks for pointing me to this tool - I never knew pg_test_fsync existed! I've run `pg_test_fsync -s 60` twice and this is the output: https://gist.github.com/saurabhnanda/b60e8cf69032b570c5b554eb50df64f8 I'm not sure what to make of it.
> The effects of max_wal_size are going to depend on how you have IO configured, for example does pg_wal share the same devices and controllers as the base data? It is mostly about controlling disk usage and crash-recovery performance, neither of which is of primary importance to pgbench performance.

The WAL and the data directory reside on the same SSD - is this a bad idea? I was under the impression that smaller values for max_wal_size cause pg-server to do "maintenance work" related to WAL rotation, etc. more frequently, which would lead to lower pgbench performance.
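One way to check how often that WAL-related "maintenance work" (checkpoints) actually happens during a run is to enable checkpoint logging and watch the server log while pgbench is running. A sketch, assuming the stock Ubuntu PG 11 log location:

psql -c "ALTER SYSTEM SET log_checkpoints = on;"
psql -c "SELECT pg_reload_conf();"

# during a pgbench run, look for "checkpoint starting" / "checkpoint complete"
tail -f /var/log/postgresql/postgresql-11-main.log | grep checkpoint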
PS: Cc-ing the list back again because I assume you didn't intend for your reply to be private, right?
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
PGOPTIONS="-c synchronous_commit=off" pgbench -T 3600 -P 10 ....I am currently running all my benchmarks with synchronous_commit=off and will get back with my findings.It seems that PGOPTIONS="-c synchronous_commit=off" has a significant impact.
> However, I still can not understand why the TPS for the optimised case is LOWER than the default for higher concurrency levels!
> +--------+---------------------+------------------------+
> | client | Mostly defaults [1] | Optimised settings [2] |
> +--------+---------------------+------------------------+
> | 1 | 80-86 | 169-180 |
> +--------+---------------------+------------------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
It is usually not acceptable to run applications with synchronous_commit=off, so once you have identified that the bottleneck is in implementing synchronous_commit=on, you probably need to take a deep dive into your hardware to figure out why it isn't performing the way you need/want/expect it to. Tuning the server under synchronous_commit=off when you don't intend to run your production server with that setting is unlikely to be fruitful.
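As an aside, synchronous_commit does not have to be all-or-nothing: it can be relaxed per transaction, so an application can keep durable commits by default and turn them off only for non-critical writes. A sketch (the database name app_db and table audit_log are placeholders, not from this thread):

psql -d app_db -c "
BEGIN;
SET LOCAL synchronous_commit = off;  -- only this transaction may lose its commit on a crash; no corruption risk
INSERT INTO audit_log (message) VALUES ('non-critical event');
COMMIT;"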
+--------+-----------------------------------------------------------------+------------------------+
| | synchronous_commit=on | synchronous_commit=off |
+--------+-----------------------------------------------------------------+------------------------+
| client | Mostly defaults [1] | Optimised [2] | Optimised [2] | Optimised [2] |
| | | + shared_buffers=2G | + shared_buffers=8G | + shared_buffers=8G |
+--------+---------------------+---------------------+---------------------+------------------------+
| 1 | 80-86 | 74-77 | 75-75 | 169-180 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 6 | 350-376 | 301-304 | 295-300 | 1265-1397 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 12 | 603-619 | 476-488 | 485-493 | 1746-2352 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 24 | 947-1015 | 678-739 | 723-770 | 1869-2518 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 48 | 1435-1512 | 950-1043 | 1029-1086 | 1912-2818 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 96 | 1769-1811 | 3337-3459 | 1302-1346 | 1546-1753 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 192 | 1857-1992 | 3613-3715 | 1269-1345 | 1332-1508 |
+--------+---------------------+---------------------+---------------------+------------------------+
| 384 | 1667-1793 | 3180-3300 | 1262-1364 | 1356-1450 |
+--------+---------------------+---------------------+---------------------+------------------------+
In case you do intend to run with synchronous_commit=off, or if you are just curious: running with a very high number of active connections often reveals subtle bottlenecks and interactions, and is very dependent on your hardware. Unless you actually intend to run your server with synchronous_commit=off and with a large number of active connections, it is probably not worth investigating this.
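One generic way to see what a large number of active connections is actually waiting on is to sample wait events from pg_stat_activity during the run (a sketch; the wait_event columns are available since PostgreSQL 9.6):

psql -c "SELECT wait_event_type, wait_event, count(*)
         FROM pg_stat_activity
         WHERE state <> 'idle'
         GROUP BY 1, 2
         ORDER BY 3 DESC;"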
I'm more interested in the low end; you should do much better than those reported numbers when clients=1 and synchronous_commit=off with the data on SSD. I think you said that pgbench is running on a different machine than the database, so perhaps it is just network overhead that is keeping this value down. What happens if you run them on the same machine?
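Running both on the same machine over a Unix socket takes the network out of the picture entirely. A sketch, assuming the Debian/Ubuntu default socket directory and a placeholder database name:

pgbench -h /var/run/postgresql -c 1 -j 1 -T 300 -P 10 bench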
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
+--------+------------------------------------------------+-----------------+
| | synchronous_commit=on | sync_commit=off |
+--------+------------------------------------------------+-----------------+
| client | Defaults [1] | Optimised [2] | Optimised [2] | Optimised [2] |
| | | (buffers=2G) | (buffers=8G) | (buffers=8G) |
+--------+--------------+----------------+----------------+-----------------+
| 1 | 80-86 | 74-77 | 75-75 | 169-180 |
+--------+--------------+----------------+----------------+-----------------+
| 6 | 350-376 | 301-304 | 295-300 | 1265-1397 |
+--------+--------------+----------------+----------------+-----------------+
| 12 | 603-619 | 476-488 | 485-493 | 1746-2352 |
+--------+--------------+----------------+----------------+-----------------+
| 24 | 947-1015 | 678-739 | 723-770 | 1869-2518 |
+--------+--------------+----------------+----------------+-----------------+
| 48 | 1435-1512 | 950-1043 | 1029-1086 | 1912-2818 |
+--------+--------------+----------------+----------------+-----------------+
| 96 | 1769-1811 | 3337-3459 | 1302-1346 | 1546-1753 |
+--------+--------------+----------------+----------------+-----------------+
| 192 | 1857-1992 | 3613-3715 | 1269-1345 | 1332-1508 |
+--------+--------------+----------------+----------------+-----------------+
| 384 | 1667-1793 | 3180-3300 | 1262-1364 | 1356-1450 |
+--------+--------------+----------------+----------------+-----------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
On 28/01/2019 at 15:03, Saurabh Nanda wrote:
> An update. It seems (to my untrained eye) that something is wrong with
> the second SSD in the RAID configuration. Here's my question on
> serverfault related to what I saw with iostat -
> https://serverfault.com/questions/951096/difference-in-utilisation-reported-by-iostat-for-two-identical-disks-in-raid1
>
> I've disabled RAID and rebooted the server to run the benchmarks with
> client=1,4,8,12 with shared_buffers=8MB (default) vs
> shared_buffers=2GB (optimised?) and will report back.

You should probably include the detailed hardware you are working on - especially for the SSD, the model can have a big impact, as well as its wear.

Nicolas
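For reference, the per-device utilisation figures discussed above can be sampled with iostat (part of the sysstat package); a sketch:

# extended device statistics every 5 seconds; for a pgbench write workload,
# %util, await and w/s are the columns worth comparing across the two disks
iostat -dx 5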
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
You should probably include the detailed hardware you are working on -
especially for the SSD, the model can have a big impact, as well as its
wear.
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
You should probably include the detailed hardware you are working on -
especially for the SSD, the model can have a big impact, as well as its
wear.

What's the best tool to get meaningful information for SSD drives?
smartctl is a good start
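For example (a sketch; smartctl ships with the smartmontools package, and /dev/sda is a placeholder for the actual drive):

# print identity, SMART health, wear indicators and error counters
sudo smartctl -a /dev/sda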
Nicolas
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
smartctl is a good start
+--------+--------------+----------------+
| client | Defaults [1] | buffers=2G [2] |
+--------+--------------+----------------+
| 1 | 348-475 (??) | 529-583 (??) |
+--------+--------------+----------------+
| 4 | 436-452 | 451-452 |
+--------+--------------+----------------+
| 8 | 862-869 | 859-861 |
+--------+--------------+----------------+
| 12 | 1210-1219 | 1220-1225 |
+--------+--------------+----------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
Do you know which of the settings is causing lower TPS? I suggest checking shared_buffers.

If you haven't done it, disabling THP and KSM can resolve performance issues, especially with large RAM like shared_buffers, at least with older kernels.

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com
+--------+--------------+----------------+---------------+
| client | Defaults [1] | buffers=2G [2] | buffers=2G |
| | | | huge_pages=on |
+--------+--------------+----------------+---------------+
| 1 | 348-475 (??) | 529-583 (??) | 155-290 |
+--------+--------------+----------------+---------------+
| 4 | 436-452 | 451-452 | 388-403 |
+--------+--------------+----------------+---------------+
| 8 | 862-869 | 859-861 | 778-781 |
+--------+--------------+----------------+---------------+
| 12 | 1210-1219 | 1220-1225 | 1110-1111 |
+--------+--------------+----------------+---------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
+--------+--------------+------------------+
| client | Hetzner | E2E Networks |
| | EX41-SSD [1] | Cloud Server [2] |
+--------+--------------+------------------+
| 1 | ~160 | ~400 |
+--------+--------------+------------------+
| 4 | ~460 | ~1450 |
+--------+--------------+------------------+
| 8 | ~850 | ~2600 |
+--------+--------------+------------------+
| 12 | ~1200 | ~4000 |
+--------+--------------+------------------+
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
c) I tried another cloud hosting provider (E2E Networks) and just the raw performance numbers (with default configuration) are blowing Hetzner out of the water.
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
> c) I tried another cloud hosting provider (E2E Networks) and just the raw performance numbers (with default configuration) are blowing Hetzner out of the water.

I noticed that on E2E, the root filesystem is mounted with the following options:

/dev/xvda on / type ext4 (rw,noatime,nodiratime,nobarrier,errors=remount-ro,stripe=512,data=ordered)

whereas on Hetzner, it is mounted with the following options:

/dev/nvme0n1p3 on / type ext4 (rw,relatime,data=ordered)

How much of a difference can this have on absolute TPS numbers?
Differences can be significant. noatime does not update the inode access time at all, while relatime updates the access time only when it is older than the modify/change time (which can often be the case for a database).

nobarrier disables block-level write barriers. Barriers ensure that data has actually reached stable storage; the mount man page says: "If disabled on a device with a volatile (non-battery-backed) write-back cache, the nobarrier option will lead to filesystem corruption on a system crash or power loss."

You should probably consider noatime rather than relatime, and whether nobarrier is safe depends on whether you have a battery-backed cache or not.

Also, since this is an SSD, you should TRIM it, either with a periodic fstrim or by using the discard mount option.
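For what it's worth, the remount and TRIM operations mentioned above look roughly like this on a typical Linux system (a sketch; make the noatime change permanent by editing the options field in /etc/fstab):

# remount the root filesystem with noatime instead of relatime
sudo mount -o remount,noatime /

# one-off TRIM of all mounted filesystems that support it
sudo fstrim -av

# or let systemd run TRIM weekly
sudo systemctl enable --now fstrim.timer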
Nicolas
Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
All this benchmarking has led me to a philosophical question: why does PG need shared_buffers in the first place?
What's wrong with letting the OS do the caching/buffering?
Isn't it optimised for this kind of stuff?