Thread: benchmarking effective_io_concurrency
Hello,

I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.

I would like to share my findings with you:

https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/

Comments are welcome.

regards,

fabio pardi
On Mon, Jul 22, 2019 at 2:42 AM Fabio Pardi <f.pardi@portavita.eu> wrote:
> Hello,
> I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.
> I would like to share my findings with you:
> https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/
> Comments are welcome.
> regards,
> fabio pardi
You didn't mention what type of disk storage you are using, or if that matters. The number of cores in your database could also matter.
Does the max_parallel_workers setting have any influence on how effective_io_concurrency works?
Based on your data, one should set effective_io_concurrency to the highest possible setting, with no ill effects apart from the possible exception that your disk will get busier. Somehow I suspect that as you scale the number of concurrent disk I/O tasks, other things may start to suffer. For example, does CPU wait time start to increase as more and more threads are consumed waiting for I/O instead of doing other processing? Do you run into lock contention on the I/O subsystem? (Back in the day, lock contention for /dev/tcp was a major bottleneck for scaling busy webservers vertically. I have no idea if modern Linux kernels could run into the same issue waiting for locks for /dev/sd0. Surely if anything was going to push that issue, it would be setting effective_io_concurrency really high and then demanding a lot of concurrent disk accesses.)
Hi Rick,

thanks for your inputs.

On 22/07/2019 14:06, Rick Otten wrote:
> You didn't mention what type of disk storage you are using, or if that matters.

I actually mentioned that I'm using SSDs in RAID 10. It is also mentioned that I tested a no-RAID setup. Is that what you mean?

> The number of cores in your database could also matter.

True, when scaling I think it can actually bring up problems, as you mention below. (BTW, tested on a VM with 6 cores and on HW with 32. I updated the blogpost, thanks.)

> Does the max_parallel_workers setting have any influence on how effective_io_concurrency works?

I'm not sure about that one in relation to the tests I ran, because the query plan does not show parallelism.

> Based on your data, one should set effective_io_concurrency to the highest possible setting, with no ill effects apart from the possible exception that your disk will get busier. Somehow I suspect that as you scale the number of concurrent disk I/O tasks, other things may start to suffer. For example, does CPU wait time start to increase as more and more threads are consumed waiting for I/O instead of doing other processing? Do you run into lock contention on the I/O subsystem? (Back in the day, lock contention for /dev/tcp was a major bottleneck for scaling busy webservers vertically. I have no idea if modern Linux kernels could run into the same issue waiting for locks for /dev/sd0. Surely if anything was going to push that issue, it would be setting effective_io_concurrency really high and then demanding a lot of concurrent disk accesses.)

My suggestion would be to try it on your own and find out what works for you, maybe slowly increasing the value of effective_io_concurrency. Every workload is different, so I suspect there is no silver bullet here. The documentation also points in that direction...

regards,

fabio pardi
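The "slowly increase the value" approach suggested above could be scripted roughly as follows. This is only a sketch, not the method used in the blog post: the function name sweep_statements, the list of values, and the example query are all hypothetical. It only builds the SQL text for one session, so it can be piped into psql or run through any driver. The only facts relied on are that effective_io_concurrency can be changed with SET in a session and that its allowed range is 0 to 1000. Cache state between runs is not handled here and will affect timings.

```python
def sweep_statements(query, values=(1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1000)):
    """Build SQL that re-runs `query` at increasing effective_io_concurrency settings.

    `query` and `values` are placeholders chosen for illustration; pick a query
    from your own workload and a range that makes sense for your storage.
    """
    statements = []
    for value in values:
        # PostgreSQL accepts 0 (disable prefetching) up to 1000 for this setting.
        if not 0 <= value <= 1000:
            raise ValueError(f"effective_io_concurrency out of range: {value}")
        # Session-level change only; nothing is written to postgresql.conf.
        statements.append(f"SET effective_io_concurrency = {value};")
        # EXPLAIN (ANALYZE, BUFFERS) executes the query and reports timing and buffer usage.
        statements.append(f"EXPLAIN (ANALYZE, BUFFERS) {query.rstrip(';')};")
    return statements
```

Printing "\n".join(sweep_statements("SELECT ...")) gives a script whose EXPLAIN output can be compared run by run, while watching CPU wait time and disk utilisation on the side, as Rick suggests.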
On Mon, Jul 22, 2019 at 1:42 AM Fabio Pardi <f.pardi@portavita.eu> wrote:
> Hello,
>
> I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.
>
> I would like to share my findings with you:
>
> https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/
>
> Comments are welcome.

I did a very similar test a few years back and came up with very similar results:
https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@mail.gmail.com

effective_io_concurrency is an oft-overlooked tuning parameter, and I'm curious whether the underlying facility (posix_fadvise) could be used for more types of queries. For SSD storage, which is increasingly common these days, it really pays off to crank it up, with few downsides by my measurements.

merlin
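For readers unfamiliar with the posix_fadvise facility mentioned above, here is a minimal sketch of the general idea in Python. It is not PostgreSQL's code: the function name read_blocks_with_prefetch and the prefetch_distance parameter are made up for illustration, with prefetch_distance loosely playing the role that effective_io_concurrency plays in the server. It assumes a platform where Python exposes os.posix_fadvise (Linux does; some other systems do not).

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default page size, used here only as an example


def read_blocks_with_prefetch(path, block_numbers, prefetch_distance=4):
    """Read the listed blocks in order, hinting the kernel about upcoming ones."""
    blocks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        next_to_hint = 0
        for i, block in enumerate(block_numbers):
            # Issue WILLNEED hints for every not-yet-hinted block up to
            # prefetch_distance positions ahead of the one about to be read,
            # so the kernel can start fetching them in the background.
            limit = min(i + prefetch_distance, len(block_numbers) - 1)
            while next_to_hint <= limit:
                os.posix_fadvise(
                    fd,
                    block_numbers[next_to_hint] * BLOCK_SIZE,
                    BLOCK_SIZE,
                    os.POSIX_FADV_WILLNEED,
                )
                next_to_hint += 1
            # The actual read; ideally the data is already in the page cache.
            blocks.append(os.pread(fd, BLOCK_SIZE, block * BLOCK_SIZE))
    finally:
        os.close(fd)
    return blocks
```

A larger prefetch_distance keeps more requests outstanding at once, which is why storage that handles many concurrent requests well, such as SSDs, would be expected to benefit most.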