Thread: benchmarking effective_io_concurrency

benchmarking effective_io_concurrency

From
Fabio Pardi
Date:
Hello,


I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.

I would like to share my findings with you:

https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/

Comments are welcome.

regards,

fabio pardi



Re: benchmarking effective_io_concurrency

From
Rick Otten
Date:


On Mon, Jul 22, 2019 at 2:42 AM Fabio Pardi <f.pardi@portavita.eu> wrote:
> Hello,
>
>
> I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.
>
> I would like to share my findings with you:
>
> https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/
>
> Comments are welcome.
>
> regards,
>
> fabio pardi

You didn't mention what type of disk storage you are using, or if that matters.  The number of cores in your database could also matter.

Does the max_parallel_workers setting have any influence on how effective_io_concurrency works?

Based on your data, one should set effective_io_concurrency at the highest possible setting with no ill effects with the possible exception that your disk will get busier.  Somehow I suspect that as you scale the number of concurrent disk i/o tasks, other things may start to suffer.  For example does CPU wait time start to increase as more and more threads are consumed waiting for i/o instead of doing other processing?  Do you run into lock contention on the i/o subsystem?  (Back in the day, lock contention for /dev/tcp was a major bottleneck for scaling busy webservers vertically.  I have no idea if modern linux kernels could run into the same issue waiting for locks for /dev/sd0.  Surely if anything was going to push that issue, it would be setting effective_io_concurrency really high and then demanding a lot of concurrent disk accesses.)


 

Re: benchmarking effective_io_concurrency

From
Fabio Pardi
Date:
Hi Rick, 

thanks for your inputs.

On 22/07/2019 14:06, Rick Otten wrote:
> You didn't mention what type of disk storage you are using, or if that matters.

I did mention that I'm using SSDs in RAID 10, and that I also tested a no-RAID setup. Is that what you mean?

> The number of cores in your database could also matter.

True, when scaling I think it can actually bring up problems, as you mention below. (BTW, I tested on a VM with 6
cores and on hardware with 32. I updated the blog post, thanks.)


> Does the max_parallel_workers setting have any influence on how effective_io_concurrency works?
> 

I'm not sure about that one with regard to the tests I ran, because the query plan does not show parallelism.

> Based on your data, one should set effective_io_concurrency at the highest possible setting with no ill effects with
> the possible exception that your disk will get busier.  Somehow I suspect that as you scale the number of concurrent
> disk i/o tasks, other things may start to suffer.  For example does CPU wait time start to increase as more and more
> threads are consumed waiting for i/o instead of doing other processing?  Do you run into lock contention on the i/o
> subsystem?  (Back in the day, lock contention for /dev/tcp was a major bottleneck for scaling busy webservers
> vertically.  I have no idea if modern linux kernels could run into the same issue waiting for locks for /dev/sd0.
> Surely if anything was going to push that issue, it would be setting effective_io_concurrency really high and then
> demanding a lot of concurrent disk accesses.)

My suggestion would be to try it yourself and find out what works for you, perhaps by slowly increasing the value of
effective_io_concurrency.

Every workload is peculiar, so I suspect there is no silver bullet here. The documentation also points you in
that direction...
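
For anyone who wants to try the "increase it slowly" approach, here is a minimal sketch that only generates a psql
script; it does not connect to a database. The function name build_sweep_script, the list of values and the sample
query are illustrative assumptions, not taken from the blog post. The only facts relied on are that
effective_io_concurrency can be changed per session with SET and accepts values from 0 to 1000.

```python
def build_sweep_script(query, values=(0, 1, 2, 4, 8, 16, 32, 64)):
    """Return a psql script that runs `query` once per effective_io_concurrency value.

    `query` and `values` are placeholders chosen for illustration; substitute a
    query from your own workload (the setting mainly affects bitmap heap scans).
    """
    statement = query.strip().rstrip(";")
    lines = ["\\timing on"]  # psql meta-command: print elapsed time per statement
    for value in values:
        # PostgreSQL accepts 0 (prefetching disabled) up to 1000 for this setting.
        if not 0 <= value <= 1000:
            raise ValueError(f"effective_io_concurrency out of range: {value}")
        lines.append(f"SET effective_io_concurrency = {value};")
        lines.append(f"EXPLAIN (ANALYZE, BUFFERS) {statement};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical table and predicate, for illustration only.
    print(build_sweep_script("SELECT * FROM my_table WHERE my_col BETWEEN 1 AND 100000"))
```

The output can be saved to a file and run with psql -f. Timings are only comparable if the cache state is the same
before each run, so treat the numbers from a single pass with caution.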
 



regards,

fabio pardi



Re: benchmarking effective_io_concurrency

From
Merlin Moncure
Date:
On Mon, Jul 22, 2019 at 1:42 AM Fabio Pardi <f.pardi@portavita.eu> wrote:
>
> Hello,
>
>
> I recently spent a bit of time benchmarking effective_io_concurrency on Postgres.
>
> I would like to share my findings with you:
>
> https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/
>
> Comments are welcome.

I did a very similar test a few years back and came up with very similar results:
https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azsTdQ@mail.gmail.com

effective_io_concurrency is an oft overlooked tuning parameter and I'm
curious if the underlying facility (posix_fadvise) can't be used for
more types of queries.  For ssd storage, which is increasingly common
these days, it really pays off to crank it with few downsides from my
measurement.

merlin
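
For readers unfamiliar with the posix_fadvise facility mentioned above, the following is a minimal sketch of the
general idea: tell the kernel which blocks will be needed shortly, then read them. It is not PostgreSQL's code. The
function name read_blocks_with_prefetch and the depth parameter (standing in for effective_io_concurrency) are
assumptions made for illustration. It uses Python's os.posix_fadvise, which is available on Unix only.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default block size


def read_blocks_with_prefetch(path, block_numbers, depth=4):
    """Read the given blocks in order, hinting up to `depth` blocks ahead.

    `depth` loosely plays the role of effective_io_concurrency in this sketch;
    depth=0 issues no hints at all.
    """
    blocks = list(block_numbers)
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        hinted = 0  # number of entries in `blocks` already hinted to the kernel
        for i, block in enumerate(blocks):
            # Keep the hint window `depth` entries wide, starting at the current block.
            while depth > 0 and hinted < len(blocks) and hinted < i + depth:
                os.posix_fadvise(
                    fd,
                    blocks[hinted] * BLOCK_SIZE,
                    BLOCK_SIZE,
                    os.POSIX_FADV_WILLNEED,  # advisory only: the kernel may start read-ahead
                )
                hinted += 1
            # The actual read; ideally the data is already in the page cache by now.
            results.append(os.pread(fd, BLOCK_SIZE, block * BLOCK_SIZE))
    finally:
        os.close(fd)
    return results
```

The hint is advisory, so the function returns the same data at any depth; only the timing on cold storage can differ.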