Hi, Pavel!
On Fri, Nov 11, 2022 at 2:40 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> I've done some more measurements to check the hypotheses regarding the
> performance of a previous patch v2, and explain the results of tests
> in [1].
>
> The results below are the same (tps vs connections) plots as in [1],
> and the test is identical to the insert test in this thread [2].
> Additionally, in each case, there is a plot with results relative to
> Andres Freund's patch [3]. Log plots are good for seeing details in
> the range of 20-30 connections, but they somewhat hide the fact that
> the effect in the range of 500+ connections is much more significant
> overall, so I'd recommend looking at the linear plots as well.
Thank you for doing all the experiments!
BTW, sometimes it's hard to distinguish so many lines on a jpg
picture. Could I ask you to post the same graphs in png and also post
raw data in csv format?
> I'm also planning to do the same tests on an ARM server when the free
> one comes available to me.
> Thoughts?
ARM tests would be great. We definitely need to check this on more
than just one architecture. Please check both with and without LSE
instructions. They can lead to a dramatic speedup [1], although most
precompiled binaries are distributed without them. So both cases
seem important to me so far.
From what we have so far, I think we could try to combine multiple
strategies to achieve the best result. 2x1ms is one of the leaders
below ~200 connections, and 1x1ms is one of the leaders above that. We
could implement a simple heuristic to switch between 1 and 2 retries,
similar to what we have for spin delays. But let's have ARM results
first.
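To illustrate the kind of heuristic meant here, below is a rough sketch
under stated assumptions, not code from any patch in this thread. It is
loosely modeled on how spins_per_delay is adapted in s_lock.c: move
quickly toward more retries when the lock was obtained without sleeping,
and drift slowly toward fewer when we had to sleep anyway. All names
(RetryState, retry_limit, etc.) and all constants are hypothetical and
would need tuning against the benchmark results.

```c
#include <stdbool.h>

/* Hypothetical bounds: the two strategies discussed (1 or 2 retries). */
#define MIN_RETRIES      1
#define MAX_RETRIES      2

/* Hypothetical tuning constants, chosen only for illustration. */
#define SCORE_MAX        100
#define SCORE_THRESHOLD  50
#define SCORE_STEP_UP    10   /* fast increase when retries paid off */
#define SCORE_STEP_DOWN  1    /* slow decrease when we slept anyway */

/* Per-backend adaptive state (assumption: kept locally, not shared). */
typedef struct RetryState
{
    int score;
} RetryState;

/* Start optimistic: allow the maximum number of retries. */
static inline void
retry_state_init(RetryState *s)
{
    s->score = SCORE_MAX;
}

/* How many retries to attempt before sleeping on the lock. */
static inline int
retry_limit(const RetryState *s)
{
    return (s->score >= SCORE_THRESHOLD) ? MAX_RETRIES : MIN_RETRIES;
}

/*
 * Feed back the outcome of one lock acquisition:
 * acquired_without_sleep is true if the lock was obtained within the
 * retry budget, false if we had to go to sleep in the end.
 */
static inline void
retry_state_update(RetryState *s, bool acquired_without_sleep)
{
    if (acquired_without_sleep)
    {
        s->score += SCORE_STEP_UP;
        if (s->score > SCORE_MAX)
            s->score = SCORE_MAX;
    }
    else
    {
        s->score -= SCORE_STEP_DOWN;
        if (s->score < 0)
            s->score = 0;
    }
}
```

The asymmetric steps are a design choice borrowed from the spin delay
logic: one success quickly restores the larger retry budget, while it
takes a long run of sleeps to fall back to a single retry.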
Links
1. https://akorotkov.github.io/blog/2021/04/30/arm/
------
Regards,
Alexander Korotkov