Em ter., 13 de jul. de 2021 às 14:42, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Em ter., 13 de jul. de 2021 às 09:44, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Em ter., 13 de jul. de 2021 às 09:24, David Rowley <dgrowleyml@gmail.com> escreveu:
On Wed, 14 Jul 2021 at 00:06, Ranier Vilela <ranier.vf@gmail.com> wrote: > > Em ter., 13 de jul. de 2021 às 04:19, Ronan Dunklau <ronan.dunklau@aiven.io> escreveu: >> I would be >> surprised the check adds that much to the whole execution though. > > I think this branch is a misprediction.
It could be. I wondered that myself when I saw Ronan's results were better than mine for 2,4 and 7. However, I think Ronan had quite a bit of noise in his results as there's no reason for the speedup in tests 2,4 and 7.
> In most cases is it not datumSort?
who knows. Maybe someone's workload always requires the datum sort.
> That's why I would like to use unlikely.
We really only use unlikely() in cases where we want to move code out of line to a cold area because it's really never executed under normal circumstances. We tend to do that for ERROR cases as we don't ever really want to optimise for errors. We also sometimes do it when some function has a branch to initialise something during the first call. The case in question here does not fit for either of those two cases.
Hum, I understand the usage cases now.
Thanks for the hint.
> IMO all the tests should all be to verify past behavior first.
I'm not quire sure what you mean there.
I'm saying we could help the branch by keeping the same testing logic as before and not reversing it. Attached is a version to demonstrate this, I don't pretend to be v7.
I couldn't find a good phrase to the contrary:
"are we *not* using the single value optimization ?"
I don't have time to take the tests right now.
Finally I had time to benchmark (David's benchsort.sh)
ubuntu 64 bits (20.04) 8gb ram SSD 256GB.
Table with the best results of each.
HEAD
v6
v7
v7b
v6 vs master
v7 vs v6
v7b vs v6
Test1
288,149636
449,018541
469,757169
550,48505
155,83%
104,62%
122,60%
Test2
94,766955
95,451406
94,556249
94,718982
100,72%
99,06%
99,23%
Test3
190,521319
260,279802
259,597067
278,115296
136,61%
99,74%
106,85%
Test4
78,779344
78,253455
78,114068
77,941482
99,33%
99,82%
99,60%
Test5
131,362614
142,662223
136,436347
149,639041
108,60%
95,64%
104,89%
Test6
112,884298
124,181671
115,528328
127,58497
110,01%
93,03%
102,74%
Test7
69,308587
68,643067
66,10195
69,087544
99,04%
96,30%
100,65%
Test8
243,674171
364,681142
371,928453
419,259703
149,66%
101,99%
114,97%
I have no idea why v7 failed with test6?
v6 slowdown with test4 and test7.
v7b slowdown with test2 and test4, in relation with v7.
v7b slowdown with test2 and test4, in relation with *v6*.
If field struct datumSort is not absolutely necessary, I think that v7 will be better.