On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote:
> Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> is now the winner on all my CPUs:
I thought it would be interesting to also measure the isolated effect
on just numeric_mul() without the query overhead.
I also included var1ndigits=5, var2ndigits=5, which should be unaffected,
just to get a sense of the noise level.
SELECT timeit.h('numeric_mul',array['9999','9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999','9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999','9999_9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999_9999','9999_9999_9999_9999'],2,min_time:='1 s'::interval);
SELECT timeit.h('numeric_mul',array['9999_9999_9999_9999_9999','9999_9999_9999_9999_9999'],2,min_time:='1 s'::interval);
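
For readers mapping the literals above to var1ndigits/var2ndigits: numeric stores values in base-10000 digits (NBASE = 10000, so 4 decimal digits per stored digit), which means each underscore-separated group of four 9s is one digit. Below is a minimal Python sketch of that counting, assuming positive integer literals only. The helper name `ndigits` is made up for illustration and is not a PostgreSQL function.

```python
DEC_DIGITS = 4  # decimal digits per base-10000 digit in PostgreSQL's numeric


def ndigits(literal: str) -> int:
    """Count base-10000 digits in a positive integer literal.

    Hypothetical helper for illustration; handles integers only
    (no sign, no decimal point). Underscores are digit separators.
    """
    decimal = literal.replace("_", "").lstrip("0")
    # Ceiling division: a partial leading group still needs a full digit.
    return -(-len(decimal) // DEC_DIGITS)


literals = [
    "9999",
    "9999_9999",
    "9999_9999_9999",
    "9999_9999_9999_9999",
    "9999_9999_9999_9999_9999",
]
for lit in literals:
    print(f"{lit}: ndigits={ndigits(lit)}")
```

So the five queries above cover ndigits 1 through 5 for both operands.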
         CPU          | var1ndigits | var2ndigits | HEAD  |  v7   | HEAD/v7
----------------------+-------------+-------------+-------+-------+---------
 Apple M3 Max         |           1 |           1 | 28 ns | 18 ns |    1.56
 Apple M3 Max         |           2 |           2 | 32 ns | 18 ns |    1.78
 Apple M3 Max         |           3 |           3 | 38 ns | 21 ns |    1.81
 Apple M3 Max         |           4 |           4 | 42 ns | 24 ns |    1.75
 Intel Core i9-14900K |           1 |           1 | 25 ns | 20 ns |    1.25
 Intel Core i9-14900K |           2 |           2 | 28 ns | 20 ns |    1.40
 Intel Core i9-14900K |           3 |           3 | 33 ns | 24 ns |    1.38
 Intel Core i9-14900K |           4 |           4 | 37 ns | 25 ns |    1.48
 AMD Ryzen 9 7950X3D  |           1 |           1 | 37 ns | 29 ns |    1.28
 AMD Ryzen 9 7950X3D  |           2 |           2 | 43 ns | 31 ns |    1.39
 AMD Ryzen 9 7950X3D  |           3 |           3 | 50 ns | 37 ns |    1.35
 AMD Ryzen 9 7950X3D  |           4 |           4 | 55 ns | 39 ns |    1.41
Impressive speed-up: between 25% and 81%, depending on CPU and operand size.
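
For anyone who wants to re-derive the HEAD/v7 column and the percentage range, here is a small Python sketch. It uses only the rounded nanosecond timings from the table, so it is a sanity check on the arithmetic and not a re-measurement. The variable names are my own.

```python
# (CPU, ndigits of both operands, HEAD ns, v7 ns), copied from the table.
rows = [
    ("Apple M3 Max", 1, 28, 18),
    ("Apple M3 Max", 2, 32, 18),
    ("Apple M3 Max", 3, 38, 21),
    ("Apple M3 Max", 4, 42, 24),
    ("Intel Core i9-14900K", 1, 25, 20),
    ("Intel Core i9-14900K", 2, 28, 20),
    ("Intel Core i9-14900K", 3, 33, 24),
    ("Intel Core i9-14900K", 4, 37, 25),
    ("AMD Ryzen 9 7950X3D", 1, 37, 29),
    ("AMD Ryzen 9 7950X3D", 2, 43, 31),
    ("AMD Ryzen 9 7950X3D", 3, 50, 37),
    ("AMD Ryzen 9 7950X3D", 4, 55, 39),
]

# Ratio > 1 means v7 is faster than HEAD.
ratios = [head_ns / v7_ns for _, _, head_ns, v7_ns in rows]

for (cpu, nd, head_ns, v7_ns), ratio in zip(rows, ratios):
    print(f"{cpu:<20} ndigits={nd}  HEAD/v7={ratio:.2f}")

# Express the extremes as percentage speed-ups: (ratio - 1) * 100.
min_pct = round((min(ratios) - 1) * 100)
max_pct = round((max(ratios) - 1) * 100)
print(f"speed-up range: {min_pct}% to {max_pct}%")
```

The smallest ratio is the i9-14900K at ndigits=1 and the largest is the M3 Max at ndigits=3.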
Regards,
Joel