Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Msg-id 8e909218-1965-4515-99e3-4bb5b625e004@app.fastmail.com
In response to Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
List pgsql-hackers
On Tue, Jul 2, 2024, at 00:19, Dean Rasheed wrote:
> I had a play with this, and came up with a slightly different way of
> doing it that works for var2 of any size, as long as var1 is just 1 or
> 2 digits.
>
> Repeating your benchmark where both numbers have up to 2 NBASE-digits,
> this new approach was slightly faster:
>
...
>
> (This was on an older Intel Core i9-9900K, so I'm not sure why all the
> timings are faster. What compiler settings are you using?)

Strange. I just did `./configure` with a --prefix.

Compiler settings on my Intel Core i9-14900K machine:

$ pg_config | grep -E '^(CC|CFLAGS|CPPFLAGS|LDFLAGS)'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O2
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/home/joel/pg-dev/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =

> The approach taken in this patch only uses 32-bit integers, so in
> theory it could be extended to work for var1ndigits = 3, 4, or even
> more, but the code would get increasingly complex, and I ran out of
> steam at 2 digits. It might be worth trying though.
>
> Regards,
> Dean
>
> Attachments:
> * optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.txt

Really nice!
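
For readers following along, the idea described above (32-bit arithmetic only, var1 limited to 1 or 2 digits, var2 of any length) can be sketched roughly as below. This is not the code from the patch: mul_small is a hypothetical name, and the sketch assumes NBASE = 10000, non-negative int16 digits stored most significant first, and a caller-provided result buffer of var1ndigits + var2ndigits digits. Sign, weight, and rounding handling are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NBASE 10000

/*
 * Hypothetical helper (not from the patch): multiply var1, which has 1 or 2
 * base-NBASE digits, by var2 of any length. Digits are most significant
 * first. res must have room for var1ndigits + var2ndigits digits.
 */
static void
mul_small(const int16_t *var1, int var1ndigits,
          const int16_t *var2, int var2ndigits,
          int16_t *res)
{
    uint32_t term;
    uint32_t carry = 0;
    int      i;

    assert(var1ndigits == 1 || var1ndigits == 2);
    assert(var2ndigits >= 1);

    if (var1ndigits == 1)
    {
        uint32_t d = (uint32_t) var1[0];

        /* One product per result digit: at most 9999 * 9999 + carry. */
        for (i = var2ndigits - 1; i >= 0; i--)
        {
            term = d * (uint32_t) var2[i] + carry;
            res[i + 1] = (int16_t) (term % NBASE);
            carry = term / NBASE;
        }
        res[0] = (int16_t) carry;
    }
    else
    {
        uint32_t hi = (uint32_t) var1[0];
        uint32_t lo = (uint32_t) var1[1];

        /* Least significant result digit has a single contribution. */
        term = lo * (uint32_t) var2[var2ndigits - 1];
        res[var2ndigits + 1] = (int16_t) (term % NBASE);
        carry = term / NBASE;

        /*
         * Middle digits: two products plus carry, which stays below
         * 2 * 9999 * 9999 + 20000 and so fits in 32 bits.
         */
        for (i = var2ndigits - 1; i >= 1; i--)
        {
            term = hi * (uint32_t) var2[i] + lo * (uint32_t) var2[i - 1] + carry;
            res[i + 1] = (int16_t) (term % NBASE);
            carry = term / NBASE;
        }

        /* The two most significant result digits. */
        term = hi * (uint32_t) var2[0] + carry;
        res[1] = (int16_t) (term % NBASE);
        res[0] = (int16_t) (term / NBASE);
    }
}
```

Each result digit is produced in a single pass with its carry already propagated, so no separate wide accumulator array or normalization pass is needed in this sketch. Going to 3 or more var1 digits would need a third product per term, and the head and tail special cases multiply, which fits the remark that the code gets increasingly complex.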

I've benchmarked your patch on my three machines, with great results.
I added a setseed() step to make the benchmarks reproducible. It
shouldn't matter much, since the differences should average out
statistically, but I thought why not.

CREATE TABLE bench_mul_var (num1 numeric, num2 numeric);
SELECT setseed(0.12345);
INSERT INTO bench_mul_var (num1, num2)
SELECT random(0::numeric,1e8::numeric), random(0::numeric,1e8::numeric) FROM generate_series(1,1e8);
\timing

/*
 * Apple M3 Max
 */

SELECT SUM(num1*num2) FROM bench_mul_var; -- HEAD
Time: 3622.342 ms (00:03.622)
Time: 3029.786 ms (00:03.030)
Time: 3046.497 ms (00:03.046)
Time: 3035.910 ms (00:03.036)
Time: 3034.073 ms (00:03.034)

SELECT SUM(num1*num2) FROM bench_mul_var; -- optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.txt
Time: 2484.685 ms (00:02.485)
Time: 2478.341 ms (00:02.478)
Time: 2494.397 ms (00:02.494)
Time: 2470.987 ms (00:02.471)
Time: 2490.215 ms (00:02.490)

/*
 * Intel Core i9-14900K
 */

SELECT SUM(num1*num2) FROM bench_mul_var; -- HEAD
Time: 2555.569 ms (00:02.556)
Time: 2523.145 ms (00:02.523)
Time: 2518.671 ms (00:02.519)
Time: 2514.501 ms (00:02.515)
Time: 2516.919 ms (00:02.517)

SELECT SUM(num1*num2) FROM bench_mul_var; -- optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.txt
Time: 2246.441 ms (00:02.246)
Time: 2243.900 ms (00:02.244)
Time: 2245.350 ms (00:02.245)
Time: 2245.080 ms (00:02.245)
Time: 2247.856 ms (00:02.248)

/*
 * AMD Ryzen 9 7950X3D
 */

SELECT SUM(num1*num2) FROM bench_mul_var; -- HEAD
Time: 3037.497 ms (00:03.037)
Time: 3010.037 ms (00:03.010)
Time: 3000.956 ms (00:03.001)
Time: 2989.424 ms (00:02.989)
Time: 2984.911 ms (00:02.985)

SELECT SUM(num1*num2) FROM bench_mul_var; -- optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.txt
Time: 2645.530 ms (00:02.646)
Time: 2640.472 ms (00:02.640)
Time: 2638.613 ms (00:02.639)
Time: 2637.889 ms (00:02.638)
Time: 2638.054 ms (00:02.638)
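
To compare the runs above without being thrown off by a slow first run (such as the first HEAD timing on the M3 Max), one option is to take the median of each set and express the difference as a percentage of HEAD. The helpers below are a minimal sketch with hypothetical names (median_ms, reduction_pct); the arrays simply copy the M3 Max timings from above, and choosing the median rather than the mean or minimum is an assumption about what makes a fair summary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* M3 Max timings in ms, copied from the runs above. */
static const double m3_head[]    = {3622.342, 3029.786, 3046.497, 3035.910, 3034.073};
static const double m3_patched[] = {2484.685, 2478.341, 2494.397, 2470.987, 2490.215};

/* qsort comparator for doubles, ascending. */
static int
cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;

    return (x > y) - (x < y);
}

/* Median of n timings (1 <= n <= 16); sorts a copy, not the caller's data. */
static double
median_ms(const double *t, int n)
{
    double tmp[16];

    assert(n >= 1 && n <= 16);
    memcpy(tmp, t, (size_t) n * sizeof(double));
    qsort(tmp, (size_t) n, sizeof(double), cmp_double);
    return (n % 2) ? tmp[n / 2] : (tmp[n / 2 - 1] + tmp[n / 2]) / 2.0;
}

/* Runtime reduction of the patched build, as a percentage of HEAD. */
static double
reduction_pct(double head_ms, double patched_ms)
{
    return 100.0 * (head_ms - patched_ms) / head_ms;
}
```

Calling reduction_pct(median_ms(m3_head, 5), median_ms(m3_patched, 5)) gives the M3 Max figure; the same two calls work for the Intel and AMD timings once they are placed in arrays of their own.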

/Joel


