Re: Improving and extending int128.h to more of numeric.c - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Improving and extending int128.h to more of numeric.c
Date
Msg-id CAEZATCWTqJN4Zg5xXffi7K3BU5i2SFoMnZJ9FHvzms7bPCnfrA@mail.gmail.com
Whole thread Raw
In response to Re: Improving and extending int128.h to more of numeric.c  (John Naylor <johncnaylorls@gmail.com>)
Responses Re: Improving and extending int128.h to more of numeric.c
List pgsql-hackers
On Wed, 16 Jul 2025 at 10:02, John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Mon, Jun 23, 2025 at 3:01 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> > 0005 is the main patch. It adds a few more functions to int128.h and
> > uses them in numeric.c to allow various functions (mainly aggregate
> > functions) to use 128-bit integers unconditionally on all platforms.
> > This applies to the following aggregates:
> >
> > - sum(int8)
> > - avg(int8)
> > - stddev_pop(int4)
> > - stddev_samp(int4)
> > - var_pop(int4)
> > - var_samp(int4)
>
> > Testing on a 32-bit system without native int128 support, I see
> > something like a 1.3-1.5x speedup in a couple of simple queries using
> > those aggregates.
>
> With v5, I don't see any difference from master when building on x86
> with -m32 for these queries:

Thanks for testing!

> select sum(i) from generate_series(1e10, 1e10+1e6, 1) i;
> select var_pop(i) from generate_series(1e9, 1e9+1e6, 1) i;

The patch won't make any difference to those because "i" is numeric in
those queries, and the patch doesn't touch sum(numeric) or
var_pop(numeric).

> Which queries were you testing?

I used the following 2 queries:

SELECT count(*), sum(x), avg(x)
  FROM generate_series(1::bigint, 10000000::bigint) g(x);

SELECT count(*), var_pop(x), var_samp(x), stddev_pop(x), stddev_samp(x)
  FROM generate_series(1::int, 10000000::int) g(x);

On 64-bit Linux with gcc 14.2 and native int128 support disabled I got
the following results:

Query 1:
  HEAD:  1404.096 ms
  Patch:  992.818 ms

Query 2:
  HEAD:  1498.949 ms
  Patch:  935.654 ms

And on a 32-bit Linux VM I got:

Query 1:
  HEAD:  2465.202 ms
  Patch: 1874.590 ms
Query 2:
  HEAD:  2491.991 ms
  Patch: 1682.992 ms

I didn't originally try "-m32" on 64-bit Linux because I wasn't sure
how realistic a test that would be, but doing that now I get:

Query 1:
  HEAD:  1830.652 ms
  Patch: 1411.438 ms
Query 2:
  HEAD:  1882.299 ms
  Patch: 1299.546 ms

> (Also, unrelated to the patch set, but I was surprised to find
> replacing the numeric expressions above with bigint ones
> (10_000_000_000 etc) makes the queries at least 5 times slower, and
> that's true with a normal 64-bit build as well.)

Hmm, are you sure? I don't see that. With -m32, I see:

select sum(i) from generate_series(1e10, 1e10+1e6, 1) i;
  HEAD:  204.774 ms
  Patch: 204.206 ms
(not expecting any difference)

select sum(i) from generate_series(10_000_000_000, 10_001_000_000, 1) i;
  HEAD:  187.426 ms
  Patch: 140.741 ms
(as expected, faster than the previous query in HEAD because bigint
generate_series should be faster than numeric generate_series, and
faster still with the sum(bigint) optimisations made by this patch)

select var_pop(i) from generate_series(1e9, 1e9+1e6, 1) i;
  HEAD:  228.386 ms
  Patch: 226.712 ms
(not expecting any difference)

select var_pop(i) from generate_series(10_000_000_000, 10_001_000_000, 1) i;
  HEAD:  211.749 ms
  Patch: 210.870 ms
(as expected, faster than previous query because of bigint
generate_series, but the patch makes no difference because it doesn't
touch var_pop(bigint))

And another query:

select sum(i::bigint) from generate_series(1e10, 1e10+1e6, 1) i;
  HEAD:  271.888 ms
  Patch: 227.898 ms
(as expected, slower than the pure numeric version in HEAD because of
the cast, while still using numeric in the aggregate, but with a
decent speedup from the patch, using INT128 in the aggregate)

Regards,
Dean



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: index prefetching
Next
From: Andres Freund
Date:
Subject: Re: index prefetching