Re: Slim down integer formatting - Mailing list pgsql-hackers

From David Fetter
Subject Re: Slim down integer formatting
Date
Msg-id 20210728022542.GM18391@fetter.org
In response to Re: Slim down integer formatting  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Slim down integer formatting  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Wed, Jul 28, 2021 at 01:17:43PM +1200, David Rowley wrote:
> On Wed, 28 Jul 2021 at 01:44, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > So how much faster is it than the original?
> 
> I only did some very quick tests.  They're a bit noisy. The results
> indicate an average speedup of 1.7%, but the noise level is above
> that, so unsure.
> 
> create table a (a int);
> insert into a select a from generate_series(1,1000000)a;
> vacuum freeze a;
> 
> bench.sql: copy a to '/dev/null';
> 
> master @ 93a0bf239
> drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 postgres
> latency average = 153.815 ms
> latency average = 152.955 ms
> latency average = 147.491 ms
> 
> master + v2 patch
> drowley@amd3990x:~$ pgbench -n -f bench.sql -T 60 postgres
> latency average = 144.749 ms
> latency average = 151.525 ms
> latency average = 150.392 ms

Thanks for testing this!  I got a few promising results early on with
-O0, and the technique seemed like a neat way to do things.

I generated a million int4s intended to be uniformly distributed
across the range of int4, and similarly across int8.

int4:
                   patch       6feebcb6b44631c3dc435e971bd80c2dd218a5ab
latency average:  362.149 ms     359.933 ms
latency stddev:      3.44 ms        3.40 ms

int8:
                   patch       6feebcb6b44631c3dc435e971bd80c2dd218a5ab
latency average:  434.944 ms     422.270 ms
latency stddev:      3.23 ms        4.02 ms

when compiled with -O2:

int4:
                   patch       6feebcb6b44631c3dc435e971bd80c2dd218a5ab
latency average:  167.262 ms     148.673 ms
latency stddev:      6.26 ms        1.28 ms

i.e. it was actually slower, at least over the 10 runs I did.

I assume that "uniform distribution across the range" is a bad case
for ints, but I was a little surprised to measure worse performance.
Interestingly, what I got for int8s generated to be uniform across
their range was

int8:
                   patch       6feebcb6b44631c3dc435e971bd80c2dd218a5ab
latency average:  171.737 ms     174.013 ms
latency stddev:      1.94 ms        6.84 ms

which doesn't look like a difference to me.
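
One rough way to put numbers on "slower" versus "no difference" is a
Welch-style comparison of the two means.  The sketch below is only
that, and it rests on two assumptions that are not stated above: that
the reported stddev is the spread across runs, and that each side had
10 runs.  The function name welch_t_squared is made up for
illustration.  It returns the squared statistic so that no sqrt() (and
no libm linkage) is needed.

```c
#include <assert.h>

/*
 * Squared Welch t statistic for two samples, each summarized by its
 * mean, standard deviation, and number of runs.
 *
 * ASSUMPTION: sd_a and sd_b are run-to-run standard deviations and
 * n_a and n_b are run counts.  The numbers quoted in this thread may
 * not have been produced that way.
 */
double
welch_t_squared(double mean_a, double sd_a, int n_a,
                double mean_b, double sd_b, int n_b)
{
    double diff = mean_a - mean_b;
    double var_of_diff;

    assert(n_a > 1 && n_b > 1);

    /* variance of the difference between the two sample means */
    var_of_diff = sd_a * sd_a / n_a + sd_b * sd_b / n_b;
    assert(var_of_diff > 0.0);

    return diff * diff / var_of_diff;
}
```

Under those assumptions, the -O2 int4 numbers give a value of roughly
85 (|t| around 9), while the -O2 int8 numbers give roughly 1.0 (|t|
around 1), which is consistent with reading the first as a real
slowdown and the second as noise.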

Intuitively, I'd expect us to get things in the neighborhood of 1 a
lot more often than things in the neighborhood of 1 << (30 or 60).  Do
we have some idea of the distribution, or at least of the distribution
family, that we should expect for ints?
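
To make the concern about the uniform test data concrete, here is a
small sketch that counts how many non-negative int4 values print with
a given number of decimal digits.  It is not taken from the patch, and
the function names count_with_digits and percent_with_digits are
invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of non-negative int32 values that print with exactly
 * `digits` decimal digits (1..10).  Zero counts as a 1-digit value.
 */
int64_t
count_with_digits(int digits)
{
    int64_t lo = 0;             /* smallest value with this many digits */
    int64_t hi = 9;             /* largest value with this many digits */
    int     i;

    assert(digits >= 1 && digits <= 10);

    for (i = 1; i < digits; i++)
    {
        lo = (lo == 0) ? 10 : lo * 10;
        hi = hi * 10 + 9;
    }

    /* 10-digit values stop at INT32_MAX, not at 9999999999 */
    if (hi > INT32_MAX)
        hi = INT32_MAX;

    return hi - lo + 1;
}

/*
 * Share, in whole percent rounded down, of non-negative int32 values
 * that have exactly `digits` decimal digits.
 */
int
percent_with_digits(int digits)
{
    return (int) (count_with_digits(digits) * 100 /
                  ((int64_t) INT32_MAX + 1));
}
```

With these definitions, percent_with_digits(10) is 53 and
percent_with_digits(9) is 41, so roughly 95% of values drawn uniformly
from the non-negative int4 range have 9 or 10 digits.  A uniform test
set therefore exercises almost exclusively the long-number path, which
is the opposite of a workload dominated by small values.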

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


