Re: pgbench - add pseudo-random permutation function - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: pgbench - add pseudo-random permutation function
Date
Msg-id alpine.DEB.2.21.1907230730150.7144@lancre
Whole thread Raw
In response to Re: pgbench - add pseudo-random permutation function  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: pgbench - add pseudo-random permutation function  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hello Thomas,

>>> Function nbits(), which was previously discussed, has been simplified by
>>> using the function pg_popcount64().
>
> Hi Fabien, Suzuki-san,
>
> I am not smart enough to commit this

I'm in no hurry:-)

> or judge its value for benchmarking,

I think that it is valuable given that we have non uniform random 
generators in pgbench.

I'm wondering about the modular_multiply manual implementation which adds 
quite a few lines of non trivial code. If int128 is available on all/most 
platforms, it could be removed and marked as not supported on these, 
although it would create an issue with non regression tests.

> but I have a few trivial comments on the language:
>
> +    It allows to mix the output of non uniform random functions so that
>
> "It allows the output of non-uniform random functions to be mixed so that"

Fixed.

> +    ensures that a perfect permutation is applied: there are no collisions
> +    nor holes in the output values.
>
> "neither collisions nor holes", or "no collisions or holes"

I choose the first.

> +    The function errors if size is not positive.
>
> "raises an error"

Fixed.

> + * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
>
> "24 bit mega primes"

Fixed.

> +/* length of n binary representation */
> +static int
> +nbits(uint64 n)
> +{
> +    /* set lower bits to 1 and count them */
> +    return pg_popcount64(compute_mask(n));
> +}
>
> I suppose you could use n == 0 ? 0 : pg_leftmost_one_pos64(n) + 1, and then...

It would create a branch, that I'm trying to avoid.

> +/* return smallest mask holding n  */
> +static uint64
> +compute_mask(uint64 n)
> +{
> +    n |= n >> 1;
> +    n |= n >> 2;
> +    n |= n >> 4;
> +    n |= n >> 8;
> +    n |= n >> 16;
> +    n |= n >> 32;
> +    return n;
> +}
>
> ... here you could use 1 << nbits(n)) - 1.  I have no idea if doing it
> that way around is better, it's just a thought and removes a few lines
> of bit-swizzling code.

This would create a infinite recursion as nbits currently uses 
compute_mask. The 6 bitfield operation above is pretty efficient, I'd let 
it at that.

Attached a v16.

-- 
Fabien.
Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add parallelism and glibc dependent only options to reindexdb
Next
From: Michael Paquier
Date:
Subject: Re: Race conditions with TAP test for syncrep