Re: pgbench - add pseudo-random permutation function - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: pgbench - add pseudo-random permutation function
Msg-id CAEZATCUMbvg0xOsi1+J1LCV0yz+r8MNESf4sEy++Z9V8GPEO3Q@mail.gmail.com
In response to Re: pgbench - add pseudo-random permutation function  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
On Wed, 31 Mar 2021 at 09:02, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
> >> First, I have a thing against erand48.
> >
> Also, a 64-bit seed is provided to the function, which instantly
> ignores 16 of those bits, which looks pretty silly to me.
>

Yeah, that was copied from set_random_seed().

> At least, I suggest that two 48-bit PRNGs could be initialized with parts
> of the seed and used in different places, e.g. for r & m.
>

That could work. I'd certainly feel better about that than
implementing a whole new PRNG.
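
To make the suggestion concrete, here is a minimal sketch of one way a
64-bit seed could be split into two 48-bit states of the 3 x 16-bit
form that erand48-style generators take. The function name split_seed()
and the choice of which bits go to which state are assumptions made for
illustration only; this is not code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): derive two 48-bit
 * erand48-style states from one 64-bit seed, so that every seed bit
 * feeds at least one generator.
 *
 * State a takes the low 48 bits (bits 0..47), state b takes the high
 * 48 bits (bits 16..63). The two states overlap in bits 16..47, which
 * is an arbitrary choice for this sketch.
 */
static void
split_seed(uint64_t seed, unsigned short a[3], unsigned short b[3])
{
    a[0] = (unsigned short) (seed & 0xFFFF);
    a[1] = (unsigned short) ((seed >> 16) & 0xFFFF);
    a[2] = (unsigned short) ((seed >> 32) & 0xFFFF);

    b[0] = (unsigned short) ((seed >> 16) & 0xFFFF);
    b[1] = (unsigned short) ((seed >> 32) & 0xFFFF);
    b[2] = (unsigned short) ((seed >> 48) & 0xFFFF);
}
```

Each array could then be handed to its own pg_erand48() call, one for r
and one for m, so that the top 16 bits of the seed are no longer
discarded.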

> Also, the seed could be used to adjust the rotation, maybe.
>

Perhaps. I'm not sure it's really necessary though.

> >> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
> >> values are kept out of STEERING. [...]
> >
> > Ah, that's a good point. Something else that also concerned me there was
> > that it might lead to 2 consecutive full shifts with nothing in between,
> > which would lead to less uniform randomness (like the Irwin-Hall
> > distribution). I just did a quick test without the first full shift, and
> > the results do appear to be better,
>
> Indeed, it makes sense to me.
>

OK, attached is an update making this change and simplifying the
rotate code, which hopefully just leaves the question of what (if
anything) to do with pg_erand48().

> >> Third, I think that the rotate code can be simplified, in particular
> >> the ?: should be avoided because it may induce branches quite damaging
> >> to processor performance.
> >
> > Yeah, I wondered about that. Perhaps there's a "trick" that can be
> > used to simplify it. Pre-computing the number of bits in the mask
> > would probably help.
>
> See pg_popcount64().
>

Actually, I used pg_leftmost_one_pos64() to calculate the mask length,
allowing the mask to be computed from that, so there is no longer a
need for compute_mask(), which seems like a neat little
simplification.
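
For readers without the patch to hand, the idea can be sketched roughly
as follows. This is an illustration under stated assumptions, not the
attached code: leftmost_one_pos64() is a portable stand-in for
pg_leftmost_one_pos64(), and mask_for_size() and rotate_in_mask() are
hypothetical names. It assumes size >= 2 and 0 <= r < masklen.

```c
#include <stdint.h>

/*
 * Portable stand-in for pg_leftmost_one_pos64(): returns the index
 * (0..63) of the most significant set bit. word must be nonzero.
 */
static int
leftmost_one_pos64(uint64_t word)
{
    int pos = 0;

    while (word >>= 1)
        pos++;
    return pos;
}

/*
 * Hypothetical helper: smallest all-ones mask covering 0..size-1,
 * also returning its length in bits. Requires size >= 2.
 */
static uint64_t
mask_for_size(uint64_t size, int *masklen)
{
    *masklen = leftmost_one_pos64(size - 1) + 1;
    /* Written this way so that masklen == 64 does not shift by 64. */
    return (((UINT64_C(1) << (*masklen - 1)) - 1) << 1) | 1;
}

/*
 * Hypothetical helper: rotate v left by r bits within a masklen-bit
 * field, without a conditional. Requires v <= mask, 0 <= r < masklen.
 * Splitting the right shift in two keeps every shift count below 64,
 * even when r == 0.
 */
static uint64_t
rotate_in_mask(uint64_t v, int r, int masklen, uint64_t mask)
{
    return ((v << r) | ((v >> 1) >> (masklen - 1 - r))) & mask;
}
```

For example, a size of 1000 gives a mask length of 10 and a mask of
1023, and rotating the 4-bit value 1001 left by one bit within a 4-bit
mask gives 0011.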

Regards,
Dean

