Re: pgbench - add pseudo-random permutation function - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: pgbench - add pseudo-random permutation function
Date
Msg-id alpine.DEB.2.22.394.2103310849380.411399@pseudo
Whole thread Raw
In response to Re: pgbench - add pseudo-random permutation function  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: pgbench - add pseudo-random permutation function
List pgsql-hackers
Hello Dean,

>> First, I have a thing against erand48.
>
> Yeah, that's probably a fair point. However, all the existing pgbench 
> random functions are using it, so I think it's fair enough for permute() 
> to do the same (and actually 2^48 is pretty huge). Switching to a 64-bit 
> PRNG might not be a bad idea, but I think that's something we'd want to 
> do across the board, and so I think it should be out of scope for this 
> patch.

But less likely to pass, whereas here we have an internal function that 
we can set as we want.

Also, there is a 64 bits seed provided to the function which instantly 
ignores 16 of them, which looks pretty silly to me.

Also, the function is named everywhere erand48 with its hardcoded int16[3] 
state, which makes a poor abstraction.

At least, I suggest that two 48-bits prng could be initialized with parts 
of the seed and used in different places, eg for r & m.

Also, the seed could be used to adjust the rotation, maybe.

>> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
>> values are kept out of STEERING. [...]
>
> Ah, that's a good point. Something else that also concerned me there was 
> that it might lead to 2 consecutive full shifts with nothing in between, 
> which would lead to less uniform randomness (like the Irwin-Hall 
> distribution). I just did a quick test without the first full shift, and 
> the results do appear to be better,

Indeed, it makes sense to me.

> so removing that looks like a good idea.

>> Third, I think that the rotate code can be simplified, in particular 
>> the ?: should be avoided because it may induce branches quite damaging 
>> to processor performance.
>
> Yeah, I wondered about that. Perhaps there's a "trick" that can be
> used to simplify it. Pre-computing the number of bits in the mask
> would probably help.

See pg_popcount64().

-- 
Fabien.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Proposal: Save user's original authenticated identity for logging
Next
From: Sait Talha Nisanci
Date:
Subject: Crash in record_type_typmod_compare