Re: rand48 replacement - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: rand48 replacement
Date
Msg-id alpine.DEB.2.22.394.2107041115130.2359988@pseudo
Whole thread Raw
In response to Re: rand48 replacement  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: rand48 replacement  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
Hello Dean,

>>   - moves the stuff to common and fully removes random/srandom (Tom)
>>   - includes a range generation function based on the bitmask method (Dean)
>>     but iterates with splitmix so that the state always advances once (Me)
>
> At the risk of repeating myself: do *not* invent your own scheme.

> The problem with iterating using splitmix is that splitmix is a simple
> shuffling function that takes a single input and returns a mutated
> output depending only on that input.

It also iterates over its 64 bits state in a round robin fashion so that 
the cycle size is 2^64 (it is a simple add).

> So let's say for simplicity that you're generating numbers in the range 
> [0,N) with N=2^64-n and n<2^63. Each of the n values in [N,2^64) that 
> lie outside the range wanted are just mapped in a deterministic way back 
> onto (at most) n values in the range [0,N), making those n values twice 
> as likely to be chosen as the other N-n values.

I do understand your point. If the value is outside the range, splitmix 
iterates over its seed and the extraction functions produces a new number 
which is tested again. I do not understand the "mapped back onto" part, 
the out of range value is just discarded and the loops starts over with a 
new derivation, and why it would imply that some values are more likely to 
come out.

> So what you've just invented is an algorithm with the complexity of the 
> unbiased bitmask method,

That is what I am trying to implement.

> but basically the same bias as the biased integer multiply method.

I did not understand.

> I don't understand why you object to advancing the state more than
> once. Doing so doesn't make the resulting sequence of numbers any less
> deterministic.

It does, somehow, hence my struggle to try to avoid it.

   call seed(0xdeadbeef);
   x1 = somepseudorand();
   x2 = somepseudorand();
   x3 = somepsuedorand();

I think we should want x3 to be the same result whatever the previous 
calls to the API.

> In fact, I'm pretty sure you *have to* advance the state more than
> once in *any* unbiased scheme. That's a common characteristic of all
> the unbiased methods I've seen, and intuitively, I think it has to be
> so.

Yes and no. We can advance another state seeded by the root prng.

> Otherwise, I'm happy with the use of the bitmask method, as long as
> it's implemented correctly.

I did not understand why it is not correct.

-- 
Fabien.



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Remove useless int64 range checks on BIGINT sequence MINVALUE/MAXVALUE values
Next
From: David Rowley
Date:
Subject: Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM