Re: rand48 replacement - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: rand48 replacement |
Date | |
Msg-id | alpine.DEB.2.22.394.2111271951200.223291@pseudo Whole thread Raw |
In response to | Re: rand48 replacement (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: rand48 replacement
|
List | pgsql-hackers |
Hello Tom, Thanks for the feedback. > +/* use Donald Knuth's LCG constants for default state */ > > How did Knuth get into this? This algorithm is certainly not his, > so why are those constants at all relevant? They are not more nor less relevant than any other "random" constant. The state needs a default initialization. The point of using DK's is that it is somehow cannot be some specially crafted value which would have some special property only know to the purveyor of the constant and could be used by them to break the algorithm. https://en.wikipedia.org/wiki/Dual_EC_DRBG > * I could do without the stream-of-consciousness notes in pg_prng.c. > I think what's appropriate is to say "We use thus-and-such a generator > which is documented here", maybe with a line or two about its properties. The stuff was really written essentially as a "why this" for the first patch, and to prevent questions about "why not this other generator" later, because it could never stop. > * Function names like these convey practically nothing to readers: > > +extern int64 pg_prng_i64(pg_prng_state *state); > +extern uint32 pg_prng_u32(pg_prng_state *state); > +extern int32 pg_prng_i32(pg_prng_state *state); > +extern double pg_prng_f64(pg_prng_state *state); > +extern bool pg_prng_bool(pg_prng_state *state); The intention is obviously "postgres pseudo-random number generator for <type>". ISTM that it conveys (1) that it is a postgres-specific stuff, (2) that it is a PRNG (which I find *MUCH* more informative than the misleading statement that something is random when it is not, and it is shorter) and (3) about the type it returns, because C does require functions to have distinct names. What would you suggest? > and these functions' header comments add a grand total of zero bits > of information. Yes, probably. I do not like not to comment at all on a function. > What someone generally wants to know first about a PRNG is (a) is it > uniform and (b) what is the range of outputs, neither of which are > specified anywhere. ISTM (b) is suggested thanks to the type and (a) I'm not sure about a PRNG which would claim not at least claim to be uniform. Non uniform PRNG are usually built based on a uniform one. What do you suggest as alternate names? > +#define FIRST_BIT_MASK UINT64CONST(0x8000000000000000) > +#define RIGHT_HALF_MASK UINT64CONST(0x00000000FFFFFFFF) > +#define DMANTISSA_MASK UINT64CONST(0x000FFFFFFFFFFFFF) > > I'm not sure that these named constants are any more readable than > writing the underlying constant, maybe less so --- in particular > I think something based on (1<<52)-1 would be more appropriate for > the float mantissa operations. We don't need RIGHT_HALF_MASK at > all, the casts to uint32 or int32 will accomplish that just fine. Yep. I did it for uniformity. > BTW, why are we bothering with FIRST_BIT_MASK in the first place, > rather than returning "v & 1" for pg_prng_bool? Because some PRNG are very bad in the low bits, not xoroshiro stuff, though. > Is xoroshiro128ss less random in the low-order bits than the higher? > If so, that would be a pretty important thing to document. If it's not, > we shouldn't make the code look like it is. Dunno. Why should we prefer low bits? > + * select in a range with bitmask rejection. > > What is "bitmask rejection"? Is it actually important to callers? No, it is important to understand how it does it. That is the name of the technique which is implemented, which helps if you want to understand what is going on by googling it. This point could be moved inside the function. > I think this should be documented more like "Produce a random > integer uniformly selected from the range [rmin, rmax)." Sure. -- Fabien.
pgsql-hackers by date: