Re: rand48 replacement - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: rand48 replacement |
Date | |
Msg-id | alpine.DEB.2.22.394.2107031643010.2359988@pseudo Whole thread Raw |
In response to | Re: rand48 replacement (Yura Sokolov <y.sokolov@postgrespro.ru>) |
List | pgsql-hackers |
Hello Yura, > 1. PostgreSQL source uses `uint64` and `uint32`, but not > `uint64_t`/`uint32_t` > 2. I don't see why pg_prng_state could not be `typedef uint64 > pg_prng_state[2];` It could, but I do not see that as desirable. From an API design point of view we want something clean and abstract, and for me a struct looks better for that. It would be a struct with an array insided, I think that the code is more readable by avoiding constant index accesses (s[0] vs s0), we do not need actual indexes. > 3. Then SamplerRandomState and pgbench RandomState could stay. > Patch will be a lot shorter. You mean "typedef pg_prng_state SamplerRandomState"? One point of the patch is to have "one" standard PRNG commonly used where one is needed, so I'd say we want the name to be used, hence the substitutions. Also, I have a thing against objects being named "Random" which are not random, which is highly misleading. A PRNG is purely deterministic. Removing misleading names is also a benefit. So If people want to keep the old name it can be done. But I see these name changes as desirable. > I don't like mix of semantic refactoring and syntactic refactoring in > the same patch. While I could agree with replacing `SamplerRandomState > => pg_prng_state`, I'd rather see it in separate commit. And that > separate commit could contain transition: `typedef uint64 > pg_prng_state[2];` => `typedef struct { uint64 s0, s1 } pg_prng_state;` I would tend to agree on principle, but separating in two phases here looks pointless: why implementing a cleaner rand48 interface, which would then NOT be the rand48 standard, just to upgrade it to something else in the next commit? And the other path is as painfull and pointless. So I think that the new feature better comes with its associated refactoring which is an integral part of it. > 4. There is no need in ReservoirStateData->randstate_initialized. There could > be macros/function: > `bool pg_prng_initiated(state) { return (state[0]|state[1]) != 0; }` It would work for this peculiar implementation but not necessary for others that we may want to substitute later, as it would mean either breaking the interface or adding a boolean in the structure if there is no special unintialized state that can be detected, which would impact memory usage and alignment. So I think it is better to keep it that way, usually the user knows whether its structure has been initialized, and the special case for reservoir where the user does not seem to know about that can handle its own boolean without impacting the common API or the data structure. > 5. Is there need for 128 bit prng at all? This is a 64 bits PRNG with a 128 bit state. We are generating 64 bits values, so we want a 64 bit PRNG. A prng state must be larger than its generated value, so we need sizeof(state) > 64 bits, hence at least 128 bits if we add 128 bits memory alignement. > And there is 4*32bit xoshiro128: > https://prng.di.unimi.it/xoshiro128plusplus.c > 32bit operations are faster on 32bit platforms. > But 32bit platforms are quite rare in production this days. > Therefore I don't have strong opinion on this. I think that 99.9% hardware running postgres is 64 bits, so 64 bits is the right choice. -- Fabien.
pgsql-hackers by date: