Re: gaussian distribution pgbench -- splits v4 - Mailing list pgsql-hackers

From Mitsumasa KONDO
Subject Re: gaussian distribution pgbench -- splits v4
Date
Msg-id CADupcHWoU=L+1zPY+WfxXH=VoofeZkMGVhDJpWD4nWiW2H8oSQ@mail.gmail.com
In response to Re: gaussian distribution pgbench -- splits v4  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hi,

2014-08-01 16:26 GMT+09:00 Fabien COELHO <coelho@cri.ensmp.fr>

Maybe somebody who knows more math than I do (like you, probably!) can come up with something more clever.

I can certainly suggest other formulas, but that does not mean beautiful code, so they would probably be rejected. I'll see.

An alternative to this whole process may be to hash/modulo a non-uniform random value.

      id = 1 + hash(some-random()) % n

But the hashing changes the distribution as it adds collisions, so I have to think about how to be able to control the distribution in that case, and what hash function to use.
I think we have to consider and select a reproducible method, because a benchmark always needs robust and reproducible results. And if we implement this idea, we might need a more accurate random generator, such as the Mersenne Twister algorithm. The erand48 algorithm is slow and not very accurate.
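
To make the hash/modulo idea and the reproducibility concern concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration and not code from the patch: the names mix64, rng_state, skewed_rank and scattered_id are hypothetical, the hash is the splitmix64 finalizer rather than anything pgbench uses, and squaring a uniform value merely stands in for a real gaussian or exponential draw. The generator keeps its state in an explicit struct, so the same seed always yields the same sequence of ids.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit mixer (splitmix64 finalizer); any good hash would do. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 30;
    x *= UINT64_C(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x *= UINT64_C(0x94d049bb133111eb);
    x ^= x >> 31;
    return x;
}

/* Explicit generator state: a fixed seed gives a reproducible sequence. */
typedef struct { uint64_t s; } rng_state;

static uint64_t rng_next(rng_state *r)
{
    r->s += UINT64_C(0x9e3779b97f4a7c15);   /* splitmix64 increment */
    return mix64(r->s);
}

/* Uniform double in [0, 1) built from the top 53 bits. */
static double rng_uniform(rng_state *r)
{
    return (double) (rng_next(r) >> 11) * (1.0 / 9007199254740992.0);
}

/* Stand-in for "some-random()": a rank in [0, n) skewed toward small values. */
static int64_t skewed_rank(rng_state *r, int64_t n)
{
    double  u;
    int64_t rank;

    assert(n > 0);
    u = rng_uniform(r);
    rank = (int64_t) (u * u * (double) n);
    if (rank >= n)              /* guard against floating-point rounding */
        rank = n - 1;
    return rank;
}

/* id = 1 + hash(some-random()) % n, scattering the hot ranks over 1..n. */
static int64_t scattered_id(rng_state *r, int64_t n)
{
    uint64_t rank = (uint64_t) skewed_rank(r, n);

    return 1 + (int64_t) (mix64(rank) % (uint64_t) n);
}
```

A caller would seed the state once, for example `rng_state r = { 42 };`, and then call `scattered_id(&r, n)` for each transaction. Note that `mix64(rank) % n` is not a bijection on the ranks, so distinct ranks can collide on the same id; that is the distortion of the distribution mentioned above, and this sketch does nothing to correct it.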

By the way, I don't see how this topic relates to the command-line option... Well, whatever...

Regards,
--
Mitsumasa KONDO
