Thread: random() in multi-threaded pgbench
While testing the pgbench setshell command patch with the -j option,
I found that all threads use the same sequence of random values.

At first I thought we needed to call srandom() in each thread,
but the manual says we should use random_r() instead of random()
in multi-threaded programs.
http://www.kernel.org/doc/man-pages/online/pages/man3/random_r.3.html

Should we replace random() with random_r()? Some configure test might be needed.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:
> While testing the pgbench setshell command patch with -j option,
> I found all threads use the same sequence of random value.

Were they actually threads, or were you testing the code while it had
the broken configure script that didn't set ENABLE_THREAD_SAFETY?
I think you might have hit the same thing I just ran into, that in
a *non-threaded* build each subprocess will generate the same random
sequence.

> At first, I think we need to call srandom() in each thread,

Each sub-job I think.

> but the manual says we should use random_r() instead of random()
> on multi-threaded programs.
> http://www.kernel.org/doc/man-pages/online/pages/man3/random_r.3.html

It only says that you need those if you want an *independent* random
sequence for each thread. pgbench never had that before and I doubt
we need it now. In any case, the same page also says these are a
glibc-ism not a standard API, so we can't really rely on them.

regards, tom lane

I wrote:
> Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:
>> http://www.kernel.org/doc/man-pages/online/pages/man3/random_r.3.html

> It only says that you need those if you want an *independent* random
> sequence for each thread. pgbench never had that before and I doubt
> we need it now. In any case, the same page also says these are a
> glibc-ism not a standard API, so we can't really rely on them.

... although if anyone was sufficiently excited about this, we could
make use of erand48() without adding any new platform dependency,
since GEQO is already relying on that. I think there's not much point
though. We'd just be moving the nearest point of failure to the seeding
algorithm --- careless seeding could still result in duplicate sequences
for different threads.

regards, tom lane