Re: random() generates collisions too early - Mailing list pgsql-bugs
From | Joe Van Dyk |
---|---|
Subject | Re: random() generates collisions too early |
Date | |
Msg-id | CACfv+pJ--Ky6TAmtNLke7ZqSZ8rE2t6ugTbH0d9=M=7io3Wwrg@mail.gmail.com Whole thread Raw |
In response to | random() generates collisions too early (Honza Horak <hhorak@redhat.com>) |
List | pgsql-bugs |
Oddly enough, I'm debugging a problem with a function that uses random() and that is now generating collisions after I upgraded to 9.3. CREATE OR REPLACE FUNCTION public.random_characters(length integer) RETURNS text LANGUAGE sql AS $function$ SELECT array_to_string(array(( SELECT SUBSTRING('abcdefghjkmnpqrstuvwxyz23456789' FROM mod((random()*31)::int, 31)+1 FOR 1) FROM generate_series(1, $1))),''); $function$; It's being used in a column definition like: citext unique not null default upper(random_characters(8)) I can't make a self-contained reproducible test case now, I've tried but no luck yet. In the actual app, I'm getting collisions after only a couple hundred rows are inserted. I never had a problem with < 9.3. The rows are being inserted in a trigger, if that matters. Joe On Fri, Oct 18, 2013 at 4:55 AM, Honza Horak <hhorak@redhat.com> wrote: > Hi guys, > > after playing a bit with "select random();", it turned out that numbers > get repeated quite early in a sequence. Originally I set lower PID range > (echo "2048" >/proc/sys/kernel/pid_max), but it doesn't seem to affect the > results. > > So, what I observed... First, I generated a set including 1000 randomly > generated numbers without setting a seed. > > touch numbers > for i in {1..1000} ; do > echo "select random();"|psql|head -n 3|tail -n 1 >>numbers > done > > Then, I continued in generating random numbers and tried to find the new > one in the set: > > for i in {1..10000} ; do > if grep `echo "select random();"|psql|head -n 3|tail -n 1` numbers ; > then > echo "SUCCESS: $i" ; break > fi > done > > To my surprise I'm able to find a collision very quickly, in first 1000 > numbers usually. > > Originally, I used psql calls to get random() value from different > processes on purpose, but it seems Noah got similar results when random() > is called in one process: > > On 10/18/2013 02:10 AM, Noah Misch wrote: > > sudo sysctl -w kernel.pid_max=2048 > > psql -c 'create unlogged table samp(c float8)' > > for n in `seq 1 200000`; do psql -qc 'insert into samp values > (random())'; done > > > > The results covered only 181383 distinct values, and 68 values repeated > four > > or five times each. We should at least consider using a higher-entropy > seed. > > As I was told this is not taken as a security issue, since random() is not > considered as a CSPRNG in any case, but as Noah said, we should probably > try to make it a bit better. > > Also, I'd suggest to state explicitly in the doc, that random() shouldn't > be taken as CSPRNG, since I can imagine people blindly believing that > random() can be good enough for such use cases, just because they see how > many possible values they get from double-precision type: > http://www.postgresql.org/**docs/9.3/static/functions-**math.html<http://www.postgresql.org/docs/9.3/static/functions-math.html> > > Regards, > Honza > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/**mailpref/pgsql-bugs<http://www.postgresql.org/mailpref/pgsql-bugs> >
pgsql-bugs by date: