Re: Generating random unique alphanumeric IDs - Mailing list pgsql-general

From Sam Mason
Subject Re: Generating random unique alphanumeric IDs
Date
Msg-id 20090816122526.GW5407@samason.me.uk
Whole thread Raw
In response to Re: Generating random unique alphanumeric IDs  (Thom Brown <thombrown@gmail.com>)
Responses Re: Generating random unique alphanumeric IDs
List pgsql-general
On Sun, Aug 16, 2009 at 12:57:34PM +0100, Thom Brown wrote:
> >  SELECT array_to_string(array((
> >    SELECT SUBSTRING('abcdefghjklmnpqrstuvwxyz23456789'
> >                     FROM mod((random()*32)::int, 32)+1 FOR 1)
> >    FROM generate_series(1,5))),'');

I've just had a look and PG does actually seem to be returning values as
I'd expect, i.e. 0 <= n < 1.  So the following would work:

  floor(random()*32)::int+1

instead of the mod hack.  Distribution looks reasonably flat (this is
good):

char  %occurances
   1  3.1222
   2  3.1329
   3  3.1269
   4  3.1236
   5  3.1233
   6  3.1310
   7  3.1226
   8  3.1298
   9  3.1229
  10  3.1294
  11  3.1192
  12  3.1249
  13  3.1267
  14  3.1236
  15  3.1190
  16  3.1279
  17  3.1232
  18  3.1218
  19  3.1314
  20  3.1091
  21  3.1337
  22  3.1239
  23  3.1184
  24  3.1347
  25  3.1205
  26  3.1160
  27  3.1219
  28  3.1344
  29  3.1118
  30  3.1256
  31  3.1408
  32  3.1255

> I'm not sure why you're saying that there's a 50%
> chance of duplication after 7240 values though.  With 33 million
> combinations, I would have thought that duplications would become equally
> likely at the 16,777,216 mark.

No, that's why I pointed out birthday attacks---collisions happen much
more often than you'd expect.  Get 23 people in a room and you have a
50% chance of two people having the same birthday--not 150 people.  This
is why it's called the birthday attack and it's one of the basic tests
for hash functions--any bias in their output will shrink this number
even further.

--
  Sam  http://samason.me.uk/

pgsql-general by date:

Previous
From: Ivan Sergio Borgonovo
Date:
Subject: Re: Generating random unique alphanumeric IDs
Next
From: NTPT
Date:
Subject: Rapid Seek Devices (feature request)