Re: Random-looking primary keys in the range 100000..999999 - Mailing list pgsql-general

From Gavin Flower
Subject Re: Random-looking primary keys in the range 100000..999999
Date
Msg-id 53B778DF.6010604@archidevsys.co.nz
Whole thread Raw
In response to Re: Random-looking primary keys in the range 100000..999999  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
List pgsql-general
On 05/07/14 15:48, Gavin Flower wrote:
On 05/07/14 01:24, Kynn Jones wrote:
I'm looking for a way to implement pseudorandom primary keys in the range 100000..999999.

The randomization scheme does not need to be cryptographically strong.  As long as it is not easy to figure out in a few minutes it's good enough.

My starting point for this is the following earlier message to this list:

http://www.postgresql.org/message-id/49F96730.4000706@postnewspapers.com.au

The answer given to it here

http://www.postgresql.org/message-id/448163db-cac5-4e99-8c4c-57cbc6f6af78@mm

...is really cool, but I don't see how to modify it for the case where the domain of the permutation has a cardinality that is not a power of 2, as it is in my case (cardinality = 900000).

---

(In the crypto world there are "format preserving encryption" techniques that probably could do what I want to do, but their focus on cryptographic strength makes learning and implementing them tough going, plus, the performance will probably be poor, since high workloads are an asset for such crypto applications.  Since cryptographic strength is not something I need, I'm trying to find non-crypt-grade alternatives.)

Thanks in advance!

kynn

Hi Kynn,

How about  (note that 'payload' could be any set of valid columns):

-- using a crude Linear Congruential Generator
-- not very random, but does NOT create duplicates


DROP TABLE IF EXISTS rtab;
DROP SEQUENCE IF EXISTS rseq;

CREATE SEQUENCE rseq;

CREATE TABLE rtab
(
    id       int PRIMARY KEY default(100000 + (nextval('rseq') * 543537 + 997) % 900000),
    payload  int NOT NULL
);

INSERT INTO rtab (payload) VALUES (generate_series(1, 100000));

TABLE rtab;
Sample output:

   id   | payload
--------+---------
 644534 |       1
 288071 |       2
 831608 |       3
 475145 |       4
 118682 |       5
 662219 |       6
 305756 |       7
 849293 |       8
 492830 |       9
 136367 |      10
 679904 |      11
 323441 |      12
 866978 |      13
 510515 |      14
 154052 |      15
 697589 |      16
 341126 |      17
 884663 |      18
 528200 |      19
 171737 |      20


Cheers,
Gavin
Hmm...

for a 10 times larger range
    id       int PRIMARY KEY default(1000000 + (nextval('rseq') * 543537 + 997) % 9000000),
also works!

pgsql-general by date:

Previous
From: Gavin Flower
Date:
Subject: Re: Random-looking primary keys in the range 100000..999999
Next
From: Tom Lane
Date:
Subject: Re: pg_dump slower than pg_restore