Re: Create on insert a unique random number - Mailing list pgsql-sql

From Steve Midgley
Subject Re: Create on insert a unique random number
Date
Msg-id 20080319152835.8835E2E002C@postgresql.org
Whole thread Raw
In response to Re: Create on insert a unique random number  ("D'Arcy J.M. Cain" <darcy@druid.net>)
Responses Re: Create on insert a unique random number
List pgsql-sql
At 06:47 AM 3/19/2008, D'Arcy J.M. Cain wrote:
>But your suggestion was to base this key on the serial primary key so
>where is your index collision protection?  You are going to get
>collisions on both the serial key and, to a lesser extent, your
>generated one. Besides, has anyone ever demonstrated a real issue with
>lookups using serial primary keys? I think you are trying to second
>guess the database engine with this and I don't think that that is a
>great idea.
Hi D'Arcy,

I'm not following this line. Maybe we're talking about two different 
things here.. I don't know if Lance is using "CRUD" methodology per se, 
but that's a well accepted web approach and uses (generally) serial 
primary keys in the URL structure as (where numbers are serial pk's):

[website]/contact/12345
[website]/property/45678  [and the client sends GET, POST, PUT, DELETE http requests, or 
mimics, to activate various functions]

Whether CRUD of otherwise, in the model I was promoting, there would be 
two index columns in the table along with other data, a public index 
and a serial primary key. The public index is based on the primary key:

pk | public_pk
1  | md5(1 + fixed salt)
2  | md5(2 + fixed salt)
...

AFAIK, an MD5 hash is guaranteed to generate a unique output for any 
unique input, so the serial key and fixed salt would guarantee no hash 
index collisions on the MD5 output. Of course if a competitor knows 
you're using MD5 and they know your salt, they could calculate all the 
md5 integer hashes and see which ones exist..

But I could care less if he uses md5 or sha-1 or Guids! (I just picked 
MD5 because another poster recommended it and it's very easy to 
implement in Pg). The point I care about is that there would be a 
public_pk that associates to one-and-only-one serial pk. Also that 
public_pk should be 1) not easily guessable, 2) non-clustering (and 
therefore non-serial). Then his url's would look like something like:

[website]/contact/c4ca4238a0b923820dcc509a6f75849b
[website]/property/c81e728d9d4c2f636f067f89cc14862c

> > The issue is about creating an index into a sparse hash so that 
> each
> > record is somewhat randomly located in a sparse hash "index space". 
>
> > (One valid reason to do this would be if you wanted to hide the 
> total
> > number of records in your table from competitors or customers). 
> (Just
>
>If that is your goal then start your serial at something other than 1.
>Start at 1,000,000 for example and your first user will think that
>you already have one million clients.  Actually, he will think that
>you started elsewhere than 1 but he won't know where.

The original post did not want users to be able to type in random 
integers like:

/contact/343

And find out if that record #343 exists or not (regardless of whether 
they can get access to the record - the error generated on 
no-authorization may be different from record-not-found). So starting 
at a million does not fix the OP's issue.
From my perspective, wherever you start your serial index, competitors 
can watch it grow over time, if it's a numeric serial. That could be 
more valuable in many businesses than knowing the initial size of the 
table.

Anyway, I hope that clears up what I was recommending! I didn't 
anticipate it would stir up this much analysis and I hope the OP finds 
your input and mine useful in coming up with a final answer to his 
issue. Thanks for taking the time to consider the issue and I'll look 
forward to any additional ideas or comments you have on this too!

Sincerely,

Steve



pgsql-sql by date:

Previous
From: "D'Arcy J.M. Cain"
Date:
Subject: Re: Create on insert a unique random number
Next
From: "D'Arcy J.M. Cain"
Date:
Subject: Re: Create on insert a unique random number