Thread: HOWTO - Random character generation for primary key

HOWTO - Random character generation for primary key

From
Jean-Michel POURE
Date:
Le Jeudi 2 Mai 2002 04:01, Alan Wayne a écrit :
> In foxpro, I've been using the provided 10-character
> random generator to provide for the primary keys on my
> data files. As far as I can tell, PostgreSQL only uses
> an auto-incrementing serial field for its primary
> keys. So as far as I can see, I'm left with the
> options of somehow converting all my existing records
> (with their relationships intact) to some form of
> integer key, or providing a similiar random character
> generator from PostgreSQL. So what to do????

Dear Alan,

Do not hesitate to CC pgsql-general@postgresql.org and pgsql-admin so that
anyone can participate.

I don't know if plain SQL supports loops, I guess not (?). Writing a primary
key generator is quite easy under PostgreSQL using PLpgSQL. So let's go for
PLpgSQL:

First of all, enable PLpgSQL in your database:

Under postgresql user:
postgres@locahost>createlang plpgsql database_name

Then add this PLpgSQL script to your database:

CREATE FUNCTION "random_string"("int4") RETURNS "varchar" AS '
DECLARE
iLoop int4;
result varchar;

BEGIN
result = '''';
IF ($1>0) AND ($1 < 255) THEN
  FOR iLoop in 1 .. $1 LOOP
    result = result || chr(int4(random()*26)+65);
  END LOOP;
  RETURN result;
ELSE
  RETURN ''f'';
END IF;
END;
'  LANGUAGE 'plpgsql';

chr() is multi-byte safe. This means it will work for an UTF-8 or ASCII
database. Not all PostgreSQL system functions are commented and described. A
good way to learn how to use these functions is to run pgAdmin2
(http://pgadmin.postgresql.org) and choose the "'display system objects"
option. Each function has a small comment which provides a small description.

PLpgSQL syntax is close to Pascal and SQL. More information can be found on:
http://www.postgresql.org/idocs/index.php?programmer-pl.html

PostgreSQL advantage compared to other open-source systems, like MySQL for
example, is the ability to write server-side code.

It is highliy recommanded to migrate some of your existing Foxpro / Access
code to PostgreSQL server-side, for the following reasons:

- a server-side application has a better transactional behavior (ex: perform
complex actions when you add / drop / update a table within transactions).

- a PostgreSQL server-side application is ***much faster*** than a flat table
application (PostgreSQL is able to process complex scripts involving several
tables in ONE client-side query. In some situations, it can boost the speed
by a 10 to 100 factor).

After migration, your Foxpro / Access application should look like a simple
data "viewer" or "browser" application. Furthermore, it allows you to add a
Web interface (ex: PHP) quite easily because, again, PHP will be used for
data viewing, not data processing.

If you need more information, do not hesitate to get back to me and/or post
comments on the list.

Cheers,
Jean-Michel

Re: HOWTO - Random character generation for primary key

From
postgres@vrane.com
Date:
> Le Jeudi 2 Mai 2002 04:01, Alan Wayne a écrit :
> > data files. As far as I can tell, PostgreSQL only uses
> > an auto-incrementing serial field for its primary
> > keys. So as far as I can see, I'm left with the

Untrue.  I am using email addresses
as primary keys in a table.  Where
did you get that information?


Re: HOWTO - Random character generation for primary key

From
Jean-Michel POURE
Date:
Le Jeudi 2 Mai 2002 11:30, postgres@vrane.com a écrit :
> Untrue.  I am using email addresses
> as primary keys in a table.  Where
> did you get that information?

Sorry, I did not mean that of course. I posted this small howto in reply to a
mail. It shows how to create a VACHAR PRIMARY KEY with unique random values.

On 10 character lenght, there is approximatively one chance out of 26^10 to
have a similar value. If you add a string timestamp, like "20020501_1153_" in
front of the random value, it is very likely that this value will be UNIQUE
in the word:

Examples:
- 20020501_1153_AHFYRIDKRN
- 20020501_1154_JDIFTPWNEJ

I would like to write a small HOWTOs about :
- Software optimization and performance (not pseudo-hardware / memory /
etc..),
- PLpgSQL and PLpython.

Therefore, I try to send small HOWTOs on the list whenever I can answer
someone. Any help welcome.

Best regards,
Jean-Michel POURE

Re: HOWTO - Random character generation for primary key

From
Jean-Michel POURE
Date:
Le Jeudi 2 Mai 2002 12:13, postgres@vrane.com a écrit :
> > Le Jeudi 2 Mai 2002 11:30, postgres@vrane.com a écrit :
> > > Untrue.  I am using email addresses
> > > as primary keys in a table.  Where
> > > did you get that information?
> >
> > Sorry, I did not mean that of course. I posted this small howto in reply
> > to a mail. It shows how to create a VACHAR PRIMARY KEY with unique random
> > values.
> > On 10 character lenght, there is approximatively one chance out of 26^10
> > to have a similar value. If you add a string timestamp, like
> > "20020501_1153_" in front of the random value, it is very likely that
> > this value will be UNIQUE in the word:
>
> In the real world "very likely" is not good enough.  Can you gurantee
> 'uniqueness'?

A timestamp (ex: 20020501_1258) followed by PLpgSQL function
random_string(200) value should provide a unique signature. What do you
think? Alternatively, could we use the crypto package to generate a unique
signature?

Cheers,
Jean-Michel POURE

Re: HOWTO - Random character generation for primary key

From
Bruno Wolff III
Date:
On Thu, May 02, 2002 at 01:01:22PM +0200,
  Jean-Michel POURE <jm.poure@freesurf.fr> wrote:
>
> A timestamp (ex: 20020501_1258) followed by PLpgSQL function
> random_string(200) value should provide a unique signature. What do you
> think? Alternatively, could we use the crypto package to generate a unique
> signature?

Well what are you really trying to do? Presumably you have some reason(s)
for not wanting to use sequences, but without knowing what the reason(s)
are it is hard to give you good advice.

Re: HOWTO - Random character generation for primary key

From
Jean-Michel POURE
Date:
Dear Alan,

When you write me, please CC me on pgsql-general@postgresql.org so that anyone
can participate.

> Question: What would happen if I did the following:
>    1. used some variant of "alter table" to change the
> character field primary key to a field of type
> 'serial'? i.e., would the binary form of the current
> 10 length characters be preserved as some kind of
> integer?

In PostgreSQL, serial values are int4 auto-increment values. Therefore, there
is no easy way to migrate your 10 characters long primary keys.

By the way PostgreSQL does not support type promotion <-> demotion. You will
have to wait for 7.3 or later to convert column types. For example, you
cannot change an in4 into an int8, a varchar into a text column.

Presently the solution is to add an int4 field to your tables ... and fill
them with incremental values ... and update sequence values by hand.

When your done, rename your tables with '_old', recreate them without 10
characters primary keys and fill them with data.

This should be easy in pgAdmin2 because you can copy table definition
and paste it in the execution window. pgAdmin2 also gives you access to
sequences.

>   2. If postgresql does allow me to change the primary
> key field from character to type serial (i.e.,
> integer) in the first table, what will the referential
> integrity rules do to the other tables that use the
> first table's primary key as a secondary key? i.e.,
> does postgre preceive the change of data-type as a
> change to be echoed to the referencing tables via the
> referential integrity rules? Furthermore, would this
> recognition only be on newly added records, or on the
> records already in the tables? e.g., if no change is
> detected until a new record is added, could I go
> through the database one table at a time and change
> the referencing fields to type integer to match the
> change in the primary key? (Needless to say, I can't
> alter the keys without all the related records being
> changed too--or I lose my relationships).
> The current records are using pure characters of 10
> byte length. If postgre could accept the above changes
> before adding new records, then could the simple type
> 'serial' be used without having the default produced
> integers on new records clobber the existing
> 'characters' ?

If you don't want to migrate, set primary key column default value to
random_string(10). See my previous HOWTO.

Please note this is not a very standard way to proceed. In a profesionnal
environment, you should use integer primary keys.

Example :
CREATE TABLE foo (
foo_oid serial,
foo_name varchar(254),
foo_text text)
WITH OIDS;

is better than

CREATE TABLE bar (
bar_key char(10) random_string(10),
bar_name varchar(254),
bar_text text)
WITH OIDS;

Re: HOWTO - Random character generation for primary key

From
Alan Wayne
Date:
Hi!
So ultimately, the better way would be to
bite-the-bullet and work towards replacing the current
character keys with int4 keys?

(This will be quite time consuming on several million
records and about 35 tables with referential integrity
rules--a quick and dirty way would be appreciated.)

Cheers,
Alan

--- Jean-Michel POURE <jm.poure@freesurf.fr> wrote:
> Dear Alan,
>
> When you write me, please CC me on
> pgsql-general@postgresql.org so that anyone
> can participate.
>
> > Question: What would happen if I did the
> following:
> >    1. used some variant of "alter table" to change
> the
> > character field primary key to a field of type
> > 'serial'? i.e., would the binary form of the
> current
> > 10 length characters be preserved as some kind of
> > integer?
>
> In PostgreSQL, serial values are int4 auto-increment
> values. Therefore, there
> is no easy way to migrate your 10 characters long
> primary keys.
>
> By the way PostgreSQL does not support type
> promotion <-> demotion. You will
> have to wait for 7.3 or later to convert column
> types. For example, you
> cannot change an in4 into an int8, a varchar into a
> text column.
>
> Presently the solution is to add an int4 field to
> your tables ... and fill
> them with incremental values ... and update sequence
> values by hand.
>
> When your done, rename your tables with '_old',
> recreate them without 10
> characters primary keys and fill them with data.
>
> This should be easy in pgAdmin2 because you can copy
> table definition
> and paste it in the execution window. pgAdmin2 also
> gives you access to
> sequences.
>
> >   2. If postgresql does allow me to change the
> primary
> > key field from character to type serial (i.e.,
> > integer) in the first table, what will the
> referential
> > integrity rules do to the other tables that use
> the
> > first table's primary key as a secondary key?
> i.e.,
> > does postgre preceive the change of data-type as a
> > change to be echoed to the referencing tables via
> the
> > referential integrity rules? Furthermore, would
> this
> > recognition only be on newly added records, or on
> the
> > records already in the tables? e.g., if no change
> is
> > detected until a new record is added, could I go
> > through the database one table at a time and
> change
> > the referencing fields to type integer to match
> the
> > change in the primary key? (Needless to say, I
> can't
> > alter the keys without all the related records
> being
> > changed too--or I lose my relationships).
> > The current records are using pure characters of
> 10
> > byte length. If postgre could accept the above
> changes
> > before adding new records, then could the simple
> type
> > 'serial' be used without having the default
> produced
> > integers on new records clobber the existing
> > 'characters' ?
>
> If you don't want to migrate, set primary key column
> default value to
> random_string(10). See my previous HOWTO.
>
> Please note this is not a very standard way to
> proceed. In a profesionnal
> environment, you should use integer primary keys.
>
> Example :
> CREATE TABLE foo (
> foo_oid serial,
> foo_name varchar(254),
> foo_text text)
> WITH OIDS;
>
> is better than
>
> CREATE TABLE bar (
> bar_key char(10) random_string(10),
> bar_name varchar(254),
> bar_text text)
> WITH OIDS;


__________________________________________________
Do You Yahoo!?
Yahoo! Health - your guide to health and wellness
http://health.yahoo.com

Re: HOWTO - Random character generation for primary key

From
Lincoln Yeoh
Date:
What are you trying to achieve? There doesn't seem to be enough information
to help you properly.

a) What do your 10 character keys look like? visible ASCII? legal Base64
charset?
b) How were they generated previously?
c) Why do you think they need to look that way?
d) Can you really change your app to use ints?

Regards,
Link.

At 12:09 PM 5/4/02 -0700, Alan Wayne wrote:
>Hi!
>So ultimately, the better way would be to
>bite-the-bullet and work towards replacing the current
>character keys with int4 keys?
>
>(This will be quite time consuming on several million
>records and about 35 tables with referential integrity
>rules--a quick and dirty way would be appreciated.)
>
>Cheers,
>Alan
>
>--- Jean-Michel POURE <jm.poure@freesurf.fr> wrote:
> > Dear Alan,
> >
> > When you write me, please CC me on
> > pgsql-general@postgresql.org so that anyone
> > can participate.
> >
> > > Question: What would happen if I did the
> > following:
> > >    1. used some variant of "alter table" to change
> > the
> > > character field primary key to a field of type
> > > 'serial'? i.e., would the binary form of the
> > current
> > > 10 length characters be preserved as some kind of
> > > integer?
> >
> > In PostgreSQL, serial values are int4 auto-increment
> > values. Therefore, there
> > is no easy way to migrate your 10 characters long
> > primary keys.
> >
> > By the way PostgreSQL does not support type
> > promotion <-> demotion. You will
> > have to wait for 7.3 or later to convert column
> > types. For example, you
> > cannot change an in4 into an int8, a varchar into a
> > text column.
> >
> > Presently the solution is to add an int4 field to
> > your tables ... and fill
> > them with incremental values ... and update sequence
> > values by hand.
> >
> > When your done, rename your tables with '_old',
> > recreate them without 10
> > characters primary keys and fill them with data.
> >
> > This should be easy in pgAdmin2 because you can copy
> > table definition
> > and paste it in the execution window. pgAdmin2 also
> > gives you access to
> > sequences.
> >
> > >   2. If postgresql does allow me to change the
> > primary
> > > key field from character to type serial (i.e.,
> > > integer) in the first table, what will the
> > referential
> > > integrity rules do to the other tables that use
> > the
> > > first table's primary key as a secondary key?
> > i.e.,
> > > does postgre preceive the change of data-type as a
> > > change to be echoed to the referencing tables via
> > the
> > > referential integrity rules? Furthermore, would
> > this
> > > recognition only be on newly added records, or on
> > the
> > > records already in the tables? e.g., if no change
> > is
> > > detected until a new record is added, could I go
> > > through the database one table at a time and
> > change
> > > the referencing fields to type integer to match
> > the
> > > change in the primary key? (Needless to say, I
> > can't
> > > alter the keys without all the related records
> > being
> > > changed too--or I lose my relationships).
> > > The current records are using pure characters of
> > 10
> > > byte length. If postgre could accept the above
> > changes
> > > before adding new records, then could the simple
> > type
> > > 'serial' be used without having the default
> > produced
> > > integers on new records clobber the existing
> > > 'characters' ?
> >
> > If you don't want to migrate, set primary key column
> > default value to
> > random_string(10). See my previous HOWTO.
> >
> > Please note this is not a very standard way to
> > proceed. In a profesionnal
> > environment, you should use integer primary keys.
> >
> > Example :
> > CREATE TABLE foo (
> > foo_oid serial,
> > foo_name varchar(254),
> > foo_text text)
> > WITH OIDS;
> >
> > is better than
> >
> > CREATE TABLE bar (
> > bar_key char(10) random_string(10),
> > bar_name varchar(254),
> > bar_text text)
> > WITH OIDS;
>
>
>__________________________________________________
>Do You Yahoo!?
>Yahoo! Health - your guide to health and wellness
>http://health.yahoo.com
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



Re: HOWTO - Random character generation for primary key

From
"Joel Burton"
Date:
Could you:

. Keep your 10-letter char keys in table

. create a sequence (manually, with CREATE SEQUENCE)

. create the table with a default value of next_charkey()

. next_charkey() calls nextval() to get the next value of the sequence and
converts this to a 10-char symbol (pick your own int -> char conversion
routine) and returns to the table

There will be more overhead, so if you're adding lots of records,
next_charkey() should be done in C.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Lincoln Yeoh
> Sent: Monday, May 06, 2002 9:47 AM
> To: Alan Wayne; jm.poure@freesurf.fr
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] HOWTO - Random character generation for primary
> key
>
>
> What are you trying to achieve? There doesn't seem to be enough
> information
> to help you properly.
>
> a) What do your 10 character keys look like? visible ASCII? legal Base64
> charset?
> b) How were they generated previously?
> c) Why do you think they need to look that way?
> d) Can you really change your app to use ints?
>
> Regards,
> Link.
>
> At 12:09 PM 5/4/02 -0700, Alan Wayne wrote:
> >Hi!
> >So ultimately, the better way would be to
> >bite-the-bullet and work towards replacing the current
> >character keys with int4 keys?
> >
> >(This will be quite time consuming on several million
> >records and about 35 tables with referential integrity
> >rules--a quick and dirty way would be appreciated.)
> >
> >Cheers,
> >Alan
> >
> >--- Jean-Michel POURE <jm.poure@freesurf.fr> wrote:
> > > Dear Alan,
> > >
> > > When you write me, please CC me on
> > > pgsql-general@postgresql.org so that anyone
> > > can participate.
> > >
> > > > Question: What would happen if I did the
> > > following:
> > > >    1. used some variant of "alter table" to change
> > > the
> > > > character field primary key to a field of type
> > > > 'serial'? i.e., would the binary form of the
> > > current
> > > > 10 length characters be preserved as some kind of
> > > > integer?
> > >
> > > In PostgreSQL, serial values are int4 auto-increment
> > > values. Therefore, there
> > > is no easy way to migrate your 10 characters long
> > > primary keys.
> > >
> > > By the way PostgreSQL does not support type
> > > promotion <-> demotion. You will
> > > have to wait for 7.3 or later to convert column
> > > types. For example, you
> > > cannot change an in4 into an int8, a varchar into a
> > > text column.
> > >
> > > Presently the solution is to add an int4 field to
> > > your tables ... and fill
> > > them with incremental values ... and update sequence
> > > values by hand.
> > >
> > > When your done, rename your tables with '_old',
> > > recreate them without 10
> > > characters primary keys and fill them with data.
> > >
> > > This should be easy in pgAdmin2 because you can copy
> > > table definition
> > > and paste it in the execution window. pgAdmin2 also
> > > gives you access to
> > > sequences.
> > >
> > > >   2. If postgresql does allow me to change the
> > > primary
> > > > key field from character to type serial (i.e.,
> > > > integer) in the first table, what will the
> > > referential
> > > > integrity rules do to the other tables that use
> > > the
> > > > first table's primary key as a secondary key?
> > > i.e.,
> > > > does postgre preceive the change of data-type as a
> > > > change to be echoed to the referencing tables via
> > > the
> > > > referential integrity rules? Furthermore, would
> > > this
> > > > recognition only be on newly added records, or on
> > > the
> > > > records already in the tables? e.g., if no change
> > > is
> > > > detected until a new record is added, could I go
> > > > through the database one table at a time and
> > > change
> > > > the referencing fields to type integer to match
> > > the
> > > > change in the primary key? (Needless to say, I
> > > can't
> > > > alter the keys without all the related records
> > > being
> > > > changed too--or I lose my relationships).
> > > > The current records are using pure characters of
> > > 10
> > > > byte length. If postgre could accept the above
> > > changes
> > > > before adding new records, then could the simple
> > > type
> > > > 'serial' be used without having the default
> > > produced
> > > > integers on new records clobber the existing
> > > > 'characters' ?
> > >
> > > If you don't want to migrate, set primary key column
> > > default value to
> > > random_string(10). See my previous HOWTO.
> > >
> > > Please note this is not a very standard way to
> > > proceed. In a profesionnal
> > > environment, you should use integer primary keys.
> > >
> > > Example :
> > > CREATE TABLE foo (
> > > foo_oid serial,
> > > foo_name varchar(254),
> > > foo_text text)
> > > WITH OIDS;
> > >
> > > is better than
> > >
> > > CREATE TABLE bar (
> > > bar_key char(10) random_string(10),
> > > bar_name varchar(254),
> > > bar_text text)
> > > WITH OIDS;
> >
> >
> >__________________________________________________
> >Do You Yahoo!?
> >Yahoo! Health - your guide to health and wellness
> >http://health.yahoo.com
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>