Thread: Using a postgres table to maintain unique id?

Using a postgres table to maintain unique id?

From
Steve Wampler
Date:
Hi,

I have a distributed application that needs unique 64-bit
"id" values.  The current implementation keeps the current
value for this id in a file on one system and uses
a process to provide access to this file via CORBA.

However, I also use Postgres (7.0.2) throughout this
application and it seems cleaner to me to keep the current
id value in a table and just use postgres to provide access
(with a trigger function to increment the id on access).

Is this reasonable?  Is it fast?  (I need 10 or more IDs
generated each second.)  Can I avoid having the table
gradually fill with "old" rows for this entry, and this
avoid the need to run VACUUM ANALYZE periodically?

Any tips on how to implement the trigger would be
appreciated.  (Is it possible to generate an int8 sequence
value?)

Thanks!
--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu


Re: Using a postgres table to maintain unique id?

From
Poet/Joshua Drake
Date:
>However, I also use Postgres (7.0.2) throughout this
>application and it seems cleaner to me to keep the current
>id value in a table and just use postgres to provide access
>(with a trigger function to increment the id on access).

Why not a sequence?

>Is this reasonable?  Is it fast?  (I need 10 or more IDs
>generated each second.)  Can I avoid having the table
>gradually fill with "old" rows for this entry, and this
>avoid the need to run VACUUM ANALYZE periodically?

The only problem I have had with this type of thing is when a number gets
deleted, it does not get recycled.

Joshua Drake


>
>Any tips on how to implement the trigger would be
>appreciated.  (Is it possible to generate an int8 sequence
>value?)
>
>Thanks!
>--
>Steve Wampler-  SOLIS Project, National Solar Observatory
>swampler@noao.edu
>

-- 
--
<COMPANY>CommandPrompt    - http://www.commandprompt.com    </COMPANY>
<PROJECT>OpenDocs, LLC.    - http://www.opendocs.org    </PROJECT>
<PROJECT>LinuxPorts     - http://www.linuxports.com     </PROJECT>
<WEBMASTER>LDP        - http://www.linuxdoc.org    </WEBMASTER>
--
Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.
--



Re: Using a postgres table to maintain unique id?

From
Steve Wampler
Date:
Poet/Joshua Drake wrote:
> 
> >However, I also use Postgres (7.0.2) throughout this
> >application and it seems cleaner to me to keep the current
> >id value in a table and just use postgres to provide access
> >(with a trigger function to increment the id on access).
> 
> Why not a sequence?

Can someone show me how to create (and use) an int8 sequence?

> >Is this reasonable?  Is it fast?  (I need 10 or more IDs
> >generated each second.)  Can I avoid having the table
> >gradually fill with "old" rows for this entry, and this
> >avoid the need to run VACUUM ANALYZE periodically?
> 
> The only problem I have had with this type of thing is when a number gets
> deleted, it does not get recycled.

Fortunately, I don't want any number to ever get recycled - the id needs to
be unique throughout the 25+ year lifetime of the project.  The table
would have a single row with a single column.  Selecting that table cell
would return the current value, but leave the value incremented in the
table cell (it's ok if it increments the value before returning).


--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu


Re: Using a postgres table to maintain unique id?

From
Dr Frog
Date:
On Mon, 13 Nov 2000 10:44:21 -0700, Steve Wampler said:

> Poet/Joshua Drake wrote:
>  > 
>  > >However, I also use Postgres (7.0.2) throughout this
>  > >application and it seems cleaner to me to keep the current
>  > >id value in a table and just use postgres to provide access
>  > >(with a trigger function to increment the id on access).
>  > 
>  > Why not a sequence?
>  
>  Can someone show me how to create (and use) an int8 sequence?
>  


create sequnece seq_name ;


there are additional options 
start sql and type
drfrog=# \h create sequence
Command:     CREATE SEQUENCE
Description: Creates a new sequence number generator
Syntax:
CREATE SEQUENCE seqname [ INCREMENT increment ]   [ MINVALUE minvalue ] [ MAXVALUE maxvalue ]   [ START start ] [ CACHE
cache] [ CYCLE ]
 
drfrog=# 


more info in the docs too!



>  > >Is this reasonable?  Is it fast?  (I need 10 or more IDs
>  > >generated each second.)  Can I avoid having the table
>  > >gradually fill with "old" rows for this entry, and this
>  > >avoid the need to run VACUUM ANALYZE periodically?
>  > 
>  > The only problem I have had with this type of thing is when a number gets
>  > deleted, it does not get recycled.
>  
>  Fortunately, I don't want any number to ever get recycled - the id needs to
>  be unique throughout the 25+ year lifetime of the project.  The table
>  would have a single row with a single column.  Selecting that table cell
>  would return the current value, but leave the value incremented in the
>  table cell (it's ok if it increments the value before returning).
>  
>  
>  --
>  Steve Wampler-  SOLIS Project, National Solar Observatory
>  swampler@noao.edu
>  
>  



Re: Using a postgres table to maintain unique id?

From
Forest Wilkinson
Date:
On 13 Nov 2000 10:30:55 PST, Dr Frog wrote:

>create sequnece seq_name ;
>
>there are additional options 
>start sql and type

What are the "SQL" and "TYPE" options?  I don't see reference to them in
the docs.

Can I use the TYPE option to create a sequence that's based in int8 (64
bit integer) instead of int4?  That's exactly what I've been wanting!
What version of postgres supports this?




Re: Using a postgres table to maintain unique id?

From
Steve Wampler
Date:
Marten Feldtmann wrote:
> 
>  Throw away all the "hardwired"-stuff and do it with software. I
> once described an algorithm in one of this lists how to create
> unique values for clients without minimum interaction with the
> database.
> 
>  The result: query once in the beginning of your application,
> generate your id's "offline" at the maximum speed you may
> have and store your last generated id when your client
> finished. Superior to all the "hardwired"-database solutions !

Yes, but...

(1) The application I have is composed of about 50 processes   running on 3 different OS/architectures (Linux/intel,
Solaris/sparc,and VxWorks/ppc).  The IDs I need must be   unique across all processes (I suppose one solution would
beto provide each ID with a unique prefix based on the   process that is running, but...)
 

(2) Some of these systems are real-time boxes that might get   rebooted at any moment, or might hang for
hardware-related  reasons [I'd like to able to say that all of the processes   could detect imminent failure, but
unfortunately,I can't].   So determining when a client "finishes" is not always possible,   which prevents (he claims)
theabove solution from claiming   ID uniqueness.
 

However, it might be sufficient to provide a process on the
postgres DB machine (if *that* machine dies, *everything* stops...)
that serves IDs via CORBA to all the other applications and
(internally) uses the "software" approach given above.  This
process could "sync" with the database every N seconds or so
(where N might be < 1.0).  This, while still not guaranteeing
uniqueness, would at least come pretty close...  It would still be
nice to avoid having to VACUUM ANALYZE this table, though, and it
"feels" as though it is duplicating functionality already provided
by postgres DB backends.

I'll think about this solution - thanks!



--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu


Re: Using a postgres table to maintain unique id?

From
Michael Teter
Date:
> Can someone show me how to create (and use) an int8
> sequence?

From what I can tell (both from docs and doing a
describe on sequences in my database), a postgresql
sequence is an int4, not an int8, and thus you are
limited to a max of 2.1 billion values.

If you require an int8 sequence, you'll probably have
to manage your own and just use an int8 column.

> Fortunately, I don't want any number to ever get
> recycled - the id needs to
> be unique throughout the 25+ year lifetime of the
> project.  The table
> would have a single row with a single column. 
> Selecting that table cell
> would return the current value, but leave the value
> incremented in the
> table cell (it's ok if it increments the value
> before returning).

If 2.1 billion unique values is enough, then a
sequence will work fine.  Performance of sequence
should be ok.

An alternative you might consider is creating your own
"sequence server" external to the database.

Michael

__________________________________________________
Do You Yahoo!?
Yahoo! Calendar - Get organized for the holidays!
http://calendar.yahoo.com/


Re: Using a postgres table to maintain unique id?

From
Thomas Swan
Date:
At 11/13/2000 06:22 PM -0800, Michael Teter wrote:<br /><blockquote cite="cite" class="cite" type="cite">> Can
someoneshow me how to create (and use) an int8<br /> > sequence?<br /><br /> From what I can tell (both from docs
anddoing a<br /> describe on sequences in my database), a postgresql<br /> sequence is an int4, not an int8, and thus
youare<br /> limited to a max of 2.1 billion values.<br /><br /> If you require an int8 sequence, you'll probably
have<br/> to manage your own and just use an int8 column.<br /></blockquote><p> I had originally started using int8 and
creatingcustom sequences.   However, as mentioned in a previous post, there is an inherent performance penalty in using
int8over int4.   Tom Lane advised me that the int8 routines are an emulated or synthesized data type.  in the test I
didon our 7.0.2 server I notice about a 25-30% decrease in performance when using complex joins on tables containing
referentialkeys, primary keys all in the int8 data type.<br /><br /> This might be something to think about as well.<br
/><br/><br /><br /> 

Re: Using a postgres table to maintain unique id?

From
Steve Wampler
Date:
Thomas Swan wrote:
> 
> At 11/13/2000 06:22 PM -0800, Michael Teter wrote:
> >
> > From what I can tell (both from docs and doing a
> > describe on sequences in my database), a postgresql
> > sequence is an int4, not an int8, and thus you are
> > limited to a max of 2.1 billion values.
> >
> > If you require an int8 sequence, you'll probably have
> > to manage your own and just use an int8 column.
> >
> I had originally started using int8 and creating custom sequences.   However,
> as mentioned in a previous post, there is an inherent performance penalty in
> using int8 over int4.   Tom Lane advised me that the int8 routines are an
> emulated or synthesized data type.  in the test I did on our 7.0.2 server I
> notice about a 25-30% decrease in performance when using complex joins on
> tables containing referential keys, primary keys all in the int8 data type.
> 
> This might be something to think about as well.

Thanks.  Because of these and other comments people have made, I've gone back
to using a flat_file-with-server approach instead of adding a table to my
postgres DB.  While an int4 *might* work, it doesn't handle the "worst-case"
scenario (which is up around 15 billion values).

Thanks to everyone for your comments and suggestions!

--
Steve Wampler-  SOLIS Project, National Solar Observatory
swampler@noao.edu


Re: Using a postgres table to maintain unique id?

From
M.Feldtmann@t-online.de (Marten Feldtmann)
Date:

Steve Wampler schrieb:
> 
> Poet/Joshua Drake wrote:
> ?
> ? ?However, I also use Postgres (7.0.2) throughout this
> ? ?application and it seems cleaner to me to keep the current
> ? ?id value in a table and just use postgres to provide access
> ? ?(with a trigger function to increment the id on access).
> ?
> ? Why not a sequence?
> 
> Can someone show me how to create (and use) an int8 sequence?
> 
> ? ?Is this reasonable?  Is it fast?  (I need 10 or more IDs
> ? ?generated each second.)  Can I avoid having the table
> ? ?gradually fill with "old" rows for this entry, and this
> ? ?avoid the need to run VACUUM ANALYZE periodically?
Throw away all the "hardwired"-stuff and do it with software. I
once described an algorithm in one of this lists how to create 
unique values for clients without minimum interaction with the 
database.
The result: query once in the beginning of your application, 
generate your id's "offline" at the maximum speed you may
have and store your last generated id when your client
finished. Superior to all the "hardwired"-database solutions !

Marten


Re: Using a postgres table to maintain unique id?

From
M.Feldtmann@t-online.de (Marten Feldtmann)
Date:

Steve Wampler schrieb:
> 
> Yes, but...
> 
> (1) The application I have is composed of about 50 processes
>     running on 3 different OS/architectures (Linux/intel,
>     Solaris/sparc, and VxWorks/ppc).  The IDs I need must be
>     unique across all processes (I suppose one solution would
>     be to provide each ID with a unique prefix based on the
>     process that is running, but...)
We've build a document management system using this system 
and the clients all created ids are based on the a kind of high-low 
algorithm to create unique indices.
The indices are unique among all possible clients ... the 
number of clients does not matter. As I said before: better
than any hardwired solution.
You have two database queries among the normal lifetime
of a client to get the base information to create unique 
clients .. during the lifetime the ids are created offline
and they are garanteed to be unique.
Actually we're now in the process to build an object-oriented
PPD system and we use the same algorithm again.

> 
> (2) Some of these systems are real-time boxes that might get
>     rebooted at any moment, or might hang for hardware-related
>     reasons [I'd like to able to say that all of the processes
>     could detect imminent failure, but unfortunately, I can't].
>     So determining when a client "finishes" is not always possible,
>     which prevents (he claims) the above solution from claiming
>     ID uniqueness.
>
It does not matter until your machines do not reboot every second
but even then you may get along for ten or 20 years before you
ran out of indices.
> (where N might be < 1.0).  This, while still not guaranteeing
> uniqueness, would at least come pretty close...  It would still be
> nice to avoid having to VACUUM ANALYZE this table, though, and it
The base idea for all of it is simple:
The unique id is based on three integer numbers:
a) id-1 is a class id number (16 bit ?)b) id-2 is a global-session-number (32 bit): nc) id-3 is a local-session-number
(32bit): x
 
The id-3, id-2 and id-1 are converted to the base 36 and by this
they are converted to strings. The result unique id is about
15 characters long. (6+6+3)
We need a table to hold pairs of "global-id, local-id", this table
is initially empty.
When a client starts, it connects to the database, lockes this
table and now the following happens:
a) if the table is empty, the client uses (1,0) for its own   and stores (2,0) for the next client into the table.
b) if the table has ONE entry, the client removes the pair (n,x)   from the table and stores (n+1,0) into the table.
c) if the table has more than one entry, the client takes any   entry (normaly the one with the lowest n) from the
tableand   removes it.
 
d) the client unlocks the table
Now the client is able to create offline up to 2^32 new unique
identifiers. Increasing the numbers above and you get even more
possible values.
They create unique identifieres like (n,x), (n,x+1), ...
If the client reaches this limit during lifetime it does the 
above again.
If the client terminates, it writes it actual pair into this
table.
Ok, that's it.
If you want to have more information ... just contact me.


Marten