Thread: UUID/GUID information

UUID/GUID information

From
David Busby
Date:
Dear List,
    I'm trying to migrate my MS-SQL(shit) to Postgre.  My database
depends on having a uniqueidentifier for all objects stored. (20 or so
tables of these unique objects).  In MS-SQL I can use this datatype called
"uniqueidentifier" to accomplish this.  What would be a similar solution in
Postgre?  I've looked on through the MAN pages and also scoured the net for
this info...I don't necessarly need a UUID like the MS one but some unique
way to identifiy each object.

Thanks in advance for help

/B

Re: UUID/GUID information

From
Keary Suska
Date:
on 5/30/02 12:05 PM, Busby@pnts.com purportedly said:

> Dear List,
> I'm trying to migrate my MS-SQL(shit) to Postgre.  My database
> depends on having a uniqueidentifier for all objects stored. (20 or so
> tables of these unique objects).  In MS-SQL I can use this datatype called
> "uniqueidentifier" to accomplish this.  What would be a similar solution in
> Postgre?  I've looked on through the MAN pages and also scoured the net for
> this info...I don't necessarly need a UUID like the MS one but some unique
> way to identifiy each object.
>
> Thanks in advance for help

Postgres has what it calls an "OID". This is a unique identifier for every
object. If you choose to use this, be sure to read all the caveats relating
to external use of OIDs. For instance, if you re-create the database (say
from a backup or dump), the IODs will change. IIRC, you can control this by
dumping/restoring OIDs in the executables.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"


Re: UUID/GUID information

From
Josh Berkus
Date:
David,

>     I'm trying to migrate my MS-SQL(shit) to Postgre.  My database
> depends on having a uniqueidentifier for all objects stored. (20 or so
> tables of these unique objects).  In MS-SQL I can use this datatype called
> "uniqueidentifier" to accomplish this.  What would be a similar solution in
> Postgre?  I've looked on through the MAN pages and also scoured the net for
> this info...I don't necessarly need a UUID like the MS one but some unique
> way to identifiy each object.

The best way to do this in PostgreSQL is to set up an independant sequence:

CREATE SEQUENCE universal_sq;

Then reference this in each table definition:
CREATE TABLE blah (
    UUID INT4 NOT NULL DEFAULT NEXTVAL('universal_sq'),
    etc ...
);

CREATE TABLE neh (
    UUID INT4 NOT NULL DEFAULT NEXTVAL('universal_sq'),
    etc ...
);

You can even use the UUID as the primary key this way.  Postgres Sequence
manager insures that no sequence number is used twice, even in the event of
aborted transactions.  See the docs on sequences for more info.

Please note that special measures need to be taken if you are likely to exceed
the limits of INT4 (2.4 billion objects).

--
-Josh Berkus


Re: UUID/GUID information

From
David Busby
Date:
Dear List,
    Thank you for the prompt responses...I read the MAN on OID and also
on SERIAL...the benefit of using the MS-UUID is that the identifier created
is guarenteed to be unique in spacetime.  OID and SERIAL are not...they are
only guarenteed to be unique in a database.

Perhaps I should have mentioned that I'm building a multi-database solution
(20+ databases) in which all 20+ DBs must use the same identifier across
databases for some objects (ex: Automobile Brands) but their own identifier
for their own data (ex: Accouts/Clients) this way when the child database
publish to the master there is no possiblity of some object having the same
identifer as another...and the object identifier can stay the same across
all 20+ DBs.

The GUID from Micro$oft is formatted like
{01234567-89AB-CDEF-0123-456789ABCDEF} it represents a 16 byte number that
is again unique in spacetime.  Is there PostgreSQL solution for something
like that or will I have to come up with my own.

/B

-----Original Message-----
From: Keary Suska [mailto:hierophant@pcisys.net]
Sent: Thursday, May 30, 2002 11:32
To: Postgres-PHP
Subject: Re: [PHP] UUID/GUID information


on 5/30/02 12:05 PM, Busby@pnts.com purportedly said:

> Dear List,
> I'm trying to migrate my MS-SQL(shit) to Postgre.  My database depends
> on having a uniqueidentifier for all objects stored. (20 or so tables
> of these unique objects).  In MS-SQL I can use this datatype called
> "uniqueidentifier" to accomplish this.  What would be a similar
> solution in Postgre?  I've looked on through the MAN pages and also
> scoured the net for this info...I don't necessarly need a UUID like
> the MS one but some unique way to identifiy each object.
>
> Thanks in advance for help

Postgres has what it calls an "OID". This is a unique identifier for every
object. If you choose to use this, be sure to read all the caveats relating
to external use of OIDs. For instance, if you re-create the database (say
from a backup or dump), the IODs will change. IIRC, you can control this by
dumping/restoring OIDs in the executables.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your message
can get through to the mailing list cleanly

Re: UUID/GUID information

From
Keary Suska
Date:
on 5/30/02 2:05 PM, Busby@pnts.com purportedly said:

> Dear List,
> Thank you for the prompt responses...I read the MAN on OID and also
> on SERIAL...the benefit of using the MS-UUID is that the identifier created
> is guarenteed to be unique in spacetime.  OID and SERIAL are not...they are
> only guarenteed to be unique in a database.
>
> Perhaps I should have mentioned that I'm building a multi-database solution
> (20+ databases) in which all 20+ DBs must use the same identifier across
> databases for some objects (ex: Automobile Brands) but their own identifier
> for their own data (ex: Accouts/Clients) this way when the child database
> publish to the master there is no possiblity of some object having the same
> identifer as another...and the object identifier can stay the same across
> all 20+ DBs.
>
> The GUID from Micro$oft is formatted like
> {01234567-89AB-CDEF-0123-456789ABCDEF} it represents a 16 byte number that
> is again unique in spacetime.  Is there PostgreSQL solution for something
> like that or will I have to come up with my own.

AFAIK Postgres doesn't have such a feature built-in, so you would have to do
it yourself. If Db access is entirely web-based, and you use Apache,
mod_unique may do what you want, and it is supposed to be unique throughout
time. However, it may not be unique across multiple Apache installations, if
you have such an environment.

To use Postgres alone, you would likely need a particular database that
"tracks" unique ids. It adds a bit of overhead, and you face parallel access
issues. You can mimic a 16-byte number using 4 INT4 fields (unless your
platform supports 64 bit numbers), but you would have to watch for overflow,
which can get confusing since Postgres doesn't have the notion of unsigned
integers. Fortunately, you can set up triggers/stored procedures that will
handle all of this automatically.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"


Re: UUID/GUID information

From
Josh Berkus
Date:
David,

> Perhaps I should have mentioned that I'm building a multi-database solution
> (20+ databases) in which all 20+ DBs must use the same identifier across
> databases for some objects (ex: Automobile Brands) but their own identifier
> for their own data (ex: Accouts/Clients) this way when the child database
> publish to the master there is no possiblity of some object having the same
> identifer as another...and the object identifier can stay the same across
> all 20+ DBs.
>
> The GUID from Micro$oft is formatted like
> {01234567-89AB-CDEF-0123-456789ABCDEF} it represents a 16 byte number that
> is again unique in spacetime.  Is there PostgreSQL solution for something
> like that or will I have to come up with my own.

In that case, you should have the budget for some programming, yes?

The answer is quite simple:
1. You set up  a universal sequence as I described.
2. You give each server its own 4-byte Server ID, and put it in the table
        server_id.  Use whatever number you want; I might suggest something
        based on the IP address of the machine (though unfortunately IP
addresses
        are 4 bytes unsigned, so you can't use them directly).
3. You create a function as follows:

CREATE FUNCTION unique_id () RETURNS INT8 AS '
SELECT ((server_id.server_id::INT8 * (2^31 - 1)::INT8) +
NEXTVAL('universal_sq'))
FROM server_id; '
LANGUAGE 'sql';

(Somebody correct my math if I'm off, here)

Alternately, you could use a random 4-byte number instead of the server_id,
which wouldn't be perfect but would give you only about a 20 in 2.4 billion
chance of a conflict.


--
-Josh Berkus

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     josh@agliodbs.com
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 621-2533
    and non-profit organizations.     San Francisco


Re: UUID/GUID information

From
Andrew McMillan
Date:
On Fri, 2002-05-31 at 08:05, David Busby wrote:
>
> The GUID from Micro$oft is formatted like
> {01234567-89AB-CDEF-0123-456789ABCDEF} it represents a 16 byte number that
> is again unique in spacetime.  Is there PostgreSQL solution for something
> like that or will I have to come up with my own.

There is an RFC defining GUID and UUID, and it isn't a major job to
generate them yourself.  If you implement a function to do so you may
want to contribute it back to the PostgreSQL community.

The basic approach uses IP address, date and time and some other stuff I
forget right now.

GUID's are actually pretty neat, because they are sortable, but still
unique, I believe.

It sounds like a straight timestamp + IP address would be sufficient for
you.  In a similar multi-system problem I am working on I actually just
use a two-field key with a serial and a machine name.  Using a machine
name explicitly, and splitting it into a separate field, actually offers
the advantage of making it clearer where the transaction came from, and
consequently how my replication model works.

Steer well clear of OIDs - they are not necessarily persistent over
backup / restore, and they offer little real utility in PostgreSQL these
days being something of an appendix from it's time-travelling days. I
actually disable them on most user tables when running under 7.2.1 .

Regards,
                    Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
       Are you enrolled at http://schoolreunions.co.nz/ yet?