Re: [PATCHES] Patch for UUID datatype (beta) - Mailing list pgsql-hackers

From mark@mark.mielke.cc
Subject Re: [PATCHES] Patch for UUID datatype (beta)
Msg-id 20060918162316.GB31239@mark.mielke.cc
In response to Re: [PATCHES] Patch for UUID datatype (beta)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCHES] Patch for UUID datatype (beta)  ("Jim C. Nasby" <jimn@enterprisedb.com>)
Re: [PATCHES] Patch for UUID datatype (beta)  (Markus Schaber <schabi@logix-tt.com>)
List pgsql-hackers
On Mon, Sep 18, 2006 at 10:33:22AM -0400, Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> > there's a commonly accepted algorithm providing this.
> Anyone who thinks UUIDs are guaranteed unique has been drinking too much
> of the kool-aid.  They're at best probably unique.  Some generator
> algorithms might make it more probable than others, but you simply
> cannot "guarantee" it for UUIDs generated on noncommunicating machines.

The versions that include a MAC address, time, and serial number for
the machine come pretty close, presuming that the user has not
overwritten the MAC address with something else; the MAC address is
unique at manufacturing time. If generation is performed by a library
that shares its state, on the same machine, then on the off chance that
you do request multiple generations at the exact same time (in my
experience, this is already unlikely), the serial number should be
bumped for that time.
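
A rough Python sketch of the scheme described above, not a conforming
implementation. V1Generator and its generate() method are hypothetical
names for illustration, the MAC address is made up, and the timestamp
is passed in by the caller so the behaviour is repeatable. It only
shows the idea: one shared state object, and a serial number that is
bumped whenever the time fails to advance.

```python
import uuid


class V1Generator:
    """Toy version-1 style generator with shared state (hypothetical helper)."""

    def __init__(self, node, clock_seq=0):
        self.node = node            # 48-bit MAC address, assumed unique
        self.clock_seq = clock_seq  # 14-bit "serial number"
        self.last_ts = None         # last timestamp handed out

    def generate(self, ts):
        # ts is a count of 100 ns ticks; supplied by the caller here so the
        # example is deterministic.
        if self.last_ts is not None and ts <= self.last_ts:
            # Same instant, or the clock was set back: bump the serial number.
            self.clock_seq = (self.clock_seq + 1) & 0x3FFF
        self.last_ts = ts
        fields = (
            ts & 0xFFFFFFFF,               # time_low
            (ts >> 32) & 0xFFFF,           # time_mid
            (ts >> 48) & 0x0FFF,           # time_hi (version bits set below)
            (self.clock_seq >> 8) & 0x3F,  # clock_seq_hi (variant bits set below)
            self.clock_seq & 0xFF,         # clock_seq_low
            self.node,
        )
        # version=1 makes the uuid module fill in the version and variant bits.
        return uuid.UUID(fields=fields, version=1)


gen = V1Generator(node=0x001122334455)  # made-up MAC address
a = gen.generate(1000)
b = gen.generate(1000)  # same "time" as the first request
print(a != b)                        # True
print(a.clock_seq, b.clock_seq)      # 0 1
```

The two results share the same node and time and differ only in the
serial number, which is what keeps them distinct.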

So yeah - if you set your MAC address, or if your machine time is ever
set back, or if you assume a serial number of 0 each time (because the
generation routine's state isn't shared among processes on the system),
you can get overlap. All of these can be controlled, making it possible
to eliminate overlap.
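
To make the "serial number of 0 each time" case concrete, here is a
small Python sketch. make_v1 is a hypothetical helper, and the MAC
address and timestamp are made-up values; it simply packs the three
inputs into the version-1 layout using the standard uuid module.

```python
import uuid


def make_v1(ts, clock_seq, node):
    """Pack the three inputs into a version-1 UUID (hypothetical helper)."""
    fields = (
        ts & 0xFFFFFFFF,          # time_low
        (ts >> 32) & 0xFFFF,      # time_mid
        (ts >> 48) & 0x0FFF,      # time_hi (version bits set below)
        (clock_seq >> 8) & 0x3F,  # clock_seq_hi (variant bits set below)
        clock_seq & 0xFF,         # clock_seq_low
        node,                     # 48-bit MAC address
    )
    return uuid.UUID(fields=fields, version=1)


NODE = 0x001122334455   # made-up MAC address
TS = 0x1D6F3A2B4C5D6E7  # made-up timestamp in 100 ns ticks (fits in 60 bits)

# Two processes on the same machine, each assuming serial number 0.
process_a = make_v1(TS, 0, NODE)
process_b = make_v1(TS, 0, NODE)

# The same request with the serial number coordinated between them.
coordinated = make_v1(TS, 1, NODE)

print(process_a == process_b)    # True: overlap
print(process_a == coordinated)  # False: overlap avoided
```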

> One of the big reasons that I'm hesitant to put a UUID generation
> function into core is the knowledge that none of them are or can be
> perfect ... so people might need different ones depending on local
> conditions.  I'm inclined to think that a reasonable setup would put
> the datatype (with input, output, comparison and indexing support)
> into core, but provide a generation function as a contrib module,
> making it easily replaceable.

I have UUID generation in core in my current implementation. In the
last year that I've been using it, I have already chosen twice to
generate UUIDs from my calling program. I find it faster, as it avoids
having to call out to PostgreSQL twice: once to generate the UUID, and
once to insert the row using it. I have no strong need for UUID
generation to be in core, and believe there are strong reasons not to
put it there. Performance is better when not in core. Portability of
PostgreSQL is better when not in core. Ability to control how the UUID
is defined is better when not in core.
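
As a sketch of what generating the UUID in the calling program can look
like, the Python below builds the key on the client and prepares a
single INSERT. The table name items, its columns, the build_insert
helper, and the %s placeholder style (as used by some Python database
drivers) are all assumptions for illustration; no database connection
is made here.

```python
import uuid


def build_insert(name):
    """Return (sql, params) for one INSERT with a client-generated UUID.

    Hypothetical helper; table and column names are placeholders.
    """
    new_id = uuid.uuid1()  # generated locally, no call to the server
    sql = "INSERT INTO items (id, name) VALUES (%s, %s)"
    return sql, (str(new_id), name)


sql, params = build_insert("widget")
print(sql)
print(params)
```

The calling program would hand sql and params to its database driver in
one statement, and it already knows the new row's key without asking
the server for it first.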

The only thing an in-core version provides is convenience for those
that do not have easy access to a UUID generation library. I don't
care for that convenience.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/

