Thread: TODO item: GUID

TODO item: GUID

From
Gevik Babakhani
Date:
I have two questions regarding the GUID todo item...

1. Do we want to have a new datatype for that or just a macro like the
SERIAL type? 

create table bla
(  my_pk GUID  /* that is my_pk varchar(32) DEFAULT new_guid() */
);

2. Didn't we have a contrib module doing the GUID?

Regards,
Gevik.





Re: TODO item: GUID

From
Martijn van Oosterhout
Date:
On Tue, Sep 05, 2006 at 11:29:40AM +0200, Gevik Babakhani wrote:
> I have two questions regarding the GUID todo item...
>
> 1. Do we want to have a new datatype for that or just a macro like the
> SERIAL type?

A new datatype. Just because someone has a guid column doesn't mean
they want it autogenerated. Provide a generation function (not
absolutely necessary) and you're done.
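
The split suggested here, a plain datatype plus an optional generator, can
be sketched outside the backend. This is only an illustration in Python,
not PostgreSQL C code; the names guid_in, guid_out and new_guid are
hypothetical. The point is that the type itself only validates and stores
16 bytes, while generation stays a separate function the caller may or may
not use.

```python
import uuid


def guid_in(text: str) -> bytes:
    """Parse a textual GUID into 16 bytes; raises ValueError if malformed."""
    return uuid.UUID(text).bytes


def guid_out(raw: bytes) -> str:
    """Render 16 stored bytes in the canonical lowercase 8-4-4-4-12 form."""
    return str(uuid.UUID(bytes=raw))


def new_guid() -> bytes:
    """Optional generator, independent of the type itself (random, version 4)."""
    return uuid.uuid4().bytes


# A caller may supply its own value...
stored = guid_in("A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11")
print(len(stored), guid_out(stored))

# ...or ask for a generated one.
print(guid_out(new_guid()))
```

Stored this way a value takes 16 bytes, against 32 characters for the
varchar(32) form mentioned in the question.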

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: TODO item: GUID

From
Gevik Babakhani
Date:
For developing the GUID datatype, I was wondering if I could use the
sample code from http://www.ietf.org/rfc/rfc4122.txt (hate to reinvent
the wheel)

The code has a copyright which says: "use and modify as you wish but
include the copyright notice with your code"

What are our rules in such matters?

Regards,
Gevik.




Re: TODO item: GUID

From
Martijn van Oosterhout
Date:
On Tue, Sep 05, 2006 at 02:29:15PM +0200, Gevik Babakhani wrote:
> For developing the GUID datatype, I was wondering if I could use the
> sample code from http://www.ietf.org/rfc/rfc4122.txt (hate to reinvent
> the wheel)
>
> The code has a copyright which says: "use and modify as you wish but
> include the copyright notice with your code"

Do you really want to copy the code verbatim? I mean, there's a lot of
stuff which would need quite a bit of massaging to get working in
postgres. I'd say just look at it, understand it, and then write
something that will work. The copyright won't matter then.

BTW, I seem to remember something about the stuff in the RFC not being
good for some reason, not unique enough or too predictable. Do you know
anything about that?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: TODO item: GUID

From
Gevik Babakhani
Date:
> Do you really want to copy the code verbatim? I mean, there's a lot of
> stuff which would need quite a bit of massaging to get working in
> postgres. I'd say just look at it, understand it, and then write
> something that will work. The copyright won't matter then.
> 

This is a better idea, I must say. I will take a closer look at the code
and then see which parts I can reuse.

> BTW, I seem to remember something about the stuff in the RFC not being
> good for some reason, not unique enough or too predictable. Do you know
> anything about that?
> 

It was a privacy issue with the GUID generation introduced by MS.
Version 1 (V1) of the algorithm builds the GUID from the MAC address and
a timestamp, so a GUID could be traced back to the computer it was
generated on. Version 4 is based on random numbers instead.
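
To make the privacy point concrete, here is a small sketch using Python's
standard uuid module (an illustration only, not code from the RFC). The MAC
value is made up; node and clock_seq are passed explicitly so the example
does not depend on the host it runs on.

```python
import uuid

# Hypothetical MAC address, used only for illustration.
mac = 0x001A2B3C4D5E

# Version 1: timestamp plus node. The node is stored in the last 48 bits.
u1 = uuid.uuid1(node=mac, clock_seq=0)
print(u1)
print(u1.version)
print(f"{u1.node:012x}")   # the MAC address, read straight back out

# Version 4: random apart from the version and variant bits.
u4 = uuid.uuid4()
print(u4)
print(u4.version)
```

Anyone holding a version 1 value can read the node field back, which is the
back-tracing concern; a version 4 value carries no such field.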

> Have a nice day,



Re: TODO item: GUID

From
"Aleksandar Dezelin"
Date:
Hello,

you just have to generate 128 random bits and set the version bits. And that's all.

This is the way this data type is implemented in Mono (http://svn.myrealbox.com/source/trunk/mcs/class/corlib/System/Guid.cs).
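
A minimal sketch of that recipe, in Python rather than the Mono source (the
function name random_guid is hypothetical). One detail beyond "set version
bits": RFC 4122 also fixes two variant bits, so 122 of the 128 bits stay
random.

```python
import secrets
import uuid


def random_guid() -> uuid.UUID:
    raw = bytearray(secrets.token_bytes(16))  # 128 random bits
    raw[6] = (raw[6] & 0x0F) | 0x40           # version = 4 (high nibble of byte 6)
    raw[8] = (raw[8] & 0x3F) | 0x80           # variant = 10xx (RFC 4122)
    return uuid.UUID(bytes=bytes(raw))


g = random_guid()
print(g, g.version, g.variant == uuid.RFC_4122)
```

The quality of the result depends entirely on the random source; secrets
draws from the operating system's generator.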

Using time-based GUIDs in database tables is not a good choice for performance reasons because they cannot be indexed properly: every newly created time-based GUID is guaranteed to be larger than all previously created ones, so the RDBMS engine must rebalance the b-tree every time a new GUID is added to the table.

Sorry, for sending this message three times - problem with Gmail.

Cheers,
Aleksandar Dezelin

Re: TODO item: GUID

From
mark@mark.mielke.cc
Date:
On Sat, Sep 09, 2006 at 07:47:19PM +0200, Aleksandar Dezelin wrote:
> Hello,
> you just have to make random 128 bits and set version bits. And that's all.

> This is the way this data type is implemented in Mono
> (http://svn.myrealbox.com/source/trunk/mcs/class/corlib/System/Guid.cs).
> 
> Using time based GUIDs in database tables is not a good choice for
> performance reasons because they can not be indexed properly - every newly
> created time-based GUID is guaranteed to be larger than all previously
> created, so RDBMS engine must rebalance b-tree every time a new GUID item
> is added to data table.
> 
> Sorry, for sending this message three times - problem with Gmail.

Depends how badly you want to skew the odds that a newly generated ID
is actually new, and how much you trust the distribution of your random
number generator.

There are several ways to generate a UUID - and I think it is wrong to
say that only one is the right way. Different applications choose
different generation routines. I *like* sorting by time, as it allows
the UUID to be used much like a sequence, leaving older, less frequently
accessed UUIDs in the past. You and Mono might prefer something else.
Some choose random numbers over the MAC address as well - better?
Depends on how big your system is.
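
One caveat on sorting by time, sketched in Python: in the RFC 4122 version 1
layout the low 32 bits of the timestamp come first, so comparing the raw
values does not give chronological order. Sorting on the embedded timestamp
does. The helper make_v1 and the node value are hypothetical and exist only
to build two UUIDs with known timestamps.

```python
import uuid


def make_v1(timestamp: int, node: int = 0x001A2B3C4D5E) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit timestamp (illustrative helper)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi = (timestamp >> 48) & 0x0FFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi, 0, 0, node), version=1)


# Two timestamps one tick apart, straddling a rollover of the low 32 bits.
earlier = make_v1(0x1D1_0001_FFFFFFFF)
later = make_v1(0x1D1_0002_00000000)

print(earlier, later)
print(earlier.time < later.time)  # order by embedded timestamp
print(earlier < later)            # order by raw 128-bit value
print(sorted([later, earlier], key=lambda u: u.time) == [earlier, later])
```

So a type that is meant to sort like a sequence would need a comparison that
looks at the timestamp field, or a generator that lays the bits out
differently.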

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...
 
                          http://mark.mielke.cc/



Re: TODO item: GUID

From
Tom Lane
Date:
> On Sat, Sep 09, 2006 at 07:47:19PM +0200, Aleksandar Dezelin wrote:
>> Using time based GUIDs in database tables is not a good choice for
>> performance reasons because they can not be indexed properly - every newly
>> created time-based GUID is guaranteed to be larger than all previously
>> created, so RDBMS engine must rebalance b-tree every time a new GUID item
>> is added to data table.

Only if you have a particularly bad b-tree implementation.  Do you also
not believe in indexing timestamp or serial columns?
        regards, tom lane
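
A rough way to see what is being argued about, sketched in Python. A sorted
list is not a b-tree, and this says nothing about page splits or about
PostgreSQL's implementation; it only shows where new keys land. The function
name tail_insert_fraction is hypothetical.

```python
import bisect
import random


def tail_insert_fraction(keys):
    """Insert keys into a sorted list; return the share that landed at the end."""
    ordered, at_tail = [], 0
    for k in keys:
        pos = bisect.bisect_right(ordered, k)
        at_tail += (pos == len(ordered))
        ordered.insert(pos, k)
    return at_tail / len(keys)


n = 2000
rng = random.Random(0)  # fixed seed so runs are repeatable

increasing = list(range(n))                           # serial/timestamp-like keys
scattered = [rng.getrandbits(128) for _ in range(n)]  # random-GUID-like keys

print(tail_insert_fraction(increasing))  # every insert lands at the end
print(tail_insert_fraction(scattered))   # only a small share does
```

Increasing keys always extend the right-hand edge, which is the same pattern
a serial or timestamp column produces; random keys are spread across the
whole key range.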


Re: TODO item: GUID

From
Thomas Hallgren
Date:
mark@mark.mielke.cc wrote:
>... I *like* sorting by time, as it allows
> the UUID to be used similar to sequence, leaving older, lesser accessed
> UUIDs in the past. 

And don't forget, an automatic timestamp of when a record is created might be useful for
other purposes.
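
For what it is worth, reading that timestamp back out is short. A sketch in
Python, assuming version 1 UUIDs as defined in RFC 4122; the function name
created_at is hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def created_at(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # Integer division drops the sub-microsecond part of the 100 ns count.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


u = uuid.uuid1()
print(u, created_at(u).isoformat())
```

The value reflects the generating machine's clock at generation time, not
the time the row was inserted.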

Regards,
Thomas Hallgren