Thread: guid/uuid datatype

guid/uuid datatype

From
Gevik Babakhani
Date:
Hi,

A while ago (Sep 2006) I sent a patch for the UUID datatype. Did anyone
have time to review it yet?

Here it is again :)

Regards,
Gevik


Attachment

Re: guid/uuid datatype

From
Neil Conway
Date:
On Fri, 2007-01-19 at 10:25 +0100, Gevik Babakhani wrote:
> A while ago (Sep 2006) I sent a patch for the UUID datatype. Did anyone
> have time to review it yet?

I confess I haven't followed the discussion around this patch, but is
there a compelling reason to include this in the backend proper, rather
than contrib/?

-Neil



Re: guid/uuid datatype

From
Gevik Babakhani
Date:
> I confess I haven't followed the discussion around this patch, but is
> there a compelling reason to include this in the backend proper, rather
> than contrib/?

AFAIK, it is/was part of the TODO for the core.


Re: guid/uuid datatype

From
Neil Conway
Date:
On Sat, 2007-01-20 at 00:21 +0100, Gevik Babakhani wrote:
> AFAIK, it is/was part of the TODO for the core.

Well, I don't have a strong opinion either way, but I think it should be
given some thought.

As far as the code, looks pretty good. A few minor comments:

* varchar_uuid() should be named uuid_varchar(), for consistency with
the other function names. In fact, uuid_text() and varchar_uuid() are
essentially identical, so they should be refactored. The fmgr interface
macros can stay, I guess.

* most of uuid.h can be gotten rid of: the SQL-callable functions are
already declared in builtins.h, and most of the other declarations
should be moved to uuid.c and made local to that file.

* needs documentation

-Neil



Re: guid/uuid datatype

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Sat, 2007-01-20 at 00:21 +0100, Gevik Babakhani wrote:
> > AFAIK, it is/was part of the TODO for the core.
>
> Well, I don't have a strong opinion either way, but I think it should be
> given some thought.
>
> As far as the code, looks pretty good. A few minor comments:
>
> * varchar_uuid() should be named uuid_varchar(), for consistency with
> the other function names. In fact, uuid_text() and varchar_uuid() are
> essentially identical, so they should be refactored. The fmgr interface
> macros can stay, I guess.
>
> * most of uuid.h can be gotten rid of: the SQL-callable functions are
> already declared in builtins.h, and most of the other declarations
> should be moved to uuid.c and made local to that file.
>
> * needs documentation

I think having it in core makes the most sense.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: guid/uuid datatype

From
Neil Conway
Date:
On Fri, 2007-01-19 at 19:19 -0500, Bruce Momjian wrote:
> I think having it in core makes the most sense.

Why is that?

One question that comes to mind is how the submitted patch compares in
functionality to the other implementations of the UUID concept for
PostgreSQL, such as OSSP uuid (which implements a Postgres UDT in its
CVS version), or the UUID project on gborg.

-Neil



Re: guid/uuid datatype

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Fri, 2007-01-19 at 19:19 -0500, Bruce Momjian wrote:
> > I think having it in core makes the most sense.
>
> Why is that?

I should have been clearer.  I think having it in the main server or
/contrib makes sense.  Having data types external to our source tree
doesn't seem to work too well because of changes in our API from time to
time.  I think the UUID type has enough usage to warrant us maintaining
it.

> One question that comes to mind is how the submitted patch compares in
> functionality to the other implementations of the UUID concept for
> PostgreSQL, such as OSSP uuid (which implements a Postgres UDT in its
> CVS version), or the UUID project on gborg.

No idea.  They all have to be researched, and if we find they all have
different strengths, I am afraid we will have to keep them all external.


Re: guid/uuid datatype

From
Peter Eisentraut
Date:
Bruce Momjian wrote:
> I should have been clearer.  I think having it in the main server or
> /contrib makes sense.  Having data types external to our source tree
> doesn't seem to work too well because of changes in our API from time
> to time.

When has the API for data types ever changed?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: guid/uuid datatype

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > I should have been clearer.  I think having it in the main server or
> > /contrib makes sense.  Having data types external to our source tree
> > doesn't seem to work too well because of changes in our API from time
> > to time.
>
> When has the API for data types ever changed?

The API doesn't change, but the way to do things inside the type
functions does change sometimes.


Re: guid/uuid datatype

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Peter Eisentraut wrote:
>> When has the API for data types ever changed?

> The API doesn't change, but the way to do things inside the type
> functions does change sometimes.

We've always done our best not to break user-defined datatypes without
need.  uuid doesn't seem to need any hooks into the core system that
would make it any more likely to break than anything else.

Per previous discussion, the main problem with a uuid type is the
new-uuid generator function, which tends to involve a bunch of
not-so-portable assumptions and code.  If we accept a uuid type in
either core or contrib, all of a sudden those portability issues are
our problem.  I'd rather not deal with that.

I'd be willing to accept a core uuid type sans generator function,
but is that really all that useful?

            regards, tom lane

Re: guid/uuid datatype

From
"Joshua D. Drake"
Date:
> Per previous discussion, the main problem with a uuid type is the
> new-uuid generator function, which tends to involve a bunch of
> not-so-portable assumptions and code.  If we accept a uuid type in
> either core or contrib, all of a sudden those portability issues are
> our problem.  I'd rather not deal with that.
>
> I'd be willing to accept a core uuid type sans generator function,
> but is that really all that useful?

I think it would. There are plenty of client side libraries that
generate uuid, at least we could provide a native type for them to use.
A generator would be great too of course, but if they really need one
they could use one of the pl languages for it.

Sincerely,

Joshua D. Drake




--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: guid/uuid datatype

From
"Joshua D. Drake"
Date:
Joshua D. Drake wrote:
>> Per previous discussion, the main problem with a uuid type is the
>> new-uuid generator function, which tends to involve a bunch of
>> not-so-portable assumptions and code.  If we accept a uuid type in
>> either core or contrib, all of a sudden those portability issues are
>> our problem.  I'd rather not deal with that.
>>
>> I'd be willing to accept a core uuid type sans generator function,
>> but is that really all that useful?
>
> I think it would. There are plenty of client side libraries that
> generate uuid, at least we could provide a native type for them to use.
> A generator would be great too of course, but if they really need one
> they could use one of the pl languages for it.

As a follow-up to this: both Java and Python have built-in uuid
generators, and as we all know both are extremely portable languages.
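To illustrate client-side generation, here is a minimal sketch using
Python's standard uuid module. The helper name new_row_id is
hypothetical, and how the value is then sent to the server is left out.

```python
import uuid


def new_row_id() -> str:
    """Generate a random (version 4) UUID on the client.

    Returns the canonical text form, which the client could pass to the
    server as an ordinary string parameter.
    """
    return str(uuid.uuid4())


row_id = new_row_id()
# The canonical form is 32 hex digits plus 4 hyphens.
print(row_id, len(row_id))
```

The server never needs a generator in this arrangement; it only has to
accept and store the value.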

Sincerely,

Joshua D. Drake



Re: guid/uuid datatype

From
Gevik Babakhani
Date:
> I'd be willing to accept a core uuid type sans generator function,
> but is that really all that useful?
>
 This is also a point I remember from the last discussions: not to
include the generator in the core. The generation of the uuid is then
going to be on the client side.

The uuid type is very useful, especially when migrating from other
systems to pg  (ms->pg or syb->pg).

Regards,
Gevik.



Re: guid/uuid datatype

From
Magnus Hagander
Date:
Gevik Babakhani wrote:
>> I'd be willing to accept a core uuid type sans generator function,
>> but is that really all that useful?
>>
>  This is also a point I remember from the last discussions: not to
> include the generator in the core. The generation of the uuid is then
> going to be on the client side.
>
> The uuid type is very useful, especially when migrating from other
> systems to pg  (ms->pg or syb->pg).

But does it really help if you don't have the generator?

I don't use UUIDs much myself, but I think in all cases I've seen that
use the uuid type in SQL Server they're also using the generator function.
Those that just store UUIDs in the database often just use varchar - in
order to be more portable, I guess.

Not saying it wouldn't be good to have uuid for portability, I'm just a
bit unsure of how much use it is without a generator function...

//Magnus

Re: guid/uuid datatype

From
Gevik Babakhani
Date:
>
> But does it really help if you don't have the generator?
>
> I don't use UUIDs much myself, but I think in all cases I've seen that
> use the uuid type in SQL Server they're also using the generator function.
> Those that just store UUIDs in the database often just use varchar - in
> order to be more portable, I guess.
>

There could be many algorithms to generate a guid. I guess we will get
into a big debate on that, which would not be very useful, I guess
(judging by the posts last year).

In most cases I have seen, the guid is generated by the client. With
MS SQL Server it can also be generated on the server, but in our case
we generate the guids ourselves, because with our algorithm we can trace
a guid back to exactly where it originated (an application requirement).

One thing is for sure: using varchar to store guids (varchar(32)) is
not that efficient.
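To make the size argument concrete, here is a minimal Python sketch
comparing the text forms of a UUID with its raw 128-bit value. The
sample value is made up, and Python's uuid module is used only for
illustration; it says nothing about how the patch stores the type.

```python
import uuid

# Hypothetical sample value, chosen only for illustration.
u = uuid.UUID("a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11")

hex_form = u.hex          # 32 hex characters, what varchar(32) would hold
canonical_form = str(u)   # 36 characters including the hyphens
binary_form = u.bytes     # the raw 128-bit value

print(len(hex_form), len(canonical_form), len(binary_form))

# The binary form loses nothing: it round-trips to the same UUID.
assert uuid.UUID(bytes=binary_form) == u
```

The text form needs at least twice the bytes of the raw value, before
any per-value overhead of a variable-length string is counted.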

Regards,
Gevik.




Re: guid/uuid datatype

From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> Gevik Babakhani wrote:
> >> I'd be willing to accept a core uuid type sans generator function,
> >> but is that really all that useful?
> >>
> >  This is also a point I remember from the last discussions: not to
> > include the generator in the core. The generation of the uuid is then
> > going to be on the client side.
> >
> > The uuid type is very useful, especially when migrating from other
> > systems to pg  (ms->pg or syb->pg).
>
> But does it really help if you don't have the generator?

We could have all the type code in core, and the generator in contrib or
pgfoundry.  That way the user can choose the most appropriate generator,
even if it's platform-specific.  Or he can choose to use a client-side
generator.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: guid/uuid datatype

From
Stefan Kaltenbrunner
Date:
Alvaro Herrera wrote:
> Magnus Hagander wrote:
>> Gevik Babakhani wrote:
>>>> I'd be willing to accept a core uuid type sans generator function,
>>>> but is that really all that useful?
>>>>
>>>  This is also a point I remember from the last discussions: not to
>>> include the generator in the core. The generation of the uuid is then
>>> going to be on the client side.
>>>
>>> The uuid type is very useful, especially when migrating from other
>>> systems to pg  (ms->pg or syb->pg).
>> But does it really help if you don't have the generator?
>
> We could have all the type code in core, and the generator in contrib or
> pgfoundry.  That way the user can choose the most appropriate generator,
> even if it's platform-specific.  Or he can choose to use a client-side
> generator.

that seems like a good compromise - have the type in core and generators
in contrib/pgfoundry. In one or two releases we might see some feedback
on the portability and how people use those and could decide on leaving
it that way or move the generators into core as well.


Stefan

Re: guid/uuid datatype

From
Peter Eisentraut
Date:
Gevik Babakhani wrote:
> There could be many algorithms to generate a guid. I guess we will
> get into a big debate on that, which would not be very useful, I guess
> (judging by the posts last year).

There are a handful of standardized or established algorithms, so it
doesn't hurt to provide them all.  Compare pgcrypto -- certainly no one
needs all those encryption algorithms, but we offer them.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: guid/uuid datatype

From
Gevik Babakhani
Date:
What is the next step now? Is there going to be a review by a committer?
If so, please note that the OIDs in the patch have to be changed.

Regards,
Gevik



Re: guid/uuid datatype

From
Neil Conway
Date:
On Fri, 2007-01-19 at 23:00 -0500, Tom Lane wrote:
> Per previous discussion, the main problem with a uuid type is the
> new-uuid generator function, which tends to involve a bunch of
> not-so-portable assumptions and code.

RFC 4122 specifies several ways of generating UUIDs:

* via the computer's MAC address and the time since the Gregorian epoch
in 100-nanosecond intervals

* via MD5 or SHA1 hashing of a given string, URL, or similar
identifier

* via a PRNG

Only the first of these presents any portability concerns, AFAICS.
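For reference, the generation methods listed above map onto the
version numbers of RFC 4122. A minimal sketch using Python's standard
uuid module follows; the URL used as the name is an arbitrary example,
and none of this reflects what the submitted patch implements.

```python
import uuid

# Version 1: node (normally the MAC address) plus a timestamp.
# Python falls back to a random node value if it cannot determine a
# hardware address, which hints at the portability problem.
v1 = uuid.uuid1()

# Versions 3 and 5: MD5 / SHA-1 hash of a namespace UUID plus a name.
name = "http://www.postgresql.org/"  # arbitrary example name
v3 = uuid.uuid3(uuid.NAMESPACE_URL, name)
v5 = uuid.uuid5(uuid.NAMESPACE_URL, name)

# Version 4: random bits.
v4 = uuid.uuid4()

for u in (v1, v3, v4, v5):
    print(u.version, u)

# Name-based UUIDs are deterministic: same namespace and name, same UUID.
assert v5 == uuid.uuid5(uuid.NAMESPACE_URL, name)
```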

-Neil



Re: guid/uuid datatype

From
Neil Conway
Date:
On Sat, 2007-01-20 at 14:56 +0100, Gevik Babakhani wrote:
> What is the next step now? Is there going to be a review by a committer?
> If so, please note that the OIDs in the patch have to be changed.

I already suggested a few things you could improve in the patch. If this
discussion concludes that the patch should be included in the core
backend and you submit a revised patch, I'd be happy to review and apply
it.

-Neil



Re: guid/uuid datatype

From
Gevik Babakhani
Date:
> I already suggested a few things you could improve in the patch. If this
> discussion concludes that the patch should be included in the core
> backend and you submit a revised patch, I'd be happy to review and apply
> it.

So.. do we agree for uuid to be included in the core?

If so.. I will change the assigned OIDs in the patch to match the
current source-tree and update the code with the suggestions provided by
Neil.

Are we okay on this?

Regards,
Gevik.




Re: guid/uuid datatype

From
Neil Conway
Date:
On Sun, 2007-01-21 at 00:17 +0100, Gevik Babakhani wrote:
> So.. do we agree for uuid to be included in the core?

I'd be curious to know the degree to which the proposed patch is
consistent with RFC 4122, which AFAIK is the most recent relevant
standard.

With regard to functions for generating UUIDs, I think we should at
least include the methods specified by RFC 4122 that can be implemented
without too many unportable assumptions. I believe that means MD5 & SHA1
hashing of an arbitrary identifier, and UUIDs generated via a PRNG.

-Neil



Re: guid/uuid datatype

From
Stefan Kaltenbrunner
Date:
Neil Conway wrote:
> On Sun, 2007-01-21 at 00:17 +0100, Gevik Babakhani wrote:
>> So.. do we agree for uuid to be included in the core?
>
> I'd be curious to know the degree to which the proposed patch is
> consistent with RFC 4122, which AFAIK is the most recent relevant
> standard.
>
> With regard to functions for generating UUIDs, I think we should at
> least include the methods specified by RFC 4122 that can be implemented
> without too many unportable assumptions. I believe that means MD5 & SHA1
> hashing of an arbitrary identifier, and UUIDs generated via a PRNG.

I thought the consensus was to provide only the datatype initially and
look into providing the generator functions later or via an external
project (pgfoundry or contrib/).


Stefan

Re: guid/uuid datatype

From
Peter Eisentraut
Date:
Gevik Babakhani wrote:
> So.. do we agree for uuid to be included in the core?

I suggest that you read the discussion in the tsearch thread about
figuring out how to make contrib modules more attractive.  I don't see
a reason why uuid has to be in the core, but I do see that there needs
to be some centrally organized consolidation of the various existing
attempts under a label of officiality.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: guid/uuid datatype

From
Neil Conway
Date:
On Thu, 2007-01-25 at 17:57 +0100, Stefan Kaltenbrunner wrote:
> I thought the consensus was to provide only the datatype initially and
> look into providing the generator functions later or via an external
> project (pgfoundry or contrib/).

I don't think distributing the (portable) generator functions separately
makes a lot of sense. For the generation methods that just depend on
md5() or random(), we may as well include them in the backend if we're
going to include the rest of the UUID stuff.
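As a rough illustration of why these methods need nothing beyond a
hash function or a random source, here is a sketch in Python. The
function names uuid_v3 and uuid_v4 are hypothetical; the bit layout
follows RFC 4122, and the result is checked against Python's own
uuid.uuid3().

```python
import hashlib
import os
import uuid


def _stamp(raw: bytes, version: int) -> uuid.UUID:
    """Overwrite the version and variant bits of a 16-byte value."""
    b = bytearray(raw[:16])
    b[6] = (b[6] & 0x0F) | (version << 4)  # high nibble of byte 6: version
    b[8] = (b[8] & 0x3F) | 0x80            # top two bits of byte 8: variant 10
    return uuid.UUID(bytes=bytes(b))


def uuid_v3(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Name-based UUID: MD5 over the namespace bytes followed by the name."""
    digest = hashlib.md5(namespace.bytes + name.encode("utf-8")).digest()
    return _stamp(digest, 3)


def uuid_v4() -> uuid.UUID:
    """Random UUID: 16 random bytes with version and variant stamped in."""
    return _stamp(os.urandom(16), 4)


# Cross-check against the standard library's implementation.
assert uuid_v3(uuid.NAMESPACE_DNS, "postgresql.org") == uuid.uuid3(
    uuid.NAMESPACE_DNS, "postgresql.org"
)
print(uuid_v3(uuid.NAMESPACE_DNS, "postgresql.org"), uuid_v4())
```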

The MAC-based generator function could also be included in the backend,
actually: it just needs to take an argument of type "macaddr". It would
then be up to the user (and/or various pgfoundry and contrib/ modules)
to find a way to determine the local machine's MAC address, which
presumably can't be done reliably in a portable fashion.
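The idea of passing the MAC address in as an argument can be sketched
as follows, again in Python for illustration only. The function name
uuid_from_mac and the sample address are made up; uuid.uuid1() accepts
the 48-bit node value through its node parameter.

```python
import uuid


def uuid_from_mac(mac: str) -> uuid.UUID:
    """Build a version 1 UUID from a caller-supplied MAC address.

    The caller decides where the address comes from, so no
    platform-specific lookup is needed here.
    """
    node = int(mac.replace(":", "").replace("-", ""), 16)
    if not 0 <= node < (1 << 48):
        raise ValueError("MAC address must fit in 48 bits")
    return uuid.uuid1(node=node)


u = uuid_from_mac("08:00:2b:01:02:03")  # made-up sample address
# The node field of the result carries the supplied address.
print(u, u.version, format(u.node, "012x"))
```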

-Neil



Re: guid/uuid datatype

From
"Joshua D. Drake"
Date:
Peter Eisentraut wrote:
> Gevik Babakhani wrote:
>> So.. do we agree for uuid to be included in the core?
>
> I suggest that you read the discussion in the tsearch thread about
> figuring out how to make contrib modules more attractive.  I don't see
> a reason why uuid has to be in the core, but I do see that there needs
> to be some centrally organized consolidation of the various existing
> attempts under a label of officiality.

I think it would be more important to determine how we can get UUID in
core. It is a known and accepted way of doing things in the marketplace
and professional communities.

It is time we stop fighting features for the sake of fighting features.

Sincerely,

Joshua D. Drake




Re: guid/uuid datatype

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Thu, 2007-01-25 at 17:57 +0100, Stefan Kaltenbrunner wrote:
> > I thought the consensus was to provide only the datatype initially and
> > look into providing the generator functions later or via an external
> > project (pgfoundry or contrib/).
>
> I don't think distributing the (portable) generator functions separately
> makes a lot of sense. For the generation methods that just depend on
> md5() or random(), we may as well include them in the backend if we're
> going to include the rest of the UUID stuff.
>
> The MAC-based generator function could also be included in the backend,
> actually: it just needs to take an argument of type "macaddr". It would
> then be up to the user (and/or various pgfoundry and contrib/ modules)
> to find a way to determine the local machine's MAC address, which
> presumably can't be done reliably in a portable fashion.

I assume we could just allow the MAC address or some unique identifier to
be specified in postgresql.conf.


Re: guid/uuid datatype

From
"Joshua D. Drake"
Date:
>> The MAC-based generator function could also be included in the backend,
>> actually: it just needs to take an argument of type "macaddr". It would
>> then be up to the user (and/or various pgfoundry and contrib/ modules)
>> to find a way to determine the local machine's MAC address, which
>> presumably can't be done reliably in a portable fashion.
>
> I assume we could just allow the MAC address or some unique identifier to
> be specified in postgresql.conf.

Well at that point why don't we allow it to be specified per database?

ALTER DATABASE foo SET uuid_salt =

:)

That would be pretty cool, but I am sure most will shoot that down in
flames :)

Sincerely,

Joshua D. Drake




Re: guid/uuid datatype

From
Gevik Babakhani
Date:
> I thought the consensus was to provide only the datatype initially and
> look into providing the generator functions later or via an external
> project (pgfoundry or contrib/).

This was my understanding too... to include the uuid in the core and let
the actual value be generated elsewhere...(client or separate
project)...

I do not think having a uuid datatype as contrib module separately will
do us much good. All professional dbs support this as built in. So why
shouldn't we...

Regards,
Gevik.



Re: guid/uuid datatype

From
Jim Nasby
Date:
On Jan 25, 2007, at 11:13 AM, Peter Eisentraut wrote:
> I suggest that you read the discussion in the tsearch thread about
> figuring out how to make contrib modules more attractive.  I don't see
> a reason why uuid has to be in the core, but I do see that there needs
> to be some centrally organized consolidation of the various existing
> attempts under a label of officiality.

Following that logic we should remove all data types that aren't
specified in ANSI.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: guid/uuid datatype

From
Peter Eisentraut
Date:
Jim Nasby wrote:
> Following that logic we should remove all data types that aren't
> specified in ANSI.

Sure, if we were to arrive at some acceptable implementation of official
modules, that would make sense in some cases.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/