Thread: guid/uuid datatype
Hi, A while ago (Sep 2006) I sent a patch for the UUID datatype. Has anyone had time to review it yet? Here it is again :) Regards, Gevik
On Fri, 2007-01-19 at 10:25 +0100, Gevik Babakhani wrote: > While ago (sep-2006) I sent a patch for the UUID datatype, Did anyone > have time to review it yet? I confess I haven't followed the discussion around this patch, but is there a compelling reason to include this in the backend proper, rather than contrib/? -Neil
> I confess I haven't followed the discussion around this patch, but is > there a compelling reason to include this in the backend proper, rather > than contrib/? AFAIK, It is/was part of the TODO for the core.
On Sat, 2007-01-20 at 00:21 +0100, Gevik Babakhani wrote: > AFAIK, It is/was part of the TODO for the core. Well, I don't have a strong opinion either way, but I think it should be given some thought. As far as the code, looks pretty good. A few minor comments: * varchar_uuid() should be named uuid_varchar(), for consistency with the other function names. In fact, uuid_text() and varchar_uuid() are essentially identical, so they should be refactored. The fmgr interface macros can stay, I guess. * most of uuid.h can be gotten rid of: the SQL-callable functions are already declared in builtins.h, and most of the other declarations should be moved to uuid.c and made local to that file. * needs documentation -Neil
Neil Conway wrote: > On Sat, 2007-01-20 at 00:21 +0100, Gevik Babakhani wrote: > > AFAIK, It is/was part of the TODO for the core. > > Well, I don't have a strong opinion either way, but I think it should be > given some thought. > > As far as the code, looks pretty good. A few minor comments: > > * varchar_uuid() should be named uuid_varchar(), for consistency with > the other function names. In fact, uuid_text() and varchar_uuid() are > essentially identical, so they should be refactored. The fmgr interface > macros can stay, I guess. > > * most of uuid.h can be gotten rid of: the SQL-callable functions are > already declared in builtins.h, and most of the other declarations > should be moved to uuid.c and made local to that file. > > * needs documentation I think having it in core makes the most sense. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Fri, 2007-01-19 at 19:19 -0500, Bruce Momjian wrote: > I think having it in core makes the most sense. Why is that? One question that comes to mind is how the submitted patch compares in functionality to the other implementations of the UUID concept for PostgreSQL, such as OSSP uuid (which implements a Postgres UDT in its CVS version), or the UUID project on gborg. -Neil
Neil Conway wrote: > On Fri, 2007-01-19 at 19:19 -0500, Bruce Momjian wrote: > > I think having it in core makes the most sense. > > Why is that? I should have been clearer. I think having it in the main server or /contrib makes sense. Having data types external to our source tree doesn't seem to work too well because of changes in our API from time to time. I think the UUID type has enough usage to warrant us maintaining it. > One question that comes to mind is how the submitted patch compares in > functionality to the other implementations of the UUID concept for > PostgreSQL, such as OSSP uuid (which implements a Postgres UDT in its > CVS version), or the UUID project on gborg. No idea. They all have to be researched, and if we find they all have different strengths, I am afraid we will have to keep them all external. -- Bruce Momjian
Bruce Momjian wrote: > I should have been clearer. I think having in the main server or > /contrib makes sense. Having data types external to our source tree > doesn't seem to work too well because of changes in our API from time > to time. When has the API for data types ever changed? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut wrote: > Bruce Momjian wrote: > > I should have been clearer. I think having in the main server or > > /contrib makes sense. Having data types external to our source tree > > doesn't seem to work too well because of changes in our API from time > > to time. > > When has the API for data types ever changed? The API doesn't change, but the way to do things inside the type functions does change sometimes. -- Bruce Momjian
Bruce Momjian <bruce@momjian.us> writes: > Peter Eisentraut wrote: >> When has the API for data types ever changed? > The API doesn't change, but the way to do things inside the type > functions does changes sometimes. We've always done our best not to break user-defined datatypes without need. uuid doesn't seem to need any hooks into the core system that would make it any more likely to break than anything else. Per previous discussion, the main problem with a uuid type is the new-uuid generator function, which tends to involve a bunch of not-so-portable assumptions and code. If we accept a uuid type in either core or contrib, all of a sudden those portability issues are our problem. I'd rather not deal with that. I'd be willing to accept a core uuid type sans generator function, but is that really all that useful? regards, tom lane
> Per previous discussion, the main problem with a uuid type is the > new-uuid generator function, which tends to involve a bunch of > not-so-portable assumptions and code. If we accept a uuid type in > either core or contrib, all of a sudden those portability issues are > our problem. I'd rather not deal with that. > > I'd be willing to accept a core uuid type sans generator function, > but is that really all that useful? I think it would. There are plenty of client side libraries that generate uuid, at least we could provide a native type for them to use. A generator would be great too of course, but if they really need one they could use one of the pl languages for it. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua D. Drake wrote: >> Per previous discussion, the main problem with a uuid type is the >> new-uuid generator function, which tends to involve a bunch of >> not-so-portable assumptions and code. If we accept a uuid type in >> either core or contrib, all of a sudden those portability issues are >> our problem. I'd rather not deal with that. >> >> I'd be willing to accept a core uuid type sans generator function, >> but is that really all that useful? > > I think it would. There are plenty of client side libraries that > generate uuid, at least we could provide a native type for them to use. > A generator would be great too of course, but if they really need one > they could use one of the pl languages for it. As a follow-up to this, both Java and Python have built-in uuid generators, and as we all know both are extremely portable languages. Sincerely, Joshua D. Drake
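For illustration, client-side generation with Python's standard `uuid` module might look like the sketch below. The helper name `new_row_id` is hypothetical and not part of the patch or of any library. The assumption is that the client sends the canonical text form to the server, which would parse it into the native type.

```python
import uuid


def new_row_id() -> str:
    """Generate a random (version 4) UUID on the client.

    Returns the canonical 36-character text form, which a server-side
    uuid column would be expected to accept as input.
    """
    return str(uuid.uuid4())


row_id = new_row_id()
print(row_id)
```

Nothing here depends on the server, which is the point made above: a core type without a generator is still usable when the application creates the values.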
> I'd be willing to accept a core uuid type sans generator function, > but is that really all that useful? > This is also a point I remember from the last discussions: not to include the generator in the core. The generation of the uuid is then going to be on the client side. The uuid type is very useful, especially when migrating from other systems to pg (ms->pg or syb->pg). Regards, Gevik.
Gevik Babakhani wrote: >> I'd be willing to accept a core uuid type sans generator function, >> but is that really all that useful? >> > This is also a point I remember from the last discussions. To not to > include the generator in the core. The generation of the uuid is then > going to be on the client side. > > The uuid type is very useful, especially when migrating from other > systems to pg (ms->pg or syb->pg). But does it really help if you don't have the generator? I don't use UUIDs much myself, but I think in all cases I've seen that use the uuid type in SQL Server they're also using the generator function. Those that just store UUIDs in the database often just use varchar - in order to be more portable, I guess. Not saying it wouldn't be good to have uuid for portability, I'm just a bit unsure of how much use it is without a generator function... //Magnus
> > But does it really help if you don't have the generator? > > I don't use UUIDs much myself, but I think in all cases I've seen that > use the uuid type in SQL Server they're also using the generator function. > Those that just store UUIDs in the database often just uses varchar - in > order to be more portable, I guess. > There could be many algorithms to generate a guid. I guess we would get into a big debate on that, which would not be very useful (seeing the posts last year). In most cases I have seen, the guid is generated by the client. With MS SQL Server it can also be generated on the server, but in our case we generate the guids ourselves, because with our algorithm we can trace a guid back to exactly where it originated (an app requirement). One thing is for sure: using varchar to store guids ( varchar(32) ) is not that efficient. Regards, Gevik.
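To put rough numbers on the efficiency point, the sketch below compares the size of the three usual representations of one UUID using Python's standard `uuid` module. The sample value is arbitrary, and the figures are raw payload sizes only; whatever per-value overhead a database adds for variable-length text comes on top.

```python
import uuid

# An arbitrary sample value, used only for illustration.
u = uuid.UUID("a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11")

binary_form = u.bytes   # the raw 128-bit value
hex_form = u.hex        # 32 hex digits, no hyphens (fits varchar(32))
text_form = str(u)      # canonical 8-4-4-4-12 form with hyphens

print(len(binary_form), len(hex_form), len(text_form))  # 16 32 36

# The binary form round-trips without loss.
assert uuid.UUID(bytes=binary_form) == u
```

A fixed-width 16-byte type is half the size of the unhyphenated text form, and comparisons work on 16 bytes rather than on strings.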
Magnus Hagander wrote: > Gevik Babakhani wrote: > >> I'd be willing to accept a core uuid type sans generator function, > >> but is that really all that useful? > >> > > This is also a point I remember from the last discussions. To not to > > include the generator in the core. The generation of the uuid is then > > going to be on the client side. > > > > The uuid type is very useful, especially when migrating from other > > systems to pg (ms->pg or syb->pg). > > But does it really help if you don't have the generator? We could have all the type code in core, and the generator in contrib or pgfoundry. That way the user can choose the most appropriate generator, even if it's platform-specific. Or he can choose to use a client-side generator. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Magnus Hagander wrote: >> Gevik Babakhani wrote: >>>> I'd be willing to accept a core uuid type sans generator function, >>>> but is that really all that useful? >>>> >>> This is also a point I remember from the last discussions. To not to >>> include the generator in the core. The generation of the uuid is then >>> going to be on the client side. >>> >>> The uuid type is very useful, especially when migrating from other >>> systems to pg (ms->pg or syb->pg). >> But does it really help if you don't have the generator? > > We could have all the type code in core, and the generator in contrib or > pgfoundry. That way the user can choose the most appropriate generator, > even if it's platform-specific. Or he can choose to use a client-side > generator. that seems like a good compromise - have the type in core and generators in contrib/pgfoundry. In one or two releases we might see some feedback on the portability and how people use those and could decide on leaving it that way or move the generators into core as well. Stefan
Gevik Babakhani wrote: > There could be many algorithms to generate a guid. I guess we will > get into a big debate on that, which is not much useful i guess > (seeing the posts last year). There are a handful of standardized or established algorithms, so it doesn't hurt to provide them all. Compare pgcrypto -- certainly no one needs all those encryption algorithms, but we offer them. -- Peter Eisentraut http://developer.postgresql.org/~petere/
What is the next step now? Is there going to be a review by a committer? If so, please note that the OIDs in the patch have to be changed. Regards, Gevik
On Fri, 2007-01-19 at 23:00 -0500, Tom Lane wrote: > Per previous discussion, the main problem with a uuid type is the > new-uuid generator function, which tends to involve a bunch of > not-so-portable assumptions and code. RFC 4122 specifies several ways of generating UUIDs: * via the computer's MAC address and the time since the Gregorian epoch (15 October 1582), counted in 100-nanosecond intervals * via MD5 or SHA-1 hashing of a given name, such as a URL or similar identifier * via a PRNG Only the first of these presents any portability concerns, AFAICS. -Neil
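The generation methods listed above map onto RFC 4122 versions 1, 3, 5 and 4. As a rough illustration (using Python's standard `uuid` module, not the submitted patch), each can be produced like this; the URL passed as the name is just an example input.

```python
import uuid

# Version 1: timestamp plus node (normally the MAC address).
v1 = uuid.uuid1()

# Versions 3 and 5: MD5 / SHA-1 hash of a namespace UUID plus a name.
v3 = uuid.uuid3(uuid.NAMESPACE_URL, "http://www.postgresql.org/")
v5 = uuid.uuid5(uuid.NAMESPACE_URL, "http://www.postgresql.org/")

# Version 4: random bits.
v4 = uuid.uuid4()

for u in (v1, v3, v4, v5):
    print(u.version, u)
```

Python's `uuid1()` has to look up a hardware address and falls back to a random node value when it cannot find one, which is the kind of platform-specific behaviour the portability concern refers to. The name-based versions are deterministic: the same namespace and name always give the same UUID.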
On Sat, 2007-01-20 at 14:56 +0100, Gevik Babakhani wrote: > what is the next step now? is there going to be review by a committer? > if so, please note that the OIDs in the patch have to be changed. I already suggested a few things you could improve in the patch. If this discussion concludes that the patch should be included in the core backend and you submit a revised patch, I'd be happy to review and apply it. -Neil
> I already suggested a few things you could improve in the patch. If this > discussion concludes that the patch should be included in the core > backend and you submit a revised patch, I'd be happy to review and apply > it. So.. do we agree for uuid to be included in the core? If so.. I will change the assigned OIDs in the patch to match the current source-tree and update the code with the suggestions provided by Neil. Are we okay on this? Regards, Gevik.
On Sun, 2007-01-21 at 00:17 +0100, Gevik Babakhani wrote: > So.. do we agree for uuid to be included in the core? I'd be curious to know the degree to which the proposed patch is consistent with RFC 4122, which AFAIK is the most recent relevant standard. With regard to functions for generating UUIDs, I think we should at least include the methods specified by RFC 4122 that can be implemented without too many unportable assumptions. I believe that means MD5 & SHA1 hashing of an arbitrary identifier, and UUIDs generated via a PRNG. -Neil
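To show how little a name-based generator needs beyond a hash function, here is a minimal sketch of a version 3 UUID built by hand from MD5 and checked against Python's own `uuid.uuid3`. The function name `uuid3_by_hand` is hypothetical; this is an illustration of the RFC 4122 layout, not code from the patch.

```python
import hashlib
import uuid


def uuid3_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Build a version 3 (MD5, name-based) UUID from its definition."""
    # Hash the 16 namespace bytes followed by the name.
    digest = bytearray(hashlib.md5(namespace.bytes + name.encode("utf-8")).digest())
    # Overwrite the version nibble (high 4 bits of byte 6) with 3.
    digest[6] = (digest[6] & 0x0F) | 0x30
    # Overwrite the variant bits (high 2 bits of byte 8) with binary 10.
    digest[8] = (digest[8] & 0x3F) | 0x80
    return uuid.UUID(bytes=bytes(digest))


name = "http://www.postgresql.org/"
by_hand = uuid3_by_hand(uuid.NAMESPACE_URL, name)
print(by_hand)
assert by_hand == uuid.uuid3(uuid.NAMESPACE_URL, name)
```

The only inputs are a hash and two bit masks, so nothing here is platform-specific.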
Neil Conway wrote: > On Sun, 2007-01-21 at 00:17 +0100, Gevik Babakhani wrote: >> So.. do we agree for uuid to be included in the core? > > I'd be curious to know the degree to which the proposed patch is > consistent with RFC 4122, which AFAIK is the most recent relevant > standard. > > With regard to functions for generating UUIDs, I think we should at > least include the methods specified by RFC 4122 that can be implemented > without too many unportable assumptions. I believe that means MD5 & SHA1 > hashing of an arbitrary identifier, and UUIDs generated via a PRNG. I thought the consensus was to provide only the datatype initially and look into providing the generator functions later or via an external project (pgfoundry or contrib/). Stefan
Gevik Babakhani wrote: > So.. do we agree for uuid to be included in the core? I suggest that you read the discussion in the tsearch thread about figuring out how to make contrib modules more attractive. I don't see a reason why uuid has to be in the core, but I do see that there needs to be some centrally organized consolidation of the various existing attempts under a label of officiality. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Thu, 2007-01-25 at 17:57 +0100, Stefan Kaltenbrunner wrote: > I thought the consensus was to provide only the datatype initially and > look into providing the generator functions later or via an external > project (pgfoundry or contrib/). I don't think distributing the (portable) generator functions separately makes a lot of sense. For the generation methods that just depend on md5() or random(), we may as well include them in the backend if we're going to include the rest of the UUID stuff. The MAC-based generator function could also be included in the backend, actually: it just needs to take an argument of type "macaddr". It would then be up to the user (and/or various pgfoundry and contrib/ modules) to find a way to determine the local machine's MAC address, which presumably can't be done reliably in a portable fashion. -Neil
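The idea of a MAC-based generator that takes the address as an argument can be sketched with Python's standard `uuid.uuid1`, which accepts an explicit `node` value. The wrapper `uuid1_for_mac` and the sample address are hypothetical, chosen only to mirror the proposed "macaddr" argument.

```python
import uuid


def uuid1_for_mac(mac: str) -> uuid.UUID:
    """Generate a version 1 UUID for a caller-supplied MAC address.

    The caller decides where the address comes from, so the generator
    itself never has to probe the hardware.
    """
    node = int(mac.replace(":", ""), 16)  # 48-bit node value
    return uuid.uuid1(node=node)


u = uuid1_for_mac("08:00:2b:01:02:03")
print(u, u.version, hex(u.node))
```

With the address passed in, the unportable part (discovering the local MAC address) moves out of the generator and into whatever supplies the argument.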
Peter Eisentraut wrote: > Gevik Babakhani wrote: >> So.. do we agree for uuid to be included in the core? > > I suggest that you read the discussion in the tsearch thread about > figuring out how to make contrib modules more attractive. I don't see > a reason why uuid has to be in the core, but I do see that there needs > to be some centrally organized consolidation of the various existing > attempts under a label of officiality. I think it would be more important to determine how we can get UUID in core. It is a known and accepted way of doing things in the marketplace and professional communities. It is time we stop fighting features for the sake of fighting features. Sincerely, Joshua D. Drake
Neil Conway wrote: > On Thu, 2007-01-25 at 17:57 +0100, Stefan Kaltenbrunner wrote: > > I thought the consensus was to provide only the datatype initially and > > look into providing the generator functions later or via an external > > project (pgfoundry or contrib/). > > I don't think distributing the (portable) generator functions separately > makes a lot of sense. For the generation methods that just depend on > md5() or random(), we may as well include them in the backend if we're > going to include the rest of the UUID stuff. > > The MAC-based generator function could also be included in the backend, > actually: it just needs to take an argument of type "macaddr". It would > then be up to the user (and/or various pgfoundry and contrib/ modules) > to find a way to determine the local machine's MAC address, which > presumably can't be done reliably in a portable fashion. I assume we could just allow the MAC address or some unique identifier to be specified in postgresql.conf. -- Bruce Momjian
>> The MAC-based generator function could also be included in the backend, >> actually: it just needs to take an argument of type "macaddr". It would >> then be up to the user (and/or various pgfoundry and contrib/ modules) >> to find a way to determine the local machine's MAC address, which >> presumably can't be done reliably in a portable fashion. > > I assume we could just allow the MAC address or some unique identifier to > be specified in postgresql.conf. Well at that point why don't we allow it to be specified per database? ALTER DATABASE foo SET uuid_salt = :) That would be pretty cool, but I am sure most will shoot that down in flames :) Sincerely, Joshua D. Drake
> I thought the consensus was to provide only the datatype initially and > look into providing the generator functions later or via an external > project (pgfoundry or contrib/). This was my understanding too: include the uuid type in the core and let the actual value be generated elsewhere (client or separate project). I do not think having a uuid datatype as a separate contrib module will do us much good. All professional dbs support this as built in. So why shouldn't we... Regards, Gevik.
On Jan 25, 2007, at 11:13 AM, Peter Eisentraut wrote: > I suggest that you read the discussion in the tsearch thread about > figuring out how to make contrib modules more attractive. I don't see > a reason why uuid has to be in the core, but I do see that there needs > to be some centrally organized consolidation of the various existing > attempts under a label of officiality. Following that logic we should remove all data types that aren't specified in ANSI. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Jim Nasby wrote: > Following that logic we should remove all data types that aren't > specified in ANSI. Sure, if we were to arrive at some acceptable implementation of official modules, that would make sense in some cases. -- Peter Eisentraut http://developer.postgresql.org/~petere/