Thread: Patch for UUID datatype (beta)
Folks,
The following patch implements the UUID datatype. I would like to send this beta patch to see if I am still on the right track. Please send your comments.
Description of UUID:
- The type is called uuid.
- btree and hash indexes are supported.
- uuid array is supported.
- uuid text i/o is supported.
- uuid binary i/o is supported.
- uuid_to_text and text_to_uuid casting is supported.
- uuid_to_varchar and varchar_to_uuid casting is supported.
- the < <= = >= > <> operators are supported. Please note that some of these operators mathematically have no meaning and are only good for sorting.
- new_guid() function is supported. This function is based on V4 random uuid value. It generates 16 random bytes with uuid 'variant' and 'version'. It is not guaranteed to produce unique values according to the docs but I have inserted 6 million records and it did not create any duplicates :)
- the uuid datatype supports 3 input formats:
  1. "00000000-0000-0000-0000-000000000000"
  2. "00000000000000000000000000000000"
  3. "{00000000-0000-0000-0000-000000000000}"
- the uuid datatype supports the output format defined by the RFC: "00000000-0000-0000-0000-000000000000"
Areas yet in development and testing:
- uuid array indexing.
- testing with joins (merge,hash,gin)
- new_guid() fail proof testing
- performance testing
- testing with internal storage and compression.
- regression test addition
- proper documentation
- overall sanity testing/checking
Please note that I consider this a beta patch. You can download it from: http://www.truesoftware.net/pgsql/uuid/patch-0.1/
Regards,
Gevik.
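For readers who want to try the three input spellings listed above, here is a minimal Python sketch. It uses Python's standard uuid module as a stand-in and is not the patch's C input routine; the helper name parse_uuid is hypothetical, and the strings use the full 32-hex-digit form from RFC 4122.

```python
import uuid


def parse_uuid(text: str) -> uuid.UUID:
    """Parse a UUID written with hyphens, without hyphens, or wrapped in braces."""
    # uuid.UUID() strips braces and hyphens itself, then expects 32 hex digits.
    return uuid.UUID(text.strip())


inputs = [
    "00000000-0000-0000-0000-000000000000",    # format 1: hyphenated
    "00000000000000000000000000000000",        # format 2: bare hex digits
    "{00000000-0000-0000-0000-000000000000}",  # format 3: braces
]

# Every accepted spelling is printed back in the single hyphenated output form.
canonical = [str(parse_uuid(s)) for s in inputs]
```

All three spellings map to the same value, so comparison and indexing only ever see one representation.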
Gevik Babakhani wrote:
> - new_guid() function is supported. This function is based on V4 random
> uuid value. It generates 16 random bytes with uuid 'variant' and
> 'version'. It is not guaranteed to produce unique values
Isn't guaranteed uniqueness the very attribute that's expected? AFAIK there's a commonly accepted algorithm providing this.
Regards,
Andreas
On Mon, 2006-09-18 at 09:21 +0200, Andreas Pflug wrote:
> Gevik Babakhani wrote:
> > - new_guid() function is supported. This function is based on V4 random
> > uuid value. It generates 16 random bytes with uuid 'variant' and
> > 'version'. It is not guaranteed to produce unique values
>
> Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> there's a commonly accepted algorithm providing this.
uniqueness is never guaranteed. that is according to the RFC docs. However, new_guid() generates a random value in the range of 256^256. The random value is in turn based on PG's randomizer, which is a very good one.
uniqueness is never guaranteed in the sense that there is a tiny chance someone on the other side of the planet might generate the same guid. Or if you set your PC's clock back to the past (1981) you have a tiny chance to generate the same guid twice.
I have been running a test for the past two days; it has generated over 14 million guids with new_guid() and still no duplicates :)
Regards,
Gevik
> Regards,
> Andreas
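As a rough illustration of what a V4 generator does with its 16 random bytes, here is a minimal Python sketch. It is not the patch's new_guid(): the function name new_guid_sketch is hypothetical, and os.urandom stands in for PostgreSQL's randomizer. Note that 16 bytes give 256^16 = 2^128 bit patterns, and fixing the version and variant fields leaves 122 random bits.

```python
import os
import uuid


def new_guid_sketch() -> uuid.UUID:
    """Build a version 4 UUID from 16 random bytes (illustrative only)."""
    raw = bytearray(os.urandom(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # high nibble of byte 6 holds the version (4)
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 hold the RFC 4122 variant (10)
    return uuid.UUID(bytes=bytes(raw))


# 128 bits in total, minus 4 version bits and 2 variant bits.
RANDOM_BITS = 128 - 4 - 2

guid = new_guid_sketch()
```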
Gevik,
>uniqueness is never guaranteed. that is according to the RFC docs.
>uniqueness is never guaranteed in the sense that there is a tiny
>chance someone on the other side of the planet might generate the same
>guid.
As far as I have learned, it is recommended to give information about the "grade of uniqueness". I think it would be valuable to know which information your UUID generator takes into account, and what the "grade of uniqueness" is.
(I know of the Windows UUID, which takes the MAC address of the installed Ethernet card into its calculation, which may be guaranteed to be unique)
Some more questions about UUIDs and your patch:
a) compatibility of UUIDs -> I have generated a lot of UUIDs via the WIN32 provided function (for the unix-only-people: Windows uses UUIDs all around its registry, software IDs and on and on). How unique are those UUIDs when mixed with "your" UUIDs ?
b) I read some time ago about the problems with UUIDs as primary keys in contrast to serials: serials get produced in ascending order; and often data which was produced in one timespan is also connected semantically. "near serial values" are also local within a btree-index; but UUIDs generated in "near times" are usually spread around the possible bitranges.
(example for sequence of serials: 1 - 2 - 3 - 4 - 5 - 6
example for sequence of UUIDs : 1 - 999919281921843191 - 782 - 18291831912318971231)
that is supposed to affect the locality of the index, and from that also the performance of the system.
I do not know how valid this information is, so I am asking you for your feedback, especially since you put a lot of thought into this UUID patch. Maybe you already took care of this situation when constructing the index operators?
Thanks
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.
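The locality concern in question b) can be made concrete with a small Python sketch. It uses a sorted Python list as a crude stand-in for btree key order, which is an assumption of this example and says nothing about PostgreSQL's actual btree behaviour; the helper name insert_positions is hypothetical.

```python
import bisect
import uuid


def insert_positions(keys):
    """For each key, return its relative insert position (0.0 to 1.0) in a sorted list."""
    ordered, positions = [], []
    for key in keys:
        idx = bisect.bisect_left(ordered, key)
        # 1.0 means "appended at the right-hand end", like an ascending serial.
        positions.append(idx / len(ordered) if ordered else 1.0)
        ordered.insert(idx, key)
    return positions


serial_pos = insert_positions(range(1, 1001))
uuid_pos = insert_positions([uuid.uuid4().int for _ in range(1000)])

# Count how many inserts landed at the right-hand end of the ordering.
serial_at_end = sum(p == 1.0 for p in serial_pos)
uuid_at_end = sum(p == 1.0 for p in uuid_pos)
```

Serial keys always land at the end, while random UUIDs land almost anywhere, which is the access pattern the question is worried about.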
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> there's a commonly accepted algorithm providing this.
Anyone who thinks UUIDs are guaranteed unique has been drinking too much of the kool-aid. They're at best probably unique. Some generator algorithms might make it more probable than others, but you simply cannot "guarantee" it for UUIDs generated on noncommunicating machines.
One of the big reasons that I'm hesitant to put a UUID generation function into core is the knowledge that none of them are or can be perfect ... so people might need different ones depending on local conditions. I'm inclined to think that a reasonable setup would put the datatype (with input, output, comparison and indexing support) into core, but provide a generation function as a contrib module, making it easily replaceable.
regards, tom lane
Completely agreed. I can remove the function from the patch. The temptation was just too high not to include the new_guid() in the patch :)
On Mon, 2006-09-18 at 10:33 -0400, Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> > there's a commonly accepted algorithm providing this.
>
> Anyone who thinks UUIDs are guaranteed unique has been drinking too much
> of the kool-aid. They're at best probably unique. Some generator
> algorithms might make it more probable than others, but you simply
> cannot "guarantee" it for UUIDs generated on noncommunicating machines.
>
> One of the big reasons that I'm hesitant to put a UUID generation
> function into core is the knowledge that none of them are or can be
> perfect ... so people might need different ones depending on local
> conditions. I'm inclined to think that a reasonable setup would put
> the datatype (with input, output, comparison and indexing support)
> into core, but provide a generation function as a contrib module,
> making it easily replaceable.
>
> regards, tom lane
On Mon, Sep 18, 2006 at 10:33:22AM -0400, Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> > there's a commonly accepted algorithm providing this.
> Anyone who thinks UUIDs are guaranteed unique has been drinking too much
> of the kool-aid. They're at best probably unique. Some generator
> algorithms might make it more probable than others, but you simply
> cannot "guarantee" it for UUIDs generated on noncommunicating machines.
The versions that include a MAC address, time, and serial number for the machine come pretty close, presuming that the user has not overwritten the MAC address with something else. It's unique at manufacturing time. If the generation is performed from a library with the same state, on the same machine, on the off chance that you do request multiple generations at the same exact time (from my experience, this is already unlikely) the serial number should be bumped for that time.
So yeah - if you set your MAC address, or if your machine time is ever set back, or if you assume a serial number of 0 each time (generation routine isn't shared among processes on the system), you can get overlap. All of these can be controlled, making it possible to eliminate overlap.
> One of the big reasons that I'm hesitant to put a UUID generation
> function into core is the knowledge that none of them are or can be
> perfect ... so people might need different ones depending on local
> conditions. I'm inclined to think that a reasonable setup would put
> the datatype (with input, output, comparison and indexing support)
> into core, but provide a generation function as a contrib module,
> making it easily replaceable.
I have UUID generation in core in my current implementation. In the last year that I've been using it, I have already chosen twice to generate UUIDs from my calling program. I find it faster, as it avoids having to call out to PostgreSQL twice. Once to generate the UUID, and once to insert the row using it. I have no strong need for UUID generation to be in core, and believe there do exist strong reasons not to. Performance is better when not in core. Portability of PostgreSQL is better when not in core. Ability to control how UUID is defined is better when not in core.
The only thing an in-core version provides is convenience for those that do not have easy access to a UUID generation library. I don't care for that convenience.
Cheers,
mark
--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Neighbourhood Coder
Ottawa, Ontario, Canada
One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them...
http://mark.mielke.cc/
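The "MAC address, time, and serial number" layout described above corresponds to a version 1 UUID. A minimal Python sketch, using the standard uuid module rather than any PostgreSQL code, shows the three fields; the node value and the clock sequence of 5 are made-up example inputs.

```python
import uuid

# Hypothetical 48-bit node (MAC address) and clock sequence chosen for the example.
NODE = 0x001122334455
CLOCK_SEQ = 5

first = uuid.uuid1(node=NODE, clock_seq=CLOCK_SEQ)
second = uuid.uuid1(node=NODE, clock_seq=CLOCK_SEQ)

# The fields can be read back from the generated value:
#   .node      -> the 48-bit node identifier
#   .clock_seq -> the 14-bit sequence ("serial number")
#   .time      -> 60-bit count of 100 ns intervals since 1582-10-15
fields = (first.node, first.clock_seq, first.time)
```

Within one Python process the module bumps the timestamp when two calls would otherwise share it, which is the single-process version of the "bump the serial number" rule; nothing here coordinates between separate processes.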
If you're going to yank it, please at least include a generator in contrib.
Personally, I'd like to see at least some kind of generator in core, with appropriate info/disclaimers in the docs. A simple random-number generator is probably the best way to go in that regard. I think that most people know that UUID generation isn't 100.00000% perfect.
BTW, at a former company we used SHA1s to identify files that had been uploaded. We were wondering about the odds of 2 different files hashing to the same value and found some statistical comparisons of probabilities. I don't recall the details, but the odds of duplicating a SHA1 (1 in 2^160) are so insanely small that it's hard to find anything in the physical world that compares. To duplicate random 256^256 numbers you'd probably have to search until the heat-death of the universe.
On Mon, Sep 18, 2006 at 05:14:22PM +0200, Gevik Babakhani wrote:
> Completely agreed. I can remove the function from the patch. The
> temptation was just too high not to include the new_guid() in the
> patch :)
>
> On Mon, 2006-09-18 at 10:33 -0400, Tom Lane wrote:
> > Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > > Isn't guaranteed uniqueness the very attribute that's expected? AFAIK
> > > there's a commonly accepted algorithm providing this.
> >
> > Anyone who thinks UUIDs are guaranteed unique has been drinking too much
> > of the kool-aid. They're at best probably unique. Some generator
> > algorithms might make it more probable than others, but you simply
> > cannot "guarantee" it for UUIDs generated on noncommunicating machines.
> >
> > One of the big reasons that I'm hesitant to put a UUID generation
> > function into core is the knowledge that none of them are or can be
> > perfect ... so people might need different ones depending on local
> > conditions. I'm inclined to think that a reasonable setup would put
> > the datatype (with input, output, comparison and indexing support)
> > into core, but provide a generation function as a contrib module,
> > making it easily replaceable.
> >
> > regards, tom lane
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Mon, Sep 18, 2006 at 04:00:22PM -0500, Jim C. Nasby wrote:
> BTW, at a former company we used SHA1s to identify files that had been
> uploaded. We were wondering on the odds of 2 different files hashing to
> the same value and found some statistical comparisons of probabilities.
> I don't recall the details, but the odds of duplicating a SHA1 (1 in
> 2^160) are so insanely small that it's hard to find anything in the
> physical world that compares. To duplicate random 256^256 numbers you'd
> probably have to search until the heat-death of the universe.
The birthday paradox gives you about 2^80 (about 10^24) files before a SHA1 match, which is huge enough as it is. AIUI a UUID is only 128 bits so that would make 2^64 (about 10^19) random strings before you get a duplicate. Embed the time in there and the chance becomes *really* small, because then you have to get it in the same second.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
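The birthday estimate above can be checked with a few lines of Python. This is a sketch using the standard approximation p = 1 - exp(-n^2 / (2N)) for n uniformly random values drawn from a space of size N; the function name and the 122-bit figure for a V4 UUID (128 bits minus version and variant bits) are assumptions of this example, not something stated in the thread.

```python
import math


def collision_probability(n: float, bits: int) -> float:
    """Approximate chance of at least one duplicate among n random values of `bits` bits."""
    space = 2.0 ** bits
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-(n * n) / (2.0 * space))


p_sha1 = collision_probability(2.0 ** 80, 160)    # about 10^24 files, 160-bit hash
p_uuid = collision_probability(2.0 ** 64, 128)    # about 10^19 values, 128-bit space
p_14_million = collision_probability(14e6, 122)   # the 14 million guid test, 122 random bits
```

At n = 2^(bits/2) the approximation gives roughly a 39% chance of a duplicate, so "about 2^64 values before a duplicate" is the right order of magnitude for a 128-bit space, and a 14 million row test is far too small to say anything about collisions.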
On Mon, Sep 18, 2006 at 12:23:16PM -0400, mark@mark.mielke.cc wrote:
> I have UUID generation in core in my current implementation. In the
> last year that I've been using it, I have already chosen twice to
> generate UUIDs from my calling program. I find it faster, as it avoids
> having to call out to PostgreSQL twice. Once to generate the UUID, and
> once to insert the row using it. I have no strong need for UUID
> generation to be in core, and believe there do exist strong reasons
> not to. Performance is better when not in core. Portability of
> PostgreSQL is better when not in core. Ability to control how UUID is
> defined is better when not in core.
That's kinda short-sighted. You're assuming that the only place you'll want to generate UUIDs is outside the database. What about a stored procedure that's adding data to the database? How about populating a table via a SELECT INTO? There's any number of cases where you'd want to generate a UUID inside the database.
> The only thing an in-core version provides is convenience for those
> that do not have easy access to a UUID generation library. I don't
> care for that convenience.
It's not about access to a library, it's about how do you get to that library from inside the database, which may not be very easy.
You may not care for that convenience, but I certainly would.
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Mon, 2006-09-18 at 16:00 -0500, Jim C. Nasby wrote:
> BTW, at a former company we used SHA1s to identify files that had been
> uploaded. We were wondering on the odds of 2 different files hashing to
> the same value and found some statistical comparisons of probabilities.
> I don't recall the details, but the odds of duplicating a SHA1 (1 in
> 2^160) are so insanely small that it's hard to find anything in the
> physical world that compares. To duplicate random 256^256 numbers you'd
> probably have to search until the heat-death of the universe.
That assumes you have good random data. Usually there is some kind of tradeoff between the randomness and the performance. If you read /dev/random each time, that eliminates some applications that need to generate UUIDs very quickly. If you use pseudorandom data, you are vulnerable in the case a clock is set back or the data repeats.
Regards,
Jeff Davis
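The pseudorandom failure mode can be shown in a few lines of Python. This sketch assumes a generator whose PRNG is seeded from the clock, which is an assumption made for illustration and not a description of any real implementation; the seed value and the helper name prng_uuid are made up.

```python
import random
import uuid


def prng_uuid(rng: random.Random) -> uuid.UUID:
    """Build a V4-shaped UUID from a pseudorandom generator (illustrative only)."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Hypothetical clock reading used as the seed; a clock set back to the same
# instant would reproduce the same seed.
CLOCK_SEED = 1158624000

first_run = prng_uuid(random.Random(CLOCK_SEED))
second_run = prng_uuid(random.Random(CLOCK_SEED))

# For contrast, uuid.uuid4() draws from the operating system's random source.
os_backed = (uuid.uuid4(), uuid.uuid4())
```

The two seeded runs produce the identical value, which is exactly the "data repeats" case; the OS-backed calls do not share state that a clock change could rewind.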
If you have trouble with duplicate OIDs, please use patch-0.2 for testing. I have changed the OIDs to the 5000 range. You can download it from: http://www.truesoftware.net/pgsql/uuid/patch-0.2/
On Mon, 2006-09-18 at 01:00 +0200, Gevik Babakhani wrote:
> Folks,
>
> The following patch implements the UUID datatype. I would like to send
> this beta patch to see if I am still on the right track. Please send
> your comments.
>
> Description of UUID:
>
> - The type is called uuid.
> - btree and hash indexes are supported.
> - uuid array is supported.
> - uuid text i/o is supported.
> - uuid binary i/o is supported.
> - uuid_to_text and text_to_uuid casting is supported.
> - uuid_to_varchar and varchar_to_uuid casting is supported.
> - the < <= = >= > <> operators are supported. Please note that some of
> these operators mathematically have no meaning and are only good for
> sorting.
>
> - new_guid() function is supported. This function is based on V4 random
> uuid value. It generates 16 random bytes with uuid 'variant' and
> 'version'. It is not guaranteed to produce unique values according to
> the docs but I have inserted 6 million records and it did not create any
> duplicates :)
>
> - the uuid datatype supports 3 input formats:
> 1. "00000000-0000-0000-0000-000000000000"
> 2. "00000000000000000000000000000000"
> 3. "{00000000-0000-0000-0000-000000000000}"
>
> - the uuid datatype supports the output format defined by the RFC:
> "00000000-0000-0000-0000-000000000000"
>
> Areas yet in development and testing:
>
> - uuid array indexing.
> - testing with joins (merge,hash,gin)
> - new_guid() fail proof testing
> - performance testing
> - testing with internal storage and compression.
> - regression test addition
> - proper documentation
> - overall sanity testing/checking
>
> Please note that I consider this a beta patch.
> You can download it from:
> http://www.truesoftware.net/pgsql/uuid/patch-0.1/
>
> Regards,
> Gevik.
On Mon, Sep 18, 2006 at 04:17:50PM -0500, Jim C. Nasby wrote:
> On Mon, Sep 18, 2006 at 12:23:16PM -0400, mark@mark.mielke.cc wrote:
> > I have UUID generation in core in my current implementation. In the
> > last year that I've been using it, I have already chosen twice to
> > generate UUIDs from my calling program. I find it faster, as it avoids
> > having to call out to PostgreSQL twice. Once to generate the UUID, and
> > once to insert the row using it. I have no strong need for UUID
> > generation to be in core, and believe there do exist strong reasons
> > not to. Performance is better when not in core. Portability of
> > PostgreSQL is better when not in core. Ability to control how UUID is
> > defined is better when not in core.
> That's kinda short-sighted. You're assuming that the only place you'll
> want to generate UUIDs is outside the database. What about a stored
> procedure that's adding data to the database? How about populating a
> table via a SELECT INTO? There's any number of cases where you'd want to
> generate a UUID inside the database.
contrib module.
> > The only thing an in-core version provides is convenience for those
> > that do not have easy access to a UUID generation library. I don't
> > care for that convenience.
> It's not about access to a library, it's about how do you get to that
> library from inside the database, which may not be very easy.
> You may not care for that convenience, but I certainly would.
Then load the contrib module. I do both. I'd happily reduce my contrib module to be based upon a built-in UUID type within PostgreSQL, providing the necessary UUID generation routines.
I would not use a 100% random number generator for a UUID value as was suggested. I prefer inserting the MAC address and the time, to at least allow me to control if a collision is possible. This is not easy to do using a few lines of C code. I'd rather have a UUID type in core with no generation routine, than no UUID type in core because the code is too complicated to maintain, or not portable enough.
Cheers,
mark
On Mon, Sep 18, 2006 at 07:45:07PM -0400, mark@mark.mielke.cc wrote:
> I would not use a 100% random number generator for a UUID value as was
> suggested. I prefer inserting the MAC address and the time, to at
> least allow me to control if a collision is possible. This is not easy
> to do using a few lines of C code. I'd rather have a UUID type in core
> with no generation routine, than no UUID type in core because the code
> is too complicated to maintain, or not portable enough.
As others have mentioned, using MAC address doesn't remove the possibility of a collision.
Maybe a good compromise that would allow a generator function to go into the backend would be to combine the current time with a random number. That will ensure that you won't get a dupe, so long as your clock never runs backwards.
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Tue, Sep 19, 2006 at 08:20:13AM -0500, Jim C. Nasby wrote:
> On Mon, Sep 18, 2006 at 07:45:07PM -0400, mark@mark.mielke.cc wrote:
> > I would not use a 100% random number generator for a UUID value as was
> > suggested. I prefer inserting the MAC address and the time, to at
> > least allow me to control if a collision is possible. This is not easy
> > to do using a few lines of C code. I'd rather have a UUID type in core
> > with no generation routine, than no UUID type in core because the code
> > is too complicated to maintain, or not portable enough.
> As others have mentioned, using MAC address doesn't remove the
> possibility of a collision.
It does, as I control the MAC address. I can choose not to overwrite it. I can choose to ensure that any cases where it is overwritten, it is overwritten with a unique value. Random number does not provide this level of control.
> Maybe a good compromise that would allow a generator function to go into
> the backend would be to combine the current time with a random number.
> That will ensure that you won't get a dupe, so long as your clock never
> runs backwards.
Which standard UUID generation function would you be thinking of? Inventing a new one doesn't seem sensible. I'll have to read over the versions again...
Cheers,
mark
On Tue, Sep 19, 2006 at 09:51:23AM -0400, mark@mark.mielke.cc wrote:
> On Tue, Sep 19, 2006 at 08:20:13AM -0500, Jim C. Nasby wrote:
> > On Mon, Sep 18, 2006 at 07:45:07PM -0400, mark@mark.mielke.cc wrote:
> > > I would not use a 100% random number generator for a UUID value as was
> > > suggested. I prefer inserting the MAC address and the time, to at
> > > least allow me to control if a collision is possible. This is not easy
> > > to do using a few lines of C code. I'd rather have a UUID type in core
> > > with no generation routine, than no UUID type in core because the code
> > > is too complicated to maintain, or not portable enough.
> > As others have mentioned, using MAC address doesn't remove the
> > possibility of a collision.
>
> It does, as I control the MAC address. I can choose not to overwrite it.
> I can choose to ensure that any cases where it is overwritten, it is
> overwritten with a unique value. Random number does not provide this
> level of control.
>
> > Maybe a good compromise that would allow a generator function to go into
> > the backend would be to combine the current time with a random number.
> > That will ensure that you won't get a dupe, so long as your clock never
> > runs backwards.
>
> Which standard UUID generation function would you be thinking of?
> Inventing a new one doesn't seem sensible. I'll have to read over the
> versions again...
I don't think it exists, but I don't see how that's an issue.
Let's look at an extreme case: take the amount of random entropy used for the random-only generation method. Append that to the current time in UTC, and hash it. Thanks to the time component, you've now greatly reduced the odds of a duplicate, probably by many orders of magnitude.
Ultimately, I'm OK with a generator that's only in contrib, provided that there's at least one that will work on all OSes.
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
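A literal reading of "append the entropy to the current time and hash it" might look like the Python sketch below. Everything about it is an assumption made for illustration: the function name, the choice of SHA-1, the nanosecond timestamp, and the truncation to 16 bytes are not from the thread, and the result does not set the RFC version or variant bits, so it is not one of the standard UUID versions.

```python
import hashlib
import os
import time
import uuid


def time_plus_random_id() -> uuid.UUID:
    """Hash the current UTC time together with 16 random bytes (non-standard sketch)."""
    stamp = time.time_ns().to_bytes(8, "big")  # nanoseconds since the Unix epoch
    entropy = os.urandom(16)                   # same amount of entropy as a random-only value
    digest = hashlib.sha1(stamp + entropy).digest()
    return uuid.UUID(bytes=digest[:16])        # keep the first 128 bits of the 160-bit hash


a = time_plus_random_id()
b = time_plus_random_id()
```

One caveat worth keeping in mind: the output is still a 128-bit value, so whether hashing the time in actually lowers the collision odds compared with 128 random bits is open to debate.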
mark@mark.mielke.cc wrote:
> On Tue, Sep 19, 2006 at 08:20:13AM -0500, Jim C. Nasby wrote:
> > On Mon, Sep 18, 2006 at 07:45:07PM -0400, mark@mark.mielke.cc wrote:
> > > I would not use a 100% random number generator for a UUID value as was
> > > suggested. I prefer inserting the MAC address and the time, to at
> > > least allow me to control if a collision is possible. This is not easy
> > > to do using a few lines of C code. I'd rather have a UUID type in core
> > > with no generation routine, than no UUID type in core because the code
> > > is too complicated to maintain, or not portable enough.
> > As others have mentioned, using MAC address doesn't remove the
> > possibility of a collision.
>
> It does, as I control the MAC address.
What happens if you have two postmasters running on the same machine?
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Tue, Sep 19, 2006 at 11:21:51PM -0400, Alvaro Herrera wrote:
> mark@mark.mielke.cc wrote:
> > On Tue, Sep 19, 2006 at 08:20:13AM -0500, Jim C. Nasby wrote:
> > > On Mon, Sep 18, 2006 at 07:45:07PM -0400, mark@mark.mielke.cc wrote:
> > > > I would not use a 100% random number generator for a UUID value as was
> > > > suggested. I prefer inserting the MAC address and the time, to at
> > > > least allow me to control if a collision is possible. This is not easy
> > > > to do using a few lines of C code. I'd rather have a UUID type in core
> > > > with no generation routine, than no UUID type in core because the code
> > > > is too complicated to maintain, or not portable enough.
> > > As others have mentioned, using MAC address doesn't remove the
> > > possibility of a collision.
> > It does, as I control the MAC address.
> What happens if you have two postmaster running on the same machine?
Could be bad things. :-)
For the case of two postmaster processes, I assume you mean two different databases? If you never intend to merge the data between the two databases, the problem is irrelevant. There is a much greater chance that any UUID form is more unique, or can be guaranteed to be unique, within a single application instance, than across all application instances in existence. If you do intend to merge the data, you may have a problem.
If I have two connections to PostgreSQL - would the plpgsql procedures be executed from two different processes? With an in-core generation routine, I think it is possible for it to collide unless inter-process synchronization is used (unlikely) to ensure generation of unique time/sequence combinations each time. I use this right now (mostly), but as I've mentioned, it isn't my favourite. It's convenient. I don't believe it provides the sort of guarantees that a SERIAL provides.
A model that intended to try and guarantee uniqueness would provide a UUID generation service for the entire host, that was not specific to any application, or database, possibly accessible via the loopback address. It would ensure that at any given time, either the time is new, or the sequence is new for the time. If computer time ever went backwards, it could keep the last time issued persistent, and increment from this point forward through the clock sequence values until real time catches up. An alternative would be along the lines of a /dev/uuid device, that like /dev/random, would be responsible for outputting unique uuid values for the system. Who does this? Probably nobody. I'm tempted to implement it, though, for my uses. :-)
Cheers,
mark
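The rule "either the time is new, or the sequence is new for the time" can be sketched as a small allocator in Python. This is only an illustration of the idea described above, not an existing service: the class name UuidClock, the injected clock function, and the 14-bit sequence width are assumptions, and a real host-wide service would also have to persist the last issued time and serialize access between processes.

```python
from typing import Callable, Tuple


class UuidClock:
    """Hand out (time, sequence) pairs that never repeat, even if the clock steps back."""

    def __init__(self, now: Callable[[], int], last_time: int = 0, seq_bits: int = 14):
        self._now = now
        self._last_time = last_time  # a real service would load this from persistent storage
        self._seq = 0
        self._seq_limit = 1 << seq_bits

    def next_pair(self) -> Tuple[int, int]:
        current = self._now()
        if current > self._last_time:
            # Real time has moved forward: new time, sequence restarts.
            self._last_time, self._seq = current, 0
        else:
            # Clock stalled or went backwards: keep the last issued time
            # and advance the sequence until real time catches up.
            self._seq += 1
            if self._seq >= self._seq_limit:
                # Sequence exhausted: step the issued time forward by one tick.
                self._last_time, self._seq = self._last_time + 1, 0
        return self._last_time, self._seq


# Simulated clock readings: a repeat, a step backwards, then forward progress.
readings = iter([100, 100, 90, 101])
clock = UuidClock(now=lambda: next(readings))
pairs = [clock.next_pair() for _ in range(4)]
```

With the simulated readings, the backwards step to 90 is absorbed by bumping the sequence at time 100, and the sequence resets once the clock reaches 101.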
Mark,
That is an excellent summary. There is just one wrong assumption in it:
>Probably nobody.
Within win32 there is an API call which provides you with a GUID / UUID with, to my knowledge, exactly the features you are describing. win32 is installed on some computers. So for PostgreSQL on win32 the new_guid() you describe in detail would be quite simple to implement: a call to CoCreateGuid.
The challenging part is: I use PostgreSQL in a mixed environment. And Linux, for example, does not provide CoCreateGuid. That's why I am voting to have it in PostgreSQL :)
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.