Re: uuid type for postgres - Mailing list pgsql-hackers
From | mark@mark.mielke.cc |
---|---|
Subject | Re: uuid type for postgres |
Date | |
Msg-id | 20050906203811.GA3069@mark.mielke.cc Whole thread Raw |
In response to | Re: uuid type for postgres ("Jonah H. Harris" <jonah.harris@gmail.com>) |
Responses |
Re: uuid type for postgres
Re: uuid type for postgres |
List | pgsql-hackers |
On Tue, Sep 06, 2005 at 03:57:55PM -0400, Jonah H. Harris wrote: > I agree with Josh on the UUID type, it gets abused far too often and (IMHO) > isn't widely enough used to belong in the core. There is much in PostgreSQL from my perspective that falls under the category of 'most advanced open source SQL server in the world', and not at all in the category of 'widely used'. Unless the history of PostgreSQL isn't to be accepted in terms of direction for PostgreSQL, I don't see why such a useful building block shouldn't be supported by the core. It may not be widely used, because it doesn't exist. Not yet, anyways. pguuid is broken, remember? Which isn't to say that it should be in the core, or it must be in the core. It is to say, however, that I don't buy your arguments. Abused how? How can you judge widely used for something that doesn't exist? I'm not very newbie oriented. I think I tend to assume that anybody using PostgreSQL must know what they are doing. This may be very wrong of me to assume. I assume you all know what you are doing, for example. I'm curious as to what sort of abuses you and Josh are speaking about. I'd particularly like to know if you think I am abusing it, as I don't believe that I am. I would also be interested in knowing what you felt would be an abuse with UUID, that wouldn't be an abuse with SERIAL. Should we tell newbies not to use SERIAL? Perhaps we should. Do we? > Couldn't you just fix the > problem in pguuid rather than write a whole new type? I'll submit my patches to pguuid once I've had some more production use of it. For the little bit of code that it was, there were some pretty bad bugs. It was sometimes crashing, due to what appears to be a palloc() that was one byte short (it reliably crashed on me, the line is wrong as written, and after adding + 1, it no longer crashes at all). The operator definitions weren't correct, and the indices on columns of the type were not being preferred by the query planner (a good thing too - if they were used for any operator other than '=', the errors in the operator definitions would have caused very unexpected results). So yes, pguuid can be fixed. I'm not sure that pguuid makes a good model for a PostgreSQL extension, but it could be resurrected and used. (I think the project showed no real updates since 2003?) But - it's GPL, limiting its use with regard to PostgreSQL distribution, and it requires special compilation, which as Nathan found out, doesn't work on MacOS. I think the GPL reason alone is a compelling reason to create a new extension. Using a more portable UUID base library (either written from scratch, or re-used from some other place providing a compatible license) is a compelling reason to create a new extension. As to whether it should be in core - I think that a fully functional module could be widely used, especially in larger systems that are having difficulty spreading transactions across multiple machines. This is the domain that UUID shines in. I can write a transaction at my site without being very worried that it will collide with a transaction at your site. We don't have to be connected in real time. In my choice of use, I'm using them instead of SERIAL columns, as I wish to have more freedom merging production data with test data. I wish to continually import production data into my test environment, in a single direction. UUID will prevent conflicts from occurring. SERIAL cannot (although in theory, I could set my copy of the serial value to 1 billion or something hacky - but that doesn't scale in a simple fashion to having several test environments). My other preferred use, is to expose a handle on the data to the world in select circumstances. I don't want to give them a SERIAL column, as it lets them be able to predict what else they might have access to. I'm not using it to secure the data (other routines will do this), but I am using it to hide the data. Why should people accessing my system know how many records exist in my tables? Why should they be able to predict the next value? Why should they be able to relate the data, or mine my data in the case that I allow read to all? Yes, I can avoid exposing the UUID/SERIAL in most circumstances. They can get at most of the data through a name based path. For a few of my pieces of data, though, I want to provide a secondary means of accessing the data that does not require a path. Names cause problems, especially if the names contain UNICODE characters, or if the names are very long. Giving them a reliably unique handle of fixed length is highly desirable to me in these circumstances. Anyways, I'm using a fixed up pguuid right now, and getting along fine. Nathan, under MacOS, isn't - with or without my patch. And I believe he stated he had issues with the GPL license. Josh may be correct, that in terms of a position statement, my arguments are all over the map, and ignorant of newbies. I'm not a speaker, or a writer. I'm a technical person who was frustrated with SERIAL, happy with UUID, frustrated to find pguuid broken, happy that it was fixable, frustrated with having to compile and install it separately, happy that there is interest from others (Nathan) in attempting to have this building block worked on, and perhaps eventually added to the core. Personally, I'm not sure what the big opposition to UUID is all about. Even abused, it can do a better job than SERIAL at quadruple the storage requirements (not a big deal when considering PostgreSQL's record overhead). If it'll make it more acceptable, I'll write the UUID implementation myself, and release it under whatever license you find most beneficial to PostgreSQL. :-) Cheers, mark > On 9/6/05, Josh Berkus <josh@agliodbs.com> wrote: > > > > Mark, > > > > > I suggest that UUID be recommended in place of SERIAL for certain > > > classes of applications, and that it therefore belongs in the core. > > > UUID and SERIAL can be used together (although, once you have a UUID, > > > it may not be useful to also have a SERIAL). > > > > I think that, if you want to push a refactored UUID type for PostgreSQL > > 8.2, that you'd better separate your database design arguments from your > > inclusion arguments. > > > > For example, you might get my agreement that it would be useful to have a > > UUID as a core type; you would *never* get my agreement to recommend using > > UUID to newbies. I have seen *far* too many abuses of UUIDs in really > > bad database design. People who use them should be experienced enough to > > know what they're doing. > > > > -- > > --Josh > > > > Josh Berkus > > Aglio Database Solutions > > San Francisco > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 9: In versions below 8.0, the planner will ignore your desire to > > choose an index scan if your joining column's datatypes do not > > match > > -- mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________ . . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bindthem... http://mark.mielke.cc/
pgsql-hackers by date: