Thread: enums
If people would like to play, I have created a little kit to help in creating first class enum types in a few seconds. It works something like this:

make TYPENAME=rainbow ENUMS=' "red", "orange", "yellow", "green", "blue", "indigo", "violet" '
make TYPENAME=rainbow install
psql -f /path/to/contrib/rainbow-install.sql yourdb

and you are done. Now you can do:

create table foo( r rainbow);
insert into foo values('red');
select 'red'::rainbow < 'green'::rainbow; <-- yields true
select rainbow_order('yellow'); <-- yields 2

The kit is at http://developer.postgresql.org/~adunstan/enumkit.tgz

Needs 8.0 or later, since it uses PGXS.

Maximum number of values is 32767 - but if you use that many you're insane anyway :-)

I did this as part of thinking about how we might do enums properly. As Chris KL recently noted, it is very often asked for. So this is not the end of the road, just a tiny step at the beginning.

cheers

andrew
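(For the curious, the install script's job is to wire the compiled functions into the catalogs. The following is only an illustrative sketch of the kind of objects such a script must create, not the kit's actual contents:)

```sql
-- Illustrative sketch only; function names and library path are assumptions.
CREATE FUNCTION rainbow_lt(rainbow, rainbow) RETURNS bool
    AS '$libdir/rainbow', 'rainbow_lt' LANGUAGE C IMMUTABLE STRICT;

CREATE OPERATOR < (
    LEFTARG = rainbow, RIGHTARG = rainbow,
    PROCEDURE = rainbow_lt, COMMUTATOR = >
);
-- ... and likewise for <=, =, >=, >, plus a btree operator class so the
-- type is usable in ORDER BY and indexes.
```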
This is cool; it's something people can use today if nothing else.

Long-term, is it practical to have the enums compiled in? ISTM that's
not very workable, but I'm completely guessing. The other issue is that
this version makes it very difficult to change what's in the enum (not
that that's at all easy with MySQL...)

On Thu, Oct 27, 2005 at 03:47:53PM -0400, Andrew Dunstan wrote:
> If people would like to play, I have created a little kit to help in
> creating first class enum types in a few seconds. It works something
> like this:
>
> make TYPENAME=rainbow ENUMS=' "red", "orange", "yellow", "green",
> "blue", "indigo", "violet" '
> make TYPENAME=rainbow install
> psql -f /path/to/contrib/rainbow-install.sql yourdb
>
> and you are done. Now you can do:
>
> create table foo( r rainbow);
> insert into foo values('red');
> select 'red'::rainbow < 'green'::rainbow; <-- yields true
> select rainbow_order('yellow'); <-- yields 2
>
> The kit is at http://developer.postgresql.org/~adunstan/enumkit.tgz
>
> Needs 8.0 or later, since it uses PGXS.
>
> Maximum number of values is 32767 - but if you use that many you're
> insane anyway :-)
>
> I did this as part of thinking about how we might do enums properly. As
> Chris KL recently noted - it is very often asked for. So this is not the
> end of the road, just a tiny step at the beginning.
>
> cheers
>
> andrew

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
This little snippet is great! The only problem I see is that the enums must be consistent across all modules.
What about loading a variable with a "default" value? Then it could be adjusted to 'play'.
On 10/27/05, Jim C. Nasby <jnasby@pervasive.com> wrote:
> This is cool; it's something people can use today if nothing else.
> Long-term, is it practical to have the enums compiled in? ISTM that's
> not very workable, but I'm completely guessing. The other issue is that
> this version makes it very difficult to change what's in the enum (not
> that that's at all easy with MySQL...)
On Thu, Oct 27, 2005 at 03:47:53PM -0400, Andrew Dunstan wrote:
> <snip/>
On Thu, Oct 27, 2005 at 04:54:36PM -0400, Ted Rolle wrote:
> This little snippet is great! The only problem I see is that the enums
> must be consistent across all modules.
>
> What about loading a variable with a "default" value? Then it could be
> adjusted to 'play'.

Huh? Sorry, but you completely lost me here...

On another note, I noticed that the comparison operators seem to be
comparing the underlying numeric value used to store the enum, which is
wrong IMO. Consider:

ENUM color '"red","blue","green"'
CREATE TABLE t (c color);
INSERT INTO t VALUES('blue');
INSERT INTO t VALUES('green');
INSERT INTO t VALUES('red');
SELECT c FROM t ORDER BY c;
red
blue
green

That seems counter-intuitive. It's also exposing an implementation
detail (that the enum is stored internally as a number).
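(For comparison, the lexically-ordered behaviour argued for here is already available today without any enum machinery, via a domain with a check constraint; a sketch, with `t2` used to avoid clashing with the table above:)

```sql
-- Restricts the labels but keeps plain text ordering and text storage.
CREATE DOMAIN color2 AS text
    CHECK (VALUE IN ('red', 'blue', 'green'));
CREATE TABLE t2 (c color2);
INSERT INTO t2 VALUES ('blue');
INSERT INTO t2 VALUES ('green');
INSERT INTO t2 VALUES ('red');
SELECT c FROM t2 ORDER BY c;   -- blue, green, red (lexical)
```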
Adding -hackers back to the list...

> -----Original Message-----
> From: Gregory Maxwell [mailto:gmaxwell@gmail.com]
> Sent: Thursday, October 27, 2005 5:03 PM
> To: Jim Nasby
> Subject: Re: [HACKERS] enums
>
> On 10/27/05, Jim C. Nasby <jnasby@pervasive.com> wrote:
> > On another note, I noticed that the comparison operators seem to be
> > comparing the underlying numeric value used to store the enum, which
> > is wrong IMO. Consider:
> >
> > <snip/>
> >
> > That seems counter-intuitive. It's also exposing an implementation
> > detail (that the enum is stored internally as a number).
>
> You could as equally say that it's ordering it by the order of the
> enum declaration, which seems quite reasonable to me.

I don't really see why that's considered reasonable, especially as a
default. I could maybe see an argument for having a means to see this
ordering, but IMO anything depending on that is broken. I don't think we
should be making any guarantees about how enums are stored in the
database (including ordering).

> Now, if you can multiply one enum with another or with an integer,
> that would be a bit odd. But the collation behavior seems quite sane
> to me.
> The behavior in mysql is the same:
>
> mysql> create table t (
>     -> color enum ('red','blue','green')
>     -> );
> mysql> INSERT INTO t VALUES ('blue');
> mysql> INSERT INTO t VALUES ('green');
> mysql> INSERT INTO t VALUES ('red');
> mysql> select color from t order by color;
> +-------+
> | color |
> +-------+
> | red   |
> | blue  |
> | green |
> +-------+
> 3 rows in set (0.04 sec)
>
> Hopefully the end implementation is a bit better than mysql and
> actually cares what you're inserting:
>
> mysql> select color*2 from t;
> +---------+
> | color*2 |
> +---------+
> |       4 |
> |       6 |
> |       2 |
> +---------+
> 3 rows in set (0.00 sec)
> mysql> INSERT INTO t VALUES (1);
> Query OK, 1 row affected (0.00 sec)
> mysql> INSERT INTO t VALUES ('monkey');
> Query OK, 1 row affected, 1 warning (0.00 sec)
> mysql> select color from t;
> +-------+
> | color |
> +-------+
> | blue  |
> | green |
> | red   |
> | red   |
> |       |
> +-------+
> 5 rows in set (0.00 sec)

Your examples show why I don't think it's a good idea to use MySQL as a
guide for how to do enums. If we do decide to include the concept of
ordering in enums, then it should be fully supported and not just an
artifact of our storage mechanism. This means supporting things like
being able to re-order the accepted values in an enum. But like I said,
I just don't see the use case for doing that.
On 10/27/05, Jim Nasby <jnasby@pervasive.com> wrote:
> Adding -hackers back to the list...
>
> > You could as equally say that it's ordering it by the order of the
> > enum declaration, which seems quite reasonable to me.
>
> I don't really see why that's considered reasonable, especially as a
> default. I could maybe see an argument for having a means to see this
> ordering, but IMO anything depending on that is broken. I don't think
> we should be making any guarantees about how enums are stored in the
> database (including ordering).

Yes, MySQL is broken in some regards, as usual. However, the API isn't
bad (except for the fact that it doesn't care what invalid crap you
throw at it), and more importantly there are thousands of apps and
developers who think around that interface. We should copy it without
the brokenness as much as possible unless we have good cause otherwise.

> If we do decide to include the concept of ordering in enums, then it
> should be fully supported and not just an artifact of our storage
> mechanism. This means supporting things like being able to re-order
> the accepted values in an enum. But like I said, I just don't see the
> use case for doing that.

So what do you propose we do for a default ordering? I hope you don't
think we should force a sort as though the enum labels were text...
That's almost certainly incorrect for most applications of enums, which
are used to make opaque labels more human compatible.

MySQL's behavior of allowing the user to specify the collation in the
typedef makes a lot of sense to me; it doesn't matter that it actually
works as an artifact of the storage backend. I'd argue that it would
make sense to sort by the specification order even if we changed the
backend to use varchars rather than numbers.
Jim C. Nasby wrote:
>
> On another note, I noticed that the comparison operators seem to be
> comparing the underlying numeric value used to store the enum, which is
> wrong IMO. Consider:
>
> ENUM color '"red","blue","green"'
> CREATE TABLE t (c color);
> INSERT INTO t VALUES('blue');
> INSERT INTO t VALUES('green');
> INSERT INTO t VALUES('red');
> SELECT c FROM t ORDER BY c;
> red
> blue
> green
>
> That seems counter-intuitive. It's also exposing an implementation
> detail (that the enum is stored internally as a number).

No it is not. Not in the slightest. It is honoring the enumeration order
defined for the type. That is the ONLY correct behaviour, IMNSHO.
Otherwise, you could just as easily use a domain with a check constraint.

In fact, mysql's behaviour is laughably, even ludicrously, inconsistent:

mysql> select color from t order by color;
+-------+
| color |
+-------+
| red   |
| blue  |
| green |
+-------+
3 rows in set (0.06 sec)

mysql> select * from t where color < 'green';
+-------+
| color |
+-------+
| blue  |
+-------+

So for "order by" it honors the enumeration order, but for < it uses the
lexical ordering. Lovely, eh?

cheers

andrew
On Thu, Oct 27, 2005 at 06:46:24PM -0400, Gregory Maxwell wrote:
> So what do you propose we do for a default ordering? I hope you don't
> think we should force a sort as though the enum labels were text...

I do think that. Or default ordering on whatever type the enum is (I can
see enums that are something other than text as useful, though that's a
secondary goal).

> That almost certainly incorrect for most applications of enums, which
> are used to make opaque labels more human compatible.

Sorting red before blue doesn't sound very opaque to me...

> MySQL's behavior of allowing the user to specify the collation in the
> typedef makes a lot of sense to me, it doesn't matter that it actually
> works as an artifact of the storage backend. I'd argue that it would
> make sense to sort by the specification order even if we changed the
> backend to use varchars rather than numbers.

Like I said, if we're going to support a concept of ordering of items in
an enum then we need to support it fully. For starters that means having
the ability to re-order things in an enum seamlessly.

If our primary concern is MySQL compatibility then we should look at
offering two types of enums; one that mirrors their broken stuff and one
that works the way you'd actually want it to.
Gregory Maxwell wrote:

> Yes, MySQL is broken in some regards, as usual. However, the API isn't
> bad (except for the fact that it doesn't care what invalid crap you
> throw at it), and more importantly there are thousands of apps and
> developers who think around that interface. We should copy it without
> the brokenness as much as possible unless we have good cause
> otherwise.

mmm ... no. It is too broken. We should do enums orthogonally with other
type definitions in PostgreSQL. Where I would like to get to is that we
have a flavor of CREATE TYPE that will create the enum type for us,
including all the support that I build into my little kit. And if you
want to change the enumeration set on a column, you would use

  ALTER TABLE foo ALTER COLUMN bar TYPE newtype USING ...

Inline declaration of enums does not strike me as good.

> > If we do decide to include the concept of ordering in enums, then it
> > should be fully supported and not just an artifact of our storage
> > mechanism. This means supporting things like being able to re-order
> > the accepted values in an enum. But like I said, I just don't see
> > the use case for doing that.
>
> So what do you propose we do for a default ordering? I hope you don't
> think we should force a sort as though the enum labels were text...
> That almost certainly incorrect for most applications of enums, which
> are used to make opaque labels more human compatible.

Yeah, lexical ordering is surely wrong. I believe that the enumeration
order is the only correct ordering. If you wanted lexical ordering, you
could always do "order by myenum::text".

> MySQL's behavior of allowing the user to specify the collation in the
> typedef makes a lot of sense to me, it doesn't matter that it actually
> works as an artifact of the storage backend. I'd argue that it would
> make sense to sort by the specification order even if we changed the
> backend to use varchars rather than numbers.

Can't see the point in it, I must confess.

cheers

andrew
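(One hypothetical spelling of such a CREATE TYPE flavor, purely illustrative — none of this syntax exists yet, and the conversion function in the ALTER is an assumed user-written one:)

```sql
CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green',
                             'blue', 'indigo', 'violet');

-- Changing the value set would then go through the usual retyping path:
ALTER TABLE foo ALTER COLUMN bar TYPE rainbow2
    USING old_rainbow_to_new(bar);
```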
Jim C. Nasby wrote:

> Like I said, if we're going to support a concept of ordering of items
> in an enum then we need to support it fully. For starters that means
> having the ability to re-order things in an enum seamlessly.

I do not see this at all. An enumeration defines an ordering and a set
of labels. Why should you be able to change it? If you want a different
ordering, create a new enumeration. Let's do this right because it's a
feature worth having, not just mimic the competition's idiocy.

cheers

andrew
On Thu, Oct 27, 2005 at 07:02:45PM -0400, Andrew Dunstan wrote:
> Jim C. Nasby wrote:
> > On another note, I noticed that the comparison operators seem to be
> > comparing the underlying numeric value used to store the enum, which
> > is wrong IMO. Consider:
> >
> > <snip/>
> >
> > That seems counter-intuitive. It's also exposing an implementation
> > detail (that the enum is stored internally as a number).
>
> No it is not. Not in the slightest. It is honoring the enumeration
> order defined for the type. That is the ONLY correct behaviour, IMNSHO.

I agree. Honoring the enumeration order makes sense if you consider the
values as things that should be ordered based on some property of their
thingness instead of based on what their labels happen to be in a
particular language. If I have an enumeration of colors I might want
values sorted by their position in the spectrum, so whether the labels
are (red, green, blue) or (gorri, berde, urdin) I might want to maintain
that particular order.

If you want values ordered lexically then you can enumerate them that
way. Why force that behavior on people who want to order based on some
other criteria?

-- 
Michael Fuhr
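(The spectrum point, sketched with an enumkit-style type; the type, table, and the availability of a cast to text — which the "order by myenum::text" trick elsewhere in the thread assumes — are all illustrative:)

```sql
-- make TYPENAME=espectro ENUMS=' "gorri", "berde", "urdin" '
CREATE TABLE luz (c espectro);
INSERT INTO luz VALUES ('urdin');
INSERT INTO luz VALUES ('berde');
INSERT INTO luz VALUES ('gorri');
SELECT c FROM luz ORDER BY c;        -- gorri, berde, urdin (spectrum order)
SELECT c FROM luz ORDER BY c::text;  -- berde, gorri, urdin (lexical)
```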
On Thursday 2005-10-27 16:22, Andrew Dunstan wrote:
> Jim C. Nasby wrote:
> > Like I said, if we're going to support a concept of ordering of items
> > in an enum then we need to support it fully. For starters that means
> > having the ability to re-order things in an enum seamlessly.
>
> I do not see this at all. An enumeration defines an ordering and a set
> of labels. Why should you be able to change it? If you want a different
> ordering, create a new enumeration. Let's do this right because it's a
> feature worth having, not just mimic the competition's idiocy.

The symbols in the set have no _per se_ order. A collation rule is
necessary to sort the symbols consistently. ASCII is an enumeration;
Unicode is a large enumeration with a simple naive collation and a
complex default collation.

Defining a set results in an unordered specification of symbols.
Defining a collation produces an ordering for the set. There can be many
collations for a set.

An enumeration is just a computer science short-hand way to define a set
and a "native" collation for the set. An enumeration's native collation
need not be the only, or even the most common, collation for the
enumerated set of symbols.
On Oct 28, 2005, at 9:23, Trent Shipley wrote:

> On Thursday 2005-10-27 16:22, Andrew Dunstan wrote:
> > <snip/>
>
> The symbols in the set have no _per se_ order.
> A collation rule is necessary to sort the symbols consistently.
> ASCII is an enumeration
> Unicode is a large enumeration with a simple naive collation and a
> complex default collation.
>
> Defining a set results in an unordered specification of symbols.
> Defining a collation produces an ordering for the set.
> There can be many collations for a set.
>
> An enumeration is just a computer science short-hand way to define a
> set and a "native" collation for the set.
> An enumeration's native collation need not be the only, or even the
> most common, collation for the enumerated set of symbols.

Relational databases already have a type for unordered sets: tables.
IMO, if there's going to be a separate enumerated type, it should be
more than just an alternative way of defining a set of key-value pairs.

Michael Glaesemann
grzm myrealbox com
Trent Shipley wrote:

> An enumeration is just a computer science short-hand way to define a
> set and a "native" collation for the set.
> An enumeration's native collation need not be the only, or even the
> most common, collation for the enumerated set of symbols.

No it's not. Many languages define enumerated types as having a fixed
ordering. It is not just a set. That is the sense in which I am using
the term.

And, after all, you can always write a function that gives you an
alternative ordering. All we are building in is an ordering based on the
enumeration order, which you are at perfect liberty not to use. Postgres
lets you define an alternative operator class for any type, so we are
very flexible.

People are getting way too hung up over this.

cheers

andrew
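(To make the operator-class point concrete — a sketch, not working code: the `<#`-family operators and the comparison function are hypothetical and would have to be created first:)

```sql
-- An alternative btree operator class that orders rainbow values
-- lexically rather than in enumeration order.
CREATE OPERATOR CLASS rainbow_lexical_ops
    FOR TYPE rainbow USING btree AS
        OPERATOR 1 <# ,
        OPERATOR 2 <=# ,
        OPERATOR 3 =# ,
        OPERATOR 4 >=# ,
        OPERATOR 5 ># ,
        FUNCTION 1 rainbow_lexical_cmp(rainbow, rainbow);

-- The ordering operator can then be picked per query:
-- SELECT r FROM foo ORDER BY r USING <# ;
```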
On Thu, Oct 27, 2005 at 05:41:01PM -0600, Michael Fuhr wrote:
> If you want values ordered lexically then you can enumerate them
> that way. Why force that behavior on people who want to order based
> on some other criteria?

Well, I was arguing about the default behavior. I'd bet that we're going
to have a constant set of people wondering why ORDER BY is doing the
'wrong thing' when ordering an ENUM, which is why I argued that the
default behavior should be ordering based on the external type, not how
we're storing it or some other order. But I'm clearly in the minority in
this view, so I'm dropping it. :)

On to other issues...

Andrew, you mentioned that if you want to change the ordering you should
just create a new type. What about if you need to change the values that
are in the enum? MySQL does (or at least did, it's been some time since
I've messed with this) a horrible job at that. There's no way to rename
anything; you have to add the new names you want, then do a bulk update,
then delete the (now old) names. IMO this is broken.

Also, if we are going to maintain ordering and mapping (presumably via
the internal number that we're storing), then I think we should expose
that, at least optionally. So for example, you should be able to define
what a specific enum value means. Not everyone will want a linear
numbering starting at 0 after all.
What about using the declaration order as the enum order? For example,
if I declare something like

  CREATE ENUM hola ('item1', 'item3', 'item2');

-this is just assuming a hypothetical approach to use enum types in this
way- then the logical order of the items could be 'item1', 'item3',
'item2' just because the user decided to create them in this way.

Well, that's just an idea...

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Andrew Dunstan
Sent: Thursday, October 27, 2005 07:06 p.m.
To: tshipley@deru.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] enums

<snip/>
Jim C. Nasby wrote:

> Andrew, you mentioned that if you want to change the ordering you
> should just create a new type. What about if you need to change the
> values that are in the enum? MySQL does (or at least did, it's been
> some time since I've messed with this) a horrible job at that. There's
> no way to rename anything; you have to add the new names you want,
> then do a bulk update, then delete the (now old) names. IMO this is
> broken.

It would just be a standard "ALTER TABLE foo ALTER COLUMN bar TYPE
newtype USING expression" operation. You would write a function that
took a value of the old type and returned a value of the new type and
use a call to that function in the expression. Since these would be
named types, unlike the case in mysql where they are anonymously defined
inline, this would present no difficulties at all.

> Also, if we are going to maintain ordering and mapping (presumably via
> the internal number that we're storing), then I think we should expose
> that, at least optionally. So for example, you should be able to
> define what a specific enum value means. Not everyone will want a
> linear numbering starting at 0 after all.

What on earth for? Users should not care in the slightest what the
internal representation is. Users who want a map where the values are
exposed should create a lookup table.

You keep saying that we are using the internal representation as the
ordering. This is simply the wrong way to look at it. The internal
representation REFLECTS the ordering; it doesn't impose it. The user has
imposed the ordering when defining the type.

In my enumkit I did provide a function that gave back the internal
representation, but I am not by any means certain that that's a good
idea.

cheers

andrew
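(A sketch of the migration described above; the new type `newrainbow`, the function name, and the label mapping are all hypothetical:)

```sql
CREATE FUNCTION rainbow_to_newrainbow(rainbow) RETURNS newrainbow AS $$
    SELECT CASE $1::text
        WHEN 'indigo' THEN 'blue'   -- e.g. collapse two old labels
        ELSE $1::text
    END::newrainbow
$$ LANGUAGE sql IMMUTABLE;

ALTER TABLE foo ALTER COLUMN bar TYPE newrainbow
    USING rainbow_to_newrainbow(bar);
```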
Cristian Prieto wrote:

> What about using the declaration order as the enum order? For example,
> if I declare something like "CREATE ENUM hola ('item1', 'item3',
> 'item2');" the logical order of the items could be 'item1', 'item3',
> 'item2' just because the user decided to create them in this way.

Using the declaration order is exactly what I have done.

cheers

andrew
On Thursday 2005-10-27 17:39, Michael Glaesemann wrote:
> On Oct 28, 2005, at 9:23, Trent Shipley wrote:
> > On Thursday 2005-10-27 16:22, Andrew Dunstan wrote:
> > > Jim C. Nasby wrote:
<snip/>
> Relational databases already have a type for unordered sets: tables.
> IMO, if there's going to be a separate enumerated type, it should be
> more than just an alternative way of defining a set of key-value pairs.
>
> Michael Glaesemann
> grzm myrealbox com

Of course, what is an enumeration except an *ordered* list of key-value
pairs; that is, a set with a built-in collation. Glaesemann is right.

To me that implies that no database NEEDS an enumeration type. We just
started discussing it because it would greatly enhance MySQL migration.
Even more important, enumerations would be tremendously convenient (even
if they are not strictly necessary). Enumerations would be good
marketing and good engineering.

The debate is about implementation:

1) Pure list (seems to be what MySQL does). Each enumeration is a list.
If you want to create a variant collation for the list, create a new,
parallel list. This is very straightforward but violates the "store
once, read many times" principle of database design.

2) Hybrid list + secondary collations. You create and store an
enumeration (call it enum_a). If you want to reorder the underlying set,
just declare a new collation for the enumeration (call the result
enum_b). enum_b is effectively a virtual enumeration. The relationship
of enum_b to enum_a is like that between a view and its table. On the
downside, this approach is a theoretical stew. It should be relatively
easy to implement.

3) Set + collation functions. You define a set. You define a collation
for the set. Having declared set_a and collation_a you can then declare
enumeration_a. (Note that the result defined by the developer's
collation function may not necessarily result in a _per se_
enumeration.) This has the appeal of separating the symbol declaration
from its ordering.
Furthermore, of all the options it is the most powerful. Unfortunately,
it may be verbose, unintuitive, and the most difficult to implement.

There is probably no reason approach #1 or #2 could not be implemented
using the machinery for approach #3 under the bonnet. That way we could
have something like:

CREATE SYMBOL SET {possibly a disguised create table, but probably not,
for performance reasons}
CREATE COLLATION USING function_name
CREATE ENUMERATION
and CREATE MYSQL_ENUMERATION {probably just overload CREATE ENUMERATION}

===============================
http://dev.mysql.com/doc/refman/5.1/en/string-type-overview.html
http://dev.mysql.com/doc/refman/5.1/en/enum.html

(Note that unlike C enumerations MySQL enumerations are two-way and do
some context dependent magic.)
On Thu, Oct 27, 2005 at 09:45:05PM -0400, Andrew Dunstan wrote:
> Jim C. Nasby wrote:
> > Andrew, you mentioned that if you want to change the ordering you
> > should just create a new type. What about if you need to change the
> > values that are in the enum? <snip/>
>
> It would just be a standard "ALTER TABLE foo ALTER COLUMN bar TYPE
> newtype USING expression" operation. You would write a function that
> took a value of the old type and returned a value of the new type and
> use a call to that function in the expression. Since these would be
> named types, unlike the case in mysql where they are anonymously
> defined inline, this would present no difficulties at all.

But why force a re-write of the entire table just to change the name of
something?

Or is ALTER COLUMN TYPE smart enough to not touch anything if the
mapping function is equality?

> > Also, if we are going to maintain ordering and mapping (presumably
> > via the internal number that we're storing), then I think we should
> > expose that, at least optionally. So for example, you should be able
> > to define what a specific enum value means. Not everyone will want a
> > linear numbering starting at 0 after all.
>
> What on earth for? Users should not care in the slightest what the
> internal representation is. Users who want a map where the values are
> exposed should create a lookup table.
>
> You keep saying that we are using the internal representation as the
> ordering. This is simply the wrong way to look at it. The internal
> representation REFLECTS the ordering; it doesn't impose it. The user
> has imposed the ordering when defining the type.
> In my enumkit I did provide a function that gave back the internal
> representation, but I am not by any means certain that that's a good
> idea.

Well, someone was arguing that enum should be used as a convenient way
to map human labels on a set of values. To me, that means you should be
able to define exactly what that set of values is.

Personally, I don't see why enum can't just be syntactic sugar on top of
a side-table of values and a foreign key. And I guess a view to hide the
internals from normal viewing. That would certainly allow the most
flexibility, although it probably wouldn't perform as well as what you
wrote.
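(The side-table-plus-view emulation, written out by hand; all names are illustrative:)

```sql
CREATE TABLE color_labels (
    sort_order int2 PRIMARY KEY,   -- carries the enumeration order
    label      text NOT NULL UNIQUE
);
INSERT INTO color_labels VALUES (1, 'red');
INSERT INTO color_labels VALUES (2, 'blue');
INSERT INTO color_labels VALUES (3, 'green');

CREATE TABLE things (c int2 NOT NULL REFERENCES color_labels (sort_order));

-- A view hides the internal number and restores the label:
CREATE VIEW things_v AS
    SELECT l.label AS c
    FROM things t JOIN color_labels l ON l.sort_order = t.c;
```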
Ted Rolle wrote:

> This little snippet is great! The only problem I see is that the
> enums must be consistent across all modules.
>
> What about loading a variable with a "default" value? Then it could
> be adjusted to 'play'.

You can set a default for a column using one of these types, as you can
for any other postgres type:

create table bar (color rainbow default 'blue');
insert into bar values(default);

As for any postgres type, the default must be a valid value for the type.

cheers

andrew
> > It would just be a standard "ALTER TABLE foo ALTER COLUMN bar TYPE
> > newtype USING expression" operation. You would write a function that
> > took a value of the old type and returned a value of the new type
> > and use a call to that function in the expression. Since these would
> > be named types, unlike the case in mysql where they are anonymously
> > defined inline, this would present no difficulties at all.
>
> But why force a re-write of the entire table just to change the name
> of something?
>
> Or is ALTER COLUMN TYPE smart enough to not touch anything if the
> mapping function is equality?

Nearly all ALTER TABLE commands are processed in the same way. In fact,
in some cases they are combined (ADD column is broken down then
re-combined later for a single rewrite). You could pretty easily add
this type of logic to skip the rewrite stage if not needed.

--
Jim C. Nasby wrote:
>On Thu, Oct 27, 2005 at 09:45:05PM -0400, Andrew Dunstan wrote:
>>Jim C. Nasby wrote:
>>>Andrew, you mentioned that if you want to change the ordering you should
>>>just create a new type. What about if you need to change the values that
>>>are in the enum? MySQL does (or at least did, it's been some time since
>>>I've messed with this) a horrible job at that. There's no way to rename
>>>anything; you have to add the new names you want, then do a bulk update,
>>>then delete the (now old) names. IMO this is broken.
>>
>>It would just be a standard "ALTER TABLE foo ALTER COLUMN bar TYPE
>>newtype USING expression" operation. You would write a function that
>>took a value of the old type and returned a value of the new type and
>>use a call to that function in the expression. Since these would be named
>>types, unlike the case in mysql where they are anonymously defined
>>inline, this would present no difficulties at all.
>
>But why force a re-write of the entire table just to change the name of
>something?

Because you are not just changing the name of something. You can't decide you want to insert a new digit between 3 and 4 for an integer column and just affect a few rows rather than the whole set of rows for that field, nor can you change the order of the integers. An enumerated type is just like that, except that the values are labels instead of digit sequences.

>>You keep saying that we are using the internal representation as the
>>ordering. This is simply the wrong way to look at it. The internal
>>representation REFLECTS the ordering; it doesn't impose it. The user has
>>imposed the ordering when defining the type. In my enumkit I did provide
>>a function that gave back the internal representation, but I am not by
>>any means certain that that's a good idea.
>
>Well, someone was arguing that enum should be used as a convenient way
>to map human labels on a set of values.
>To me, that means you should be
>able to define exactly what that set of values is.

Well, that's not my conception at all. If you can redefine the set of values, then it is not an enumeration in my view.

>Personally, I don't see why enum can't just be syntactic sugar on top of
>a side-table of values and a foreign key. And I guess a view to hide the
>internals from normal viewing. That would certainly allow the most
>flexibility, although it probably wouldn't perform as well as what you
>wrote.

The other issue is ease of use. We used lookup tables in bugzilla when it was converted to work with Postgres, but many users will find having to do that annoying, to say the least. I think there's a very good case for providing true enums. There is a technical part of the puzzle I can't quite see yet, though :-)

cheers

andrew
> The other issue is ease of use.
>
> We used lookup tables in bugzilla when it was converted to work with
> Postgres. But many users will find having to do that annoying, to say
> the least. I think there's a very good case for providing true enums.

Then why did you use lookup tables instead of a varchar and a constraint? Probably performance.

A much more general-purpose but just as good solution would be the ability to create a hidden surrogate key for a structure:

  CREATE TABLE status (code varchar(20) PRIMARY KEY) WITH SURROGATE;
  CREATE TABLE account (name varchar(60), status varchar(20) references status);

Behind the scenes (transparently to the user) this gets converted to:

  CREATE TABLE status (id SERIAL UNIQUE, code varchar(20) PRIMARY KEY) WITH SURROGATE;
  CREATE TABLE account (name varchar(60), status integer references status(id));

And SELECT * FROM account; would be rewritten as:

  SELECT * FROM (SELECT name, code FROM account JOIN status USING (id)) AS account;

Enum might be good for a short list of items, but something like the above should be good for any common value for which we manually create surrogate keys today, without the clutter and without the application needing to know.

If PostgreSQL had an updatable view implementation it would be pretty simple to implement.

--
On 10/27/05, Andrew Dunstan <andrew@dunslane.net> wrote:
> >That seems counter-intuitive. It's also exposing an implementation
> >detail (that the enum is stored internally as a number).
>
> No it is not. Not in the slightest. It is honoring the enumeration order
> defined for the type. That is the ONLY correct behaviour, IMNSHO.
> Otherwise, you could just as easily use a domain with a check constraint.
>
> In fact, mysql's behaviour is laughably, even ludicrously, inconsistent:
[snip]
> So for "order by" it honors the enumeration order, but for < it uses the
> lexical ordering. Lovely, eh?

Oh wow. That is broken. I didn't try that case because I figured it would do it right (i.e. use the enum order).
On 10/27/05, Andrew Dunstan <andrew@dunslane.net> wrote:
> >Yes, MySQL is broken in some regards, as usual. However, the API isn't
> >bad (except for the fact that it doesn't care what invalid crap you
> >throw at it), and more importantly there are thousands of apps and
> >developers who think around that interface. We should copy it without
> >the brokenness as much as possible unless we have good cause
> >otherwise.
>
> mmm ... no. It is too broken. We should do enums orthogonally with other
> type definitions in PostgreSQL. Where I would like to get to is that we
> have a flavor of CREATE TYPE that will create the enum type for us,
> including all the support that I build into my little kit. And if you
> want to change the enumeration set on a column, you would use ALTER
> TABLE foo ALTER COLUMN bar TYPE newtype USING ...

Eh, well, the fact that we have a reasonable user-extensible type system is reason enough. What I was mostly objecting to was the use of lexical collation; the "don't mess with what people already expect" argument was just the handiest strawman available. :)

And in doing so you could insert an enum value in the middle of the existing list without breaking the values already in the table? If so that would be very useful.

> Inline declarations of enums do not strike me as good.

You're right, it's a property of a type.
Gregory Maxwell wrote:
>And in doing so you could insert an enum value in the middle of the
>existing list without breaking the values already in the table? If so
>that would be very useful.

You do it by altering the column type, not by altering the type itself. MySQL's way of doing this is made necessary by its horrid non-orthogonal way of doing enums. Here's how it works in PostgreSQL. (To make this example work I had to add a text conversion - an inadvertent omission from the original. This is in a revised version of the enumkit, available at the same location.)

andrew=# create table foo (i serial, c rgb);
NOTICE:  CREATE TABLE will create implicit sequence "foo_i_seq" for serial column "foo.i"
CREATE TABLE
andrew=# insert into foo (c) values ('blue');
INSERT 8711471 1
andrew=# insert into foo (c) values ('green');
INSERT 8711472 1
andrew=# insert into foo (c) values ('red');
INSERT 8711473 1
andrew=# select * from foo order by c;
 i |   c
---+-------
 3 | red
 2 | green
 1 | blue
(3 rows)

andrew=# insert into foo (c) values ('yellow');
ERROR:  invalid input value for enum: "yellow"
andrew=# alter table foo alter column c type rainbow using c::text;
ALTER TABLE
andrew=# select * from foo order by c;
 i |   c
---+-------
 3 | red
 2 | green
 1 | blue
(3 rows)

andrew=# insert into foo (c) values ('yellow');
INSERT 8711477 1
andrew=# select * from foo order by c;
 i |   c
---+--------
 3 | red
 4 | yellow
 2 | green
 1 | blue
(4 rows)

cheers

andrew
Andrew wrote:
> Jim C. Nasby wrote:
> >Personally, I don't see why enum can't just be syntactic sugar on top of
> >a side-table of values and a foreign key. And I guess a view to hide the
> >internals from normal viewing. That would certainly allow the most
> >flexibility, although it probably wouldn't perform as well as what you
> >wrote.
>
> The other issue is ease of use.
>
> We used lookup tables in bugzilla when it was converted to work with
> Postgres. But many users will find having to do that annoying, to say
> the least. I think there's a very good case for providing true enums.
> There is a technical part of the puzzle I can't quite see yet, though :-)

Hm, I agree with Jim here. IMO, enum = FK syntax sugar: an enum should be a lookup table with two fields, one being the enum value, which is the primary key, and the other being the sequencing value. I think many people are opposed to this approach because they assume the relationship is via an id-id link (IIRC this is what mysql does under the hood). In fact, the enum table's only purpose is constraint checking, not looking up the value (there is no 'id').

I like the way sequences work. They are first-class SQL objects, although they are normally accessed via helper functions. Enums could be the same. Dependency could be preserved to the creating table or not (I prefer not).

Merlin
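To make that concrete, here is a sketch of the layout described above (table and column names are purely illustrative): the label itself is the primary key, a second column carries the enumeration order, and data tables reference the label directly, so no id ever appears anywhere.

```sql
-- Illustrative only: enum = lookup table + foreign key, with no surrogate id.
CREATE TABLE rainbow_enum (
    value text PRIMARY KEY,         -- the enum label itself
    seq   smallint NOT NULL UNIQUE  -- position in the declared enumeration
);
INSERT INTO rainbow_enum VALUES ('red', 1);
INSERT INTO rainbow_enum VALUES ('orange', 2);
INSERT INTO rainbow_enum VALUES ('yellow', 3);

-- Data tables store the label; the FK does the constraint checking.
CREATE TABLE foo (
    c text NOT NULL REFERENCES rainbow_enum (value)
);

-- Sorting in enumeration order requires a join on the label.
SELECT foo.c FROM foo JOIN rainbow_enum r ON foo.c = r.value ORDER BY r.seq;
```

The trade-off is visible in the last statement: constraint checking is free, but honoring the enumeration order in ORDER BY or < always costs a join, which is exactly what a first-class enum type would avoid.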
Rod Taylor wrote:
> >The other issue is ease of use.
> >
> >We used lookup tables in bugzilla when it was converted to work with
> >Postgres. But many users will find having to do that annoying, to say
> >the least. I think there's a very good case for providing true enums.
>
> Then why did you use lookup tables instead of a varchar and a
> constraint? Probably performance.

To be honest, I forget why. Possibly because we also needed to be able to get a list of allowed values, although I don't know how one does that in mysql. Maybe because it just seemed like a good idea at the time and nobody spoke up against it.

> A much more general purpose but just as good solution would be the
> ability to create a hidden surrogate key for a structure.
>
> CREATE TABLE status (code varchar(20) PRIMARY KEY) WITH SURROGATE;
> CREATE TABLE account (name varchar(60), status varchar(20) references
> status);
>
> Behind the scenes (transparent to the user) this gets converted to:
>
> CREATE TABLE status (id SERIAL UNIQUE, code varchar(20) PRIMARY KEY)
> WITH SURROGATE;
> CREATE TABLE account (name varchar(60), status integer references
> status(id));
>
> SELECT * FROM account; would be rewritten as
> SELECT * FROM (SELECT name, code FROM account JOIN status USING (id)) AS
> account;
>
> Enum might be good for a short list of items but something like the
> above should be good for any common value that we manually create
> surrogate keys for today but without the clutter or the application
> needing to know.
>
> If PostgreSQL had an updatable view implementation it would be pretty
> simple to implement.

That won't make it easier to change the ordering or the value set, which some people seem concerned about. But it too might be a nice feature. I suspect it would be a lot more work than simple enums, for which there is significant demand.

cheers

andrew
On Thu, Oct 27, 2005 at 11:07:19PM -0400, Rod Taylor wrote:
> > The other issue is ease of use.
> >
> > We used lookup tables in bugzilla when it was converted to work with
> > Postgres. But many users will find having to do that annoying, to say
> > the least. I think there's a very good case for providing true enums.
>
> Then why did you use lookup tables instead of a varchar and a
> constraint? Probably performance.
>
> A much more general purpose but just as good solution would be the
> ability to create a hidden surrogate key for a structure.
>
> CREATE TABLE status (code varchar(20) PRIMARY KEY) WITH SURROGATE;
> CREATE TABLE account (name varchar(60), status varchar(20) references
> status);
>
> Behind the scenes (transparent to the user) this gets converted to:
>
> CREATE TABLE status (id SERIAL UNIQUE, code varchar(20) PRIMARY KEY)
> WITH SURROGATE;
> CREATE TABLE account (name varchar(60), status integer references
> status(id));
>
> SELECT * FROM account; would be rewritten as
> SELECT * FROM (SELECT name, code FROM account JOIN status USING (id)) AS
> account;
>
> Enum might be good for a short list of items but something like the
> above should be good for any common value that we manually create
> surrogate keys for today but without the clutter or the application
> needing to know.
>
> If PostgreSQL had an updatable view implementation it would be pretty
> simple to implement.

I'm not quite following the WITH SURROGATE bit, but what you've described certainly looks valuable. Note that I would still want to be able to get at the raw numeric values in some fashion.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Thu, Oct 27, 2005 at 10:34:57PM -0400, Andrew Dunstan wrote:
> Jim C. Nasby wrote:
> >On Thu, Oct 27, 2005 at 09:45:05PM -0400, Andrew Dunstan wrote:
> >>Jim C. Nasby wrote:
> >>>Andrew, you mentioned that if you want to change the ordering you should
> >>>just create a new type. What about if you need to change the values that
> >>>are in the enum? MySQL does (or at least did, it's been some time since
> >>>I've messed with this) a horrible job at that. There's no way to rename
> >>>anything; you have to add the new names you want, then do a bulk update,
> >>>then delete the (now old) names. IMO this is broken.
> >>
> >>It would just be a standard "ALTER TABLE foo ALTER COLUMN bar TYPE
> >>newtype USING expression" operation. You would write a function that
> >>took a value of the old type and returned a value of the new type and
> >>use a call to that function in the expression. Since these would be named
> >>types, unlike the case in mysql where they are anonymously defined
> >>inline, this would present no difficulties at all.
> >
> >But why force a re-write of the entire table just to change the name of
> >something?
>
> Because you are not just changing the name of something.

No, I was referring specifically to the case of wanting to rename something. IE: you set up an enum for sky colors (blue, black), and then the PHB issues an edict that the daytime sky is now green. In this case you (or at least I) don't want to define a new enum, I just want to change 'blue' to 'green' in that enum. There's no reason it needs to hit the table at all.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Fri, 2005-10-28 at 13:20 -0500, Jim C. Nasby wrote:
> On Thu, Oct 27, 2005 at 11:07:19PM -0400, Rod Taylor wrote:
> > > The other issue is ease of use.
> > >
> > > We used lookup tables in bugzilla when it was converted to work with
> > > Postgres. But many users will find having to do that annoying, to say
> > > the least. I think there's a very good case for providing true enums.
> >
> > Then why did you use lookup tables instead of a varchar and a
> > constraint? Probably performance.
> >
> > A much more general purpose but just as good solution would be the
> > ability to create a hidden surrogate key for a structure.
> >
> > CREATE TABLE status (code varchar(20) PRIMARY KEY) WITH SURROGATE;
> > CREATE TABLE account (name varchar(60), status varchar(20) references
> > status);
> >
> > Behind the scenes (transparent to the user) this gets converted to:
> >
> > CREATE TABLE status (id SERIAL UNIQUE, code varchar(20) PRIMARY KEY)
> > WITH SURROGATE;
> > CREATE TABLE account (name varchar(60), status integer references
> > status(id));
> >
> > SELECT * FROM account; would be rewritten as
> > SELECT * FROM (SELECT name, code FROM account JOIN status USING (id)) AS
> > account;
> >
> > Enum might be good for a short list of items but something like the
> > above should be good for any common value that we manually create
> > surrogate keys for today but without the clutter or the application
> > needing to know.
> >
> > If PostgreSQL had an updatable view implementation it would be pretty
> > simple to implement.
>
> I'm not quite following the WITH SURROGATE bit, but what you've
> described certainly looks valuable. Note that I would still want to be
> able to get at the raw numeric values in some fashion.

The basic idea is that most of us break out schemas by creating fake primary keys for the purpose of obtaining performance, because using the proper primary key (single or multiple columns) is often very slow.

The automatic and transparent creation of a surrogate key by PostgreSQL would allow us to dramatically clean up the presentation of our schema to the users of the database without the performance hit we currently get.

It puts surrogate keys (fake primary keys) back at the level of table spaces, indexes and other performance enhancements, where they belong.

--
Jim C. Nasby wrote:
> >>But why force a re-write of the entire table just to change the name of
> >>something?
> >
> >Because you are not just changing the name of something.
>
> No, I was referring specifically to the case of wanting to rename
> something. IE: you set up an enum for sky colors (blue, black), and then
> the PHB issues an edict that the daytime sky is now green. In this case
> you (or at least I) don't want to define a new enum, I just want to
> change 'blue' to 'green' in that enum. There's no reason it needs to hit
> the table at all.

Well, with enumkit you can't, because the values are hardwired in the .so file. With a builtin facility you would be able to, because the values would live in the catalog. However, hacking the catalog is not something I would encourage - what you are suggesting basically breaks the abstraction. But sure, it would be possible. I would not provide an SQL level facility to do it, though. My approved way to do it would be like the example I gave earlier.

cheers

andrew
On Fri, Oct 28, 2005 at 02:57:03PM -0400, Rod Taylor wrote:
> The basic idea is that most of us break out schemas by creating fake
> primary keys for the purpose of obtaining performance because using the
> proper primary key (single or multiple columns) is often very slow.
>
> The automatic and transparent creation of a surrogate key by PostgreSQL
> would allow us to dramatically clean up the presentation of our schema
> to the users using the database without the performance hit we currently
> get.
>
> It puts surrogate keys (fake primary keys) back to the level of table
> spaces, indexes and other performance enhancements where they belong.

Ahh. Yes, that would definitely be great to have, although it would probably take me months if not years to get used to not seeing a bunch of _id fields lying all over the place...

Is SURROGATE part of any of the ANSI specs?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Fri, Oct 28, 2005 at 04:12:01PM -0400, Andrew Dunstan wrote:
> Well, with enumkit you can't, because the values are hardwired in the
> .so file. With a builtin facility you would be able to, because the
> values would live in the catalog. However, hacking the catalog is not
> something I would encourage - what you are suggesting basically breaks
> the abstraction. But sure, it would be possible. I would not provide an
> SQL level facility to do it, though. My approved way to do it would be
> like the example I gave earlier.

Why not allow renaming, though? It seems like a logical feature to have, and an easy one to add. What am I missing?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
Jim C. Nasby wrote:
>On Fri, Oct 28, 2005 at 04:12:01PM -0400, Andrew Dunstan wrote:
>>Well, with enumkit you can't, because the values are hardwired in the
>>.so file. With a builtin facility you would be able to, because the
>>values would live in the catalog. However, hacking the catalog is not
>>something I would encourage - what you are suggesting basically breaks
>>the abstraction. But sure, it would be possible. I would not provide an
>>SQL level facility to do it, though. My approved way to do it would be
>>like the example I gave earlier.
>
>Why not allow renaming though? It seems like a logical feature to have,
>and an easy one to add. What am I missing?

That it is not changing a name but a value. It's roughly the equivalent of inserting a new digit between 3 and 4. Your "feature" breaks the abstraction I am trying to implement.

cheers

andrew
On Fri, 2005-10-28 at 15:21 -0500, Jim C. Nasby wrote:
> On Fri, Oct 28, 2005 at 02:57:03PM -0400, Rod Taylor wrote:
> > The basic idea is that most of us break out schemas by creating fake
> > primary keys for the purpose of obtaining performance because using the
> > proper primary key (single or multiple columns) is often very slow.
> >
> > The automatic and transparent creation of a surrogate key by PostgreSQL
> > would allow us to dramatically clean up the presentation of our schema
> > to the users using the database without the performance hit we currently
> > get.
> >
> > It puts surrogate keys (fake primary keys) back to the level of table
> > spaces, indexes and other performance enhancements where they belong.
>
> Ahh. Yes, that would definitely be great to have. Although it would
> probably take me months if not years to get used to not seeing a bunch
> of _id fields lying all over the place...
>
> Is SURROGATE part of any of the ANSI specs?

No, but neither is an index, rollback segment, or table space. The ANSI spec doesn't usually deal with performance tweaks that are the responsibility of the DBA.

--
On Fri, Oct 28, 2005 at 04:36:26PM -0400, Rod Taylor wrote:
> On Fri, 2005-10-28 at 15:21 -0500, Jim C. Nasby wrote:
> > On Fri, Oct 28, 2005 at 02:57:03PM -0400, Rod Taylor wrote:
> > > The basic idea is that most of us break out schemas by creating fake
> > > primary keys for the purpose of obtaining performance because using the
> > > proper primary key (single or multiple columns) is often very slow.
> > >
> > > The automatic and transparent creation of a surrogate key by PostgreSQL
> > > would allow us to dramatically clean up the presentation of our schema
> > > to the users using the database without the performance hit we currently
> > > get.
> > >
> > > It puts surrogate keys (fake primary keys) back to the level of table
> > > spaces, indexes and other performance enhancements where they belong.
> >
> > Ahh. Yes, that would definitely be great to have. Although it would
> > probably take me months if not years to get used to not seeing a bunch
> > of _id fields lying all over the place...
> >
> > Is SURROGATE part of any of the ANSI specs?
>
> No, but neither is an index, rollback segment, or table space. The ANSI
> spec doesn't usually deal with performance tweaks that are the
> responsibility of the DBA.

True, but none of those other things you mention affects the external representation of data. I was more wondering whether we were inventing syntax on the fly here or not...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Fri, 2005-10-28 at 16:28 -0500, Jim C. Nasby wrote:
> On Fri, Oct 28, 2005 at 04:36:26PM -0400, Rod Taylor wrote:
> > On Fri, 2005-10-28 at 15:21 -0500, Jim C. Nasby wrote:
> > > On Fri, Oct 28, 2005 at 02:57:03PM -0400, Rod Taylor wrote:
> > > > The basic idea is that most of us break out schemas by creating fake
> > > > primary keys for the purpose of obtaining performance because using the
> > > > proper primary key (single or multiple columns) is often very slow.
> > > >
> > > > The automatic and transparent creation of a surrogate key by PostgreSQL
> > > > would allow us to dramatically clean up the presentation of our schema
> > > > to the users using the database without the performance hit we currently
> > > > get.
> > > >
> > > > It puts surrogate keys (fake primary keys) back to the level of table
> > > > spaces, indexes and other performance enhancements where they belong.
> > >
> > > Ahh. Yes, that would definitely be great to have. Although it would
> > > probably take me months if not years to get used to not seeing a bunch
> > > of _id fields lying all over the place...
> > >
> > > Is SURROGATE part of any of the ANSI specs?
> >
> > No, but neither is an index, rollback segment, or table space. The ANSI
> > spec doesn't usually deal with performance tweaks that are the
> > responsibility of the DBA.
>
> True, but none of those other things you mention affects the external
> representation of data. I was more wondering whether we were inventing
> syntax on the fly here or not...

It isn't supposed to impact the external representation of the data, and generally neither is an ENUM, outside of the potential sorting ability. I was just getting the impression that the big push for enums was to be able to use a 'real word' but without a performance hit.

A regular old table with a foreign key to a varchar gives you the 'real word', and the surrogate key allows you to do so without a performance hit.

--
On Fri, Oct 28, 2005 at 06:10:26PM -0400, Rod Taylor wrote:
> It isn't supposed to impact the external representation of the data and
> generally neither is an ENUM outside of the potential sorting ability. I
> was just getting the impression that the big push for enums was to be
> able to use a 'real word' but without a performance hit.
>
> A regular old table, foreign key to a varchar gives you the 'real word'
> and the surrogate key allows you to do so without a performance hit.

I think there are probably good use cases for each. If you've got something small like a status field, 'enum' might be better; for bigger things, SURROGATE could be nice syntactic sugar.

Now that I finally understand what Andrew's been getting at with enums, I'm wondering if we might want to expand on the typical usage a bit. Looking at a plain old C enum, you're just representing some magic labels with a number to save space - things like, say, SLRU_PAGE_CLEAN or SLRU_PAGE_READ_IN_PROGRESS. Those names are great from a code standpoint, but they're not something you'd typically want to display to the user. So IMHO a useful extension would be to allow enums to contain both a 'machine name' and a 'human name', where the human name could be renamed freely.

To put this in a more concrete example: I hate the default priorities that ship with bugzilla, P1 - P5. Is 1 high or is 5? So I always rename them to Very Low, Low ... Very High. That means making changes both to the database and to the code. But if Bugzilla were using my idea of an enum, the code would refer to priorities as P1...P5 (or whatever else they wanted to call them) and I could easily change the human names to something that can't be confused.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
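One way to sketch that machine-name/human-name idea with plain SQL (purely illustrative; nothing like this exists in the enumkit, and the table and column names are hypothetical): the stable machine name is what code and referencing tables use, while the display label is an ordinary column that can be renamed at will.

```sql
-- Illustrative only: stable machine names plus freely renameable labels.
CREATE TABLE priority_enum (
    code  text PRIMARY KEY,          -- machine name, referenced by code and data
    label text NOT NULL,             -- human name, shown to users
    seq   smallint NOT NULL UNIQUE   -- enumeration order
);
INSERT INTO priority_enum VALUES ('P1', 'Very High', 1);
INSERT INTO priority_enum VALUES ('P2', 'High', 2);

CREATE TABLE bug (
    summary  text,
    priority text REFERENCES priority_enum (code)
);

-- Renaming the human label touches one row; referencing tables never change.
UPDATE priority_enum SET label = 'Urgent' WHERE code = 'P1';
```

Since only the label column changes, the rename never forces a table rewrite - which is exactly the property being argued about above.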
Rod Taylor <pg <at> rbt.ca> writes:
> The basic idea is that most of us break out schemas by creating fake
> primary keys for the purpose of obtaining performance because using the
> proper primary key (single or multiple columns) is often very slow.

This is one thing I simply can't understand.

If you still declare the natural key(s) as UNIQUEs, you have just made performance worse. Now there are two keys to be checked on UPDATEs and INSERTs, two indexes to be updated, and probably a SEQUENCE too. If you don't, you have just thrown away centralised, optimised integrity checking, and will probably have to remember to do a slower SELECT before updating.

Certainly decoupling presentation from storage would be nice, but even before that, generalised use of surrogate keys seems to me a knee-jerk reaction.
On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote:
> If you still declare the natural key(s) as UNIQUEs, you have just made
> performance worse. Now there are two keys to be checked on UPDATEs and
> INSERTs, two indexes to be updated, and probably a SEQUENCE too.

For UPDATEs and INSERTs the "proper" primary key also needs to be checked, but keys are used for more than just checking uniqueness: they're also often used in JOINs. Joining against a single integer is, I'd think, quite a different proposition (faster, in terms of performance) than joining against, say, a text column or a composite key.

Michael Glaesemann
grzm myrealbox com
Michael Glaesemann wrote:
> On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote:
>
>> If you still declare the natural key(s) as UNIQUEs, you have just made
>> performance worse. Now there are two keys to be checked on UPDATEs and
>> INSERTs, two indexes to be updated, and probably a SEQUENCE too.
>
> For UPDATEs and INSERTs, the "proper" primary key also needs to be
> checked, but keys are used for more than just checking uniqueness:
> they're also often used in JOINs. Joining against a single integer I'd
> think it quite a different proposition (I'd think faster in terms of
> performance) than joining against, say, a text column or a composite key.

Well, this is a balancing decision. You certainly slow down inserts, and you may also increase the load because you have to translate between the different keys. It also depends on the join type you end up doing, and obviously on how large your original primary key is. Whatever your situation, though, make sure you do not end up doing premature optimization.

regards,
Lukas
On Fri, 2006-01-13 at 12:42 +0000, Leandro Guimarães Faria Corcete DUTRA wrote:
> Rod Taylor <pg <at> rbt.ca> writes:
> > The basic idea is that most of us break out schemas by creating fake
> > primary keys for the purpose of obtaining performance because using the
> > proper primary key (single or multiple columns) is often very slow.
>
> This is one thing I simply can't understand.
>
> If you still declare the natural key(s) as UNIQUEs, you have just made
> performance worse. Now there are two keys to be checked on UPDATEs and
> INSERTs, two indexes to be updated, and probably a SEQUENCE too.

Indeed. Using a surrogate key is not free, and that is why it would be something the DBA would specify during table creation.

The main goal would be to give the option of using a surrogate key without being forced to expose it to the applications using the database. It is a feature akin to table spaces in that it can help performance without the application or standard users knowing why.

--
Rod Taylor wrote:
>> If you still declare the natural key(s) as UNIQUEs, you have just made
>> performance worse. Now there are two keys to be checked on UPDATEs and
>> INSERTs, two indexes to be updated, and probably a SEQUENCE too.
>
> Indeed. Using a surrogate key is not free and that is why it would be
> something the DBA would specify during table creation.
>
> The main goal would be to give the option of using a surrogate key
> without being forced to expose it to the applications using the
> database. It is a feature akin to table spaces in that it can help
> performance but without the application or standard users knowing why.

Just this morning my father came to me (he is getting into SQL now that he is retired) with an issue where a surrogate key probably makes sense. He is storing a tree of plant families that can get fairly deep. The primary key is the scientific name. In order to improve performance and get rid of the recursive lookups he currently does, he now wants to use materialized paths (<parent name>/<sub name>/<sub sub name>). He decided not to go with nested sets, since they make it very hard to hand-fix things in the tree structure.

Obviously, using the scientific name in the materialized paths can quickly give you a really wide column if you have a fairly deep tree. In that case it could be beneficial to introduce a surrogate key. The only annoying bit is that he frequently needs to sync with an external database where they use no surrogate key, so the import slows down because he needs to check whether a surrogate key has already been introduced for every given scientific name before writing to the database.

regards,
Lukas
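A sketch of the materialized-path layout described above, with a surrogate integer key keeping the path column narrow (all table and column names here are made up for illustration):

```sql
-- Illustrative only: surrogate ids in the path instead of scientific names.
CREATE TABLE plant (
    plant_id serial PRIMARY KEY,    -- surrogate key used inside paths
    sci_name text NOT NULL UNIQUE,  -- natural key shared with the external database
    path     text NOT NULL          -- e.g. '1/17/242' rather than joined names
);

-- All descendants of the node whose path is '1/17':
SELECT sci_name FROM plant WHERE path LIKE '1/17/%';

-- The sync-time cost Lukas mentions: each incoming row must first be
-- resolved from its natural key to the surrogate id (or a new id assigned).
SELECT plant_id FROM plant WHERE sci_name = 'Rosaceae';
```

With names in the path, a deep tree multiplies the width of every row; with ids, the path stays a few bytes per level, at the cost of the extra lookup during import.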
On Fri, Jan 13, 2006 at 12:42:55PM +0000, Leandro Guimarães Faria Corcete DUTRA wrote:
> Rod Taylor <pg <at> rbt.ca> writes:
> > The basic idea is that most of us break out schemas by creating fake
> > primary keys for the purpose of obtaining performance because using the
> > proper primary key (single or multiple columns) is often very slow.
> This is one thing I simply can't understand.
> If you still declare the natural key(s) as UNIQUEs, you have just made
> performance worse. Now there are two keys to be checked on UPDATEs and
> INSERTs, two indexes to be updated, and probably a SEQUENCE too.

Not to completely defend the practice - but in some applications, INSERT is much less frequent than UPDATE, and the fact that UPDATE requires a unique check on the primary key and the surrogate key, as well as an update, should be considered (and I believe is considered) a PostgreSQL performance bug. It's undesirable and unnecessary behaviour for the majority of uses (where the key does not change as part of the update).

> Certainly decoupling presentation from storage would be nice, but even before
> that generalised use of surrogate keys seems to me a knee-jerk reaction.

Yes, I agree. As per a previous thread, I'm one of those using it to generalize my query / update implementation into common base code. I have other reasons - but I confess to this being the real reason. In my case, the cost of maintaining the code that queries / updates is more expensive than the cost of having an extra unique index, and the storage and performance impacts this has on my data. :-)

Is my primary reason good on its own, without the other more legitimate justifications? It's good enough for me. I expect others to strongly disagree.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Ottawa, Ontario, Canada
One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them... http://mark.mielke.cc/
On Sat, Jan 14, 2006 at 11:06:07AM -0500, mark@mark.mielke.cc wrote:
> Not to completely defend the practice - but in some applications,
> INSERT is much less frequent than UPDATE, and the fact that UPDATE
> requires a unique check on the primary key and the surrogate key, as
> well as an update, should be considered (and I believe is considered)
> a PostgreSQL performance bug. It's undesirable and unnecessary
> behaviour for the majority of uses (where the key does not change as
> part of the update).

Unique check? An index is an index, and when you do an UPDATE the new tuple has to be added to the index. At this point it doesn't matter if the index is unique or not; all indexes cost something.

Since after the UPDATE the tuple with that primary key appears two (or more) times in the table, a check needs to be made that they don't overlap timewise. Are you claiming you could avoid this check and still guarantee correctness in the face of concurrent transactions?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
On Sat, Jan 14, 2006 at 09:33:39PM +0100, Martijn van Oosterhout wrote:
> On Sat, Jan 14, 2006 at 11:06:07AM -0500, mark@mark.mielke.cc wrote:
> > Not to completely defend the practice - but in some applications,
> > INSERT is much less frequent than UPDATE, and the fact that UPDATE
> > requires a unique check on the primary key and the surrogate key, as
> > well as an update, should be considered (and I believe is considered)
> > a PostgreSQL performance bug. It's undesirable and unnecessary
> > behaviour for the majority of uses (where the key does not change as
> > part of the update).
>
> Unique check? An index is an index and when you do an UPDATE the new
> tuple has to be added to the index. At this point it doesn't matter if
> the index is unique or not, all indexes cost something.
>
> Since after the UPDATE the tuple with that primary key appears two (or
> more) times in the table, a check needs to be made that they don't
> overlap timewise. Are you claiming you could avoid this check and still
> guarantee correctness in the face of concurrent transactions?

I'm claiming that I agree with this TODO item:

- Prevent index uniqueness checks when UPDATE does not modify the column

  Uniqueness (index) checks are done when updating a column even if the
  column is not modified by the UPDATE.

Definitely, the check is unnecessary. If it was unique before we made the change, we know it will be unique after we've made the change. The check shouldn't be performed for the primary key, or for the surrogate key, if neither of these keys is modified in any way.

Perhaps you are challenging my addition of the phrase "as well as an update", with a hint on my part that I feel the update is unnecessary as well. I may have been wrong to add these 5 words. The MVCC implementation has numerous costs, and perhaps this is one of them that cannot be avoided. :-(

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Leandro Guimarães Faria Corcete DUTRA <leandro@dutra.fastmail.fm> writes:

> Certainly decoupling presentation from storage would be nice, but even before
> that generalised use of surrogate keys seems to me a knee-jerk reaction.

I hate knee-jerk reactions too, but just think of all the pain of people dealing with databases where they used Social Security numbers for primary keys. I would never use an attribute that represents some real-world datum as a primary key any more.

In my experience there are very few occasions where I want a real non-sequence-generated primary key. I've never regretted having a sequence-generated primary key, and I've certainly had occasions to regret not having one.

--
greg
On Sat, Jan 14, 2006 at 07:28:21PM +0900, Michael Glaesemann wrote:
> On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote:
>
> > If you still declare the natural key(s) as UNIQUEs, you have just made
> > performance worse. Now there are two keys to be checked on UPDATEs
> > and INSERTs, two indexes to be updated, and probably a SEQUENCE too.
>
> For UPDATEs and INSERTs, the "proper" primary key also needs to be
> checked, but keys are used for more than just checking uniqueness:
> they're also often used in JOINs. Joining against a single integer
> I'd think it quite a different proposition (I'd think faster in terms
> of performance) than joining against, say, a text column or a
> composite key.

a) the optimizer does a really poor job on multi-column index statistics

b) If each parent record will have many children, the space savings from using a surrogate key can be quite large

c) depending on how you view things, putting actual keys all over the place is denormalized

Generally, I just use surrogate keys for everything unless performance dictates something else.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
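Point (b) is easy to see on paper. In a hypothetical order/line-item pair (names invented for the sketch), every child row repeats its parent's key, so the narrower the key, the smaller the child table and its indexes:

```sql
-- With the natural composite key, every line item carries the full
-- (customer_name, po_number) pair:
CREATE TABLE purchase_order_nat (
    customer_name text,
    po_number     text,
    PRIMARY KEY (customer_name, po_number)
);
CREATE TABLE line_item_nat (
    customer_name text,
    po_number     text,
    line_no       integer,
    PRIMARY KEY (customer_name, po_number, line_no),
    FOREIGN KEY (customer_name, po_number) REFERENCES purchase_order_nat
);

-- With a surrogate key, each line item carries a single 4-byte integer,
-- and the natural key is still enforced on the parent:
CREATE TABLE purchase_order_sur (
    po_id         serial PRIMARY KEY,
    customer_name text,
    po_number     text,
    UNIQUE (customer_name, po_number)
);
CREATE TABLE line_item_sur (
    po_id   integer REFERENCES purchase_order_sur,
    line_no integer,
    PRIMARY KEY (po_id, line_no)
);
```

The wide-key cost is paid once per child row and again in every index on those columns, which is why the savings grow with the fan-out.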
Jim C. Nasby <jnasby <at> pervasive.com> writes:

> a) the optimizer does a really poor job on multi-column index statistics

So it should be fixed? And there are a *lot* of singular, natural keys.

> b) If each parent record will have many children, the space savings from
> using a surrogate key can be quite large

Not such a common case.

> c) depending on how you view things, putting actual keys all over the
> place is denormalized

How come? Never!

> Generally, I just use surrogate keys for everything unless performance
> dictates something else.

What I am proposing is the reverse: use natural keys for everything unless performance dictates something else.

In support of my PoV:
http://blogs.ittoolbox.com/database/soup/archives/007327.asp?rss=1
Greg Stark <gsstark <at> mit.edu> writes:

> I hate knee-jerk reactions too, but just think of all the pain of people
> dealing with databases where they used Social Security numbers for primary
> keys. I would never use an attribute that represents some real-world datum as
> a primary key any more.

I am not familiar with the situation.

> In my experience there are very few occasions where I want a real non-sequence
> generated primary key. I've never regretted having a sequence generated
> primary key, and I've certainly had occasions to regret not having one.

http://blogs.ittoolbox.com/database/soup/archives/007327.asp?rss=1
Andrew Dunstan <andrew <at> dunslane.net> writes:

> If people would like to play, I have created a little kit to help in
> creating first class enum types in a few seconds.

Isn't what we actually want possreps?
On Wed, Jan 18, 2006 at 01:08:53PM +0000, Leandro Guimarães Faria Corcete DUTRA wrote:
> Jim C. Nasby <jnasby <at> pervasive.com> writes:
> > Generally, I just use surrogate keys for everything unless performance
> > dictates something else.
>
> What I am proposing is the reverse: use natural keys for everything unless
> performance dictates something else.
>
> In support of my PoV:
> http://blogs.ittoolbox.com/database/soup/archives/007327.asp?rss=1

Interesting. However, in my experience very few things have "natural keys". There are no combinations of attributes for people, phone calls or even real events that make useful natural keys.

You don't say what the primary key on your events table was, but I can see one possibility: (place, datetime). A unique constraint on this won't prevent overlapping events. Sure, it'll get rid of the obvious duplicates but won't solve the problem. It also fails the criterion that keys be stable, since you can move events. You do need a constraint on that table, but a unique constraint isn't it.

While I agree with your statement that it's the abuse of these keys that's the problem, I find people are far too likely to see natural keys where none exist.

BTW, the way I deal with people mixing up surrogate keys is by (usually by chance) having the sequences for different tables start at wildly different points. By starting one counter at a million and the other at one, the chances that you'll mix them up are reduced. On some systems I can even identify the table a key comes from by looking at the number, just because I know only one table has keys in the 30,000 range.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
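Martijn's overlap point can be demonstrated concretely. A unique constraint on (place, starts_at) happily accepts two events occupying the same room at overlapping times (schema invented for the sketch):

```sql
CREATE TABLE events (
    place     text,
    starts_at timestamp,
    ends_at   timestamp,
    title     text,
    UNIQUE (place, starts_at)   -- the would-be "natural key"
);

-- Both rows are accepted: starts_at differs, so the unique constraint
-- is satisfied, yet the two events overlap in the same room.
INSERT INTO events VALUES
    ('Room A', '2006-01-20 09:00', '2006-01-20 11:00', 'Board meeting'),
    ('Room A', '2006-01-20 10:00', '2006-01-20 12:00', 'Training');
```

What the table actually needs is an interval-overlap check (typically a trigger; later PostgreSQL releases grew exclusion constraints for exactly this), which no unique index can express.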
Leandro Guimarães Faria Corcete DUTRA wrote:
> Andrew Dunstan <andrew <at> dunslane.net> writes:
>
> > If people would like to play, I have created a little kit to help in
> > creating first class enum types in a few seconds.
>
> Isn't what we actually want possreps?

You appear to be responding to mail from months ago. Please catch up before replying, so we don't rehash old discussions.

As previously discussed, I intend to do first class enums for the next release of postgres, if I get enough time. Enumkit was just a very small step along the research road, although it is useful in itself, which is why I released it.

cheers

andrew
On Jan 18, 2006, at 22:08 , Leandro Guimarães Faria Corcete DUTRA wrote:
> Jim C. Nasby <jnasby <at> pervasive.com> writes:
>
> > a) the optimizer does a really poor job on multi-column index
> > statistics
>
> So it should be fixed?

Of course! Patches welcome!

Michael Glaesemann
grzm myrealbox com
Leandro Guimarães Faria Corcete DUTRA <leandro@dutra.fastmail.fm> writes:
> Greg Stark <gsstark <at> mit.edu> writes:
>
> > I hate knee-jerk reactions too, but just think of all the pain of people
> > dealing with databases where they used Social Security numbers for primary
> > keys. I would never use an attribute that represents some real-world datum as
> > a primary key any more.
>
> I am not familiar with the situation.

The US gov't handed out unique numbers to every worker for their old age pension program. Many early database designers thought that made a wonderful natural primary key. It turns out that:

a) not everyone has a social insurance number: when their business expanded to include foreign nationals, these databases had to make up fake social insurance numbers.

b) Occasionally people's social insurance numbers change, either because they got it wrong in the first place or because of identity theft later on. Even dealing with it changing isn't good enough, because the old records don't disappear; the person essentially has *two* social insurance numbers.

c) For security reasons it turns out to be a bad idea to be passing around social insurance numbers in the first place. So these database designers had a major problem adapting when people started refusing to give them social insurance numbers, or complaining when their application leaked their social insurance number.

In short, what seemed like the clearest possible example of a natural primary key became a great example of how hard it is to deal with changing business requirements when you've tied your database design to the old rules. Using natural primary keys makes an iron-clad design assumption that the business rules surrounding that datum will never change. And the one thing constant in business is that business rules change.

In the past I've used "username" as a primary key for a users table; what could be safer? Later we had to create a sequence-generated userid column because some data partners couldn't handle a text column without corrupting it. And of course one day the question arose whether we could handle someone wanting to change their username. Then another day we were asked whether we could have two different people with the same username if they belonged to separate branded subsites.

--
greg
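Greg's last requirement (the same username allowed on different branded subsites) is exactly where the earlier surrogate key pays off: the natural key quietly changed from (username) to (subsite, username), and only one constraint has to move, because foreign keys reference the stable userid rather than the name. A minimal sketch with invented names:

```sql
-- Before, username itself was the primary key, so every referencing
-- table froze in the rule "usernames are globally unique".

-- After: child tables reference a stable surrogate id, and the business
-- rule lives in a single UNIQUE constraint that can evolve.
CREATE TABLE users (
    user_id  serial PRIMARY KEY,
    subsite  text NOT NULL,
    username text NOT NULL,
    UNIQUE (subsite, username)   -- was UNIQUE (username)
);

CREATE TABLE posts (
    post_id integer PRIMARY KEY,
    user_id integer REFERENCES users,  -- unaffected by the rule change
    body    text
);
```

Renaming a user, or loosening the uniqueness rule again later, now touches one table instead of every table that joins to it.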
On Wed, Jan 18, 2006 at 01:08:53PM +0000, Leandro Guimarães Faria Corcete DUTRA wrote:
> > b) If each parent record will have many children, the space savings from
> > using a surrogate key can be quite large
>
> Not such a common case.

Hmmm... Many blog entries per user... Many blog comments per entry... Many PO's per customer... Many line items per PO... Etc., etc. I would argue that one-many relationships are far more common than one-one, and it's very common for an integer ID to be a more compact representation than a real key.

> > c) depending on how you view things, putting actual keys all over the
> > place is denormalized
>
> How come? Never!

Huh? One of the tenets of normalization is that you don't repeat data. You don't use customer name in your PO table, because it's asking for problems; what if a customer changes names (as just one example)?

> > Generally, I just use surrogate keys for everything unless performance
> > dictates something else.
>
> What I am proposing is the reverse: use natural keys for everything unless
> performance dictates something else.
>
> In support of my PoV:
> http://blogs.ittoolbox.com/database/soup/archives/007327.asp?rss=1

Read the bottom of it:

"I am not saying that you should avoid autonumber surrogate keys like an SCO executive. The danger is not in their use but in their abuse. The "events_id" column in the "events" table didn't give us any trouble until we began to rely on it as the sole key for the table. The accounting application gave us problems because we were using the ID as the entire handle for the records. That crossed the line from use to misuse, and we suffered for it."

To paraphrase, the issue isn't that surrogate keys were used for RI; the issue is that proper keys were not set up to begin with. Does it make sense to have a customer table where customer_name isn't unique? Almost certainly not. But that's just one possible constraint you might put on that table.

To put words in Josh's mouth, the issue isn't with using a surrogate key, it's with not thinking about what constraints you should be placing on your data. Take a look at cbk's comment; he does a great job of summing the issue up.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
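The position Jim is paraphrasing reduces to one line of DDL: reference the surrogate key, but still declare the natural key. A hypothetical customer table:

```sql
CREATE TABLE customer (
    customer_id   serial PRIMARY KEY,  -- surrogate: used by foreign keys
    customer_name text NOT NULL,
    UNIQUE (customer_name)             -- natural key: still enforced
);
-- Misuse begins when the UNIQUE line is omitted and customer_id
-- becomes "the entire handle for the records": duplicates can then
-- creep in with nothing in the schema to catch them.
```

The surrogate key buys compact references; the UNIQUE constraint is what keeps the table matched to reality, and dropping it is the abuse the quoted article warns about.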
Martijn,

> Interesting. However, in my experience very few things have "natural
> keys". There are no combination of attributes for people, phone calls
> or even real events that make useful natural keys.

I certainly hope that I never have to pick up one of your projects. A table without a natural key is a data management disaster. Without a key, it's not data, it's garbage.

--
Josh Berkus
Aglio Database Solutions
San Francisco
On Wed, Jan 18, 2006 at 03:58:50PM -0800, Josh Berkus wrote:
> Martijn,
>
> > Interesting. However, in my experience very few things have "natural
> > keys". There are no combination of attributes for people, phone calls
> > or even real events that make useful natural keys.
>
> I certainly hope that I never have to pick up one of your projects. A
> table without a natural key is a data management disaster. Without a
> key, it's not data, it's garbage.

??? Please provide natural keys for any of the following:

- A person
- A phone call: (from, to, date, time, duration) is not enough
- A physical address
- A phone line (phone numbers aren't unique over time)
- An internet account (usernames aren't unique over time either)

In any of these, misspellings or changes of names, ownership or even structure over time render the obvious choices useless as keys. There are techniques for detecting and reducing duplication, but the point is that for any of these, duplicates *can* be valid data.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn,

> In any of these either misspellings, changes of names, ownership or
> even structure over time render the obvious useless as keys. There are
> techniques for detecting and reducing duplication but the point is that
> for any of these duplicates *can* be valid data.

Please point me out where, in the writings of E.F. Codd or in the SQL Standard, it says that keys have to be immutable for the life of the row.

Duplicate *values* can be valid data. Duplicate *tuples* show some serious flaws in your database design. If you have a personnel directory on which you've not bothered to define any unique constraints other than the ID column, then you can't match your data to reality. If you have two rows with the same first and last name, you don't know if they are two different people or the same person, duplicated. Which will be a big problem come paycheck time.

Per E.F. Codd, each tuple is a *unique* predicate (or key) comprising a set of values defining a *unique* data entity, i.e. "the employee named "John" "Little" at extension "4531"". There is nothing anywhere said about keys never changing.

This is Databases 101 material. Really!

--Josh
On Thu, Jan 19, 2006 at 10:09:26AM -0800, Josh Berkus wrote:
> Martijn,
>
> > In any of these either misspellings, changes of names, ownership or
> > even structure over time render the obvious useless as keys. There are
> > techniques for detecting and reducing duplication but the point is that
> > for any of these duplicates *can* be valid data.
>
> Please point me out where, in the writings of E.F. Codd or in the SQL
> Standard, it says that keys have to be immutable for the life of the row.

Possibly nowhere. But when you send invoices to customers, any details on there *are* immutable. Sure, in your database you don't care if things change, but then they don't match reality anymore, do they?

> Duplicate *values* can be valid data. Duplicate *tuples* show some
> serious flaws in your database design. If you have a personnel
> directory on which you've not bothered to define any unique constraints
> other than the ID column, then you can't match your data to reality. If
> you have two rows with the same first and last name, you don't know if
> they are two different people or the same person, duplicated. Which
> will be a big problem come paycheck time.

I never said there were duplicate tuples, just that the data has no natural keys. The tuples are unique because there's a surrogate key. It is entirely possible to have two people with the same first name, last name and date of birth. Rather uncommon, but the database must be able to support it.

I don't understand your example though. If you have a personnel directory with two rows with the same first and last name, what does that tell you? Nothing. You have to go find out whether there really are two of those people or not. You can simplify the process by taking into account the fact that it's very unlikely, but a unique constraint is not the answer. Besides, it's far more likely the same person will appear twice with two different spellings of their name. :)

Anyway, the discussion was about surrogate vs natural keys. Nothing here has convinced me that there are any useful natural keys to be found in the examples I gave. Most of the examples I gave come from a system I had to maintain where some designer had assumed there was some kind of natural key, and in *each* and *every* case it caused problems...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Thu, 2006-01-19 at 22:29 +0100, Martijn van Oosterhout wrote:
> Possibly nowhere. But when you send invoices to customers, any details
> on there *are* immutable. Sure, in your database you don't care if
> things change, but then they don't match reality anymore do they?

Then what you need is a temporal database, or at least some form of historical records. Nothing to do with keys in themselves.

> I never said there were duplicate tuples, just that the data has no
> natural keys. The tuples are unique because there's a surrogate key.

This does not guarantee uniqueness, as the key is artificially and internally generated.

> It is entirely possible to have two people with the same first name,
> last name and date of birth. Rather uncommon, but the database must
> be able to support it.

And the way to support it is to take into account additional data (place of birth, parents' data etc.) as part of the candidate keys. Not to allow duplicates.

> I don't understand your example though. If you have a personnel
> directory with two rows with the same first and last name, what does
> that tell you. Nothing. You have to go find out whether there really
> are two of those people or not.

And how will you do that if you don't store additional data?

> You can simplify the process by taking
> into account the fact that it's very unlikely, but a unique constraint
> is not the answer.

Oh yes, it is. The only one.

> Besides, it's far more likely the same person will
> appear twice with two different spellings of their name. :)

So what?

--
+55 (11) 5685 2219           xmpp:leandrod@jabber.org
+55 (11) 9406 7191           Yahoo!: lgcdutra
+55 (11) 5686 9607           MSN: leandro@dutra.fastmail.fm
+55 (11) 4390 5383           ICQ/AIM: 61287803
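The "historical records" approach Leandro mentions is commonly done by snapshotting: the invoice stores a copy of the details as they were at issue time, so later corrections to the customer row cannot rewrite history. A minimal sketch with invented names:

```sql
CREATE TABLE customer (
    customer_id serial PRIMARY KEY,
    name        text NOT NULL,
    address     text NOT NULL
);

CREATE TABLE invoice (
    invoice_no     integer PRIMARY KEY,
    customer_id    integer REFERENCES customer,
    billed_name    text NOT NULL,   -- copied from customer at issue time,
    billed_address text NOT NULL,   -- so later edits can't change what was sent
    issued_on      date NOT NULL
);
```

This sidesteps the mutability argument on both sides: the customer row keeps whatever key policy you like and may change freely, while the invoice keeps the immutable facts it needs.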