Thread: How to modify ENUM datatypes?
All, In the past I have used foreign keys to lookup tables for small lists of values that I now think ENUM could do the job of. I was hoping that by using ENUM, I would avoid having to do joins in my queries, and that I'd be making the data more robust and faster. I used to have a table for account_status: A | Active B | Billing Failed C | Closed D | Deactivated account.acct_type CHAR references account_type.acct_type CHAR But, now I've converted that to an ENUM: ACTIVE BILLING_FAILED CLOSED DEACTIVATED account.acct_type ENUM account_type The problem is that once I create a column in my account table that uses this 'account_type' datatype, I can't seem to change or add to it any more. I want to add a new value or edit/delete an existing one. How do you make changes to an ENUM datatype that is already in use? -- Dante
D. Dante Lorenso wrote: > > The problem is that once I create a column in my account table that > uses this 'account_type' datatype, I can't seem to change or add to it > any more. I want to add a new value or edit/delete an existing one. > > How do you make changes to an ENUM datatype that is already in use? > As far as I know ENUM is not well suited to uses where new enumeration members may be added later. A lookup table and a foreign key is probably better for this sort of use. You *can* add elements to an enum type - sort of - by creating a new type and converting columns. It's ugly, though, and will be hard to get right when the column of interest is referenced by foreign keys and such. One way to do it if you really must: -- Starting state CREATE TYPE et1 AS ENUM('yes','no'); CREATE TABLE testtab ( a et ); INSERT INTO testtab (a) values ('yes'); -- Change CREATE TYPE et2 AS ENUM('yes','no','filenotfound'); ALTER TABLE testtab ALTER COLUMN a TYPE et2 USING ( CASE a WHEN 'yes' THEN 'yes'::et2 WHEN 'no' THEN 'no'::et2 END ); -- Alternative ALTER that's suitable if you're not removing anything from the enum ALTER TABLE testtab ALTER COLUMN a TYPE et2 USING ( a::text::et2 ); Personally, though, I'd stick to the good 'ol lookup table and foreign key. -- Craig Ringer
On Tue, 22 Apr 2008 20:58:30 +0800 Craig Ringer wrote: > D. Dante Lorenso wrote: > > > > The problem is that once I create a column in my account table that > > uses this 'account_type' datatype, I can't seem to change or add to it > > any more. I want to add a new value or edit/delete an existing one. > > > > How do you make changes to an ENUM datatype that is already in use? > > > As far as I know ENUM is not well suited to uses where new enumeration > members may be added later. A lookup table and a foreign key is probably > better for this sort of use. I remember the discussions before PG implemented ENUMs at all - some people voted against this "feature" because they knew that questions about modifing the enum values would pop up sooner or later. > You *can* add elements to an enum type - sort of - by creating a new > type and converting columns. It's ugly, though, and will be hard to get > right when the column of interest is referenced by foreign keys and such. If there's one request to modify a specific ENUM column, other requests will follow because the enum set is not as static as it seems in the first place. So i beg that the only advise should be "change the column to a foreign key construct". Converting columns to new data types is much more overhead anyway. Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL User Group
Andreas 'ads' Scherbaum wrote: > On Tue, 22 Apr 2008 20:58:30 +0800 Craig Ringer wrote: >> D. Dante Lorenso wrote: >>> The problem is that once I create a column in my account table that >>> uses this 'account_type' datatype, I can't seem to change or add to it >>> any more. I want to add a new value or edit/delete an existing one. >>> How do you make changes to an ENUM datatype that is already in use? >> As far as I know ENUM is not well suited to uses where new enumeration >> members may be added later. A lookup table and a foreign key is probably >> better for this sort of use. > I remember the discussions before PG implemented ENUMs at all - some > people voted against this "feature" because they knew that questions > about modifing the enum values would pop up sooner or later. >> You *can* add elements to an enum type - sort of - by creating a new >> type and converting columns. It's ugly, though, and will be hard to get >> right when the column of interest is referenced by foreign keys and such. > If there's one request to modify a specific ENUM column, other requests > will follow because the enum set is not as static as it seems in the > first place. So i beg that the only advise should be "change the column > to a foreign key construct". Converting columns to new data types is > much more overhead anyway. So, the advice here is "don't use ENUM"? I was really hoping that it would be more efficient to not have to do all the foreign keys and joins for tables that may have 4-5 enum types. Just being able to: SELECT * FROM tablename would be nice if my columns contained enums instead of doing: SELECT * FROM tablename, lookuptable WHERE tablename.some_id = lookuptable.some_id Isn't the join more expensive? -- Dante
On Apr 22, 2008, at 1:45 PM, D. Dante Lorenso wrote: > So, the advice here is "don't use ENUM"? I think it's more "Don't use ENUM for a type that you are planning to extend."
On Tue, Apr 22, 2008 at 3:31 PM, D. Dante Lorenso <dante@lorenso.com> wrote:
I agree with others that ENUMs stop looking pretty when you need to modify them...
Here's a thread from recent past where this exact problem was discussed... maybe it'll interest you...
http://archives.postgresql.org/pgsql-general/2007-12/msg01363.php
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device
All,
In the past I have used foreign keys to lookup tables for small lists of values that I now think ENUM could do the job of. I was hoping that by using ENUM, I would avoid having to do joins in my queries, and that I'd be making the data more robust and faster.
I used to have a table for account_status:
A | Active
B | Billing Failed
C | Closed
D | Deactivated
account.acct_type CHAR references account_type.acct_type CHAR
But, now I've converted that to an ENUM:
ACTIVE
BILLING_FAILED
CLOSED
DEACTIVATED
account.acct_type ENUM account_type
The problem is that once I create a column in my account table that uses this 'account_type' datatype, I can't seem to change or add to it any more. I want to add a new value or edit/delete an existing one.
How do you make changes to an ENUM datatype that is already in use?
I agree with others that ENUMs stop looking pretty when you need to modify them...
Here's a thread from recent past where this exact problem was discussed... maybe it'll interest you...
http://archives.postgresql.org/pgsql-general/2007-12/msg01363.php
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
Mail sent from my BlackLaptop device
On Tue, Apr 22, 2008 at 2:45 PM, D. Dante Lorenso <dante@lorenso.com> wrote: > > So, the advice here is "don't use ENUM"? > > I was really hoping that it would be more efficient to not have to do all > the foreign keys and joins for tables that may have 4-5 enum types. > > Just being able to: > > SELECT * > FROM tablename If you use a "lookup table" methodology you still get that. Try this: smarlowe=# create table choices (color text primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "choices_pkey" for table "choices" CREATE TABLE smarlowe=# insert into choices values ('red'),('yellow'),('green'),('blue'); INSERT 0 4 smarlowe=# create table mystuff (id serial primary key, usenam text, mycolor text references choices(color)); NOTICE: CREATE TABLE will create implicit sequence "mystuff_id_seq" for serial column "mystuff.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "mystuff_pkey" for table "mystuff" CREATE TABLE smarlowe=# insert into mystuff (usenam, mycolor) values ('scott','red'); INSERT 0 1 smarlowe=# insert into mystuff (usenam, mycolor) values ('darren','blue'); INSERT 0 1 smarlowe=# insert into mystuff (usenam, mycolor) values ('dan','green'); INSERT 0 1 smarlowe=# insert into mystuff (usenam, mycolor) values ('steve','green'); INSERT 0 1 smarlowe=# insert into mystuff (usenam, mycolor) values ('mike','black'); ERROR: insert or update on table "mystuff" violates foreign key constraint "mystuff_mycolor_fkey" DETAIL: Key (mycolor)=(black) is not present in table "choices". smarlowe=# select * from mystuff; id | usenam | mycolor ----+--------+--------- 1 | scott | red 2 | darren | blue 3 | dan | green 4 | steve | green (4 rows) tada! No enum, and no join. But you can't insert illegal values in mycolor...
On Tue, 22 Apr 2008 15:45:39 -0500 "D. Dante Lorenso" <dante@lorenso.com> wrote: > I was really hoping that it would be more efficient to not have to do > all the foreign keys and joins for tables that may have 4-5 enum > types. > > Just being able to: > > SELECT * > FROM tablename > > would be nice if my columns contained enums instead of doing: > > SELECT * > FROM tablename, lookuptable > WHERE tablename.some_id = lookuptable.some_id > > Isn't the join more expensive? You were using natural keys, the join would not be required. Joshua D. Drake > > -- Dante > -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Scott Marlowe wrote: > On Tue, Apr 22, 2008 at 2:45 PM, D. Dante Lorenso <dante@lorenso.com> wrote: >> So, the advice here is "don't use ENUM"? >> I was really hoping that it would be more efficient to not have to do all >> the foreign keys and joins for tables that may have 4-5 enum types. >> Just being able to: >> SELECT * >> FROM tablename > If you use a "lookup table" methodology you still get that. Try this: > smarlowe=# create table choices (color text primary key); > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "choices_pkey" for table "choices" > CREATE TABLE > smarlowe=# insert into choices values ('red'),('yellow'),('green'),('blue'); > INSERT 0 4 > smarlowe=# create table mystuff (id serial primary key, usenam text, > mycolor text references choices(color)); > NOTICE: CREATE TABLE will create implicit sequence "mystuff_id_seq" > for serial column "mystuff.id" > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "mystuff_pkey" for table "mystuff" > CREATE TABLE > smarlowe=# insert into mystuff (usenam, mycolor) values ('scott','red'); > INSERT 0 1 > smarlowe=# insert into mystuff (usenam, mycolor) values ('darren','blue'); > INSERT 0 1 > smarlowe=# insert into mystuff (usenam, mycolor) values ('dan','green'); > INSERT 0 1 > smarlowe=# insert into mystuff (usenam, mycolor) values ('steve','green'); > INSERT 0 1 > smarlowe=# insert into mystuff (usenam, mycolor) values ('mike','black'); > ERROR: insert or update on table "mystuff" violates foreign key > constraint "mystuff_mycolor_fkey" > DETAIL: Key (mycolor)=(black) is not present in table "choices". > smarlowe=# select * from mystuff; > id | usenam | mycolor > ----+--------+--------- > 1 | scott | red > 2 | darren | blue > 3 | dan | green > 4 | steve | green > (4 rows) > tada! No enum, and no join. But you can't insert illegal values in mycolor... This approach is so old-school, I seem to have overlooked the obvious. Here you've elected to use the foreign key to just control the possible values inserted but not really to look up the value. Seems you are storing the values in text form which goes against all the normalization techniques I've learned in school. I see this might be a problem with storage since you will need to store the TEXT value for every row in the 'mystuff' table instead of just storing the reference to the lookup table as an INTEGER. Over millions of rows, perhaps this would become a concern? What is the general consensus by the community about this approach? Is this de-normalization frowned upon, or is there a performance advantage here that warrants the usage? -- Dante >
On Tue, 2008-04-22 at 16:34 -0500, D. Dante Lorenso wrote: > I see this might be a > problem with storage since you will need to store the TEXT value for > every row in the 'mystuff' table instead of just storing the reference > to the lookup table as an INTEGER. Over millions of rows, perhaps this > would become a concern? It does use additional storage to store the full text value, rather than a fixed-size integer. However, the difference is not much when the average string length is short. If you store an integer reference instead, joins are not necessarily expensive. If the number of distinct values is small (which is the normal use case for ENUM), I would expect the joins to be quite cheap. Beware of running into bad plans however, or making the optimizer work too hard (if you have a lot of other joins, too). I don't think the performance concerns are major, but worth considering if you have millions of rows. > What is the general consensus by the community about this approach? Is > this de-normalization frowned upon, or is there a performance advantage > here that warrants the usage? This is not de-normalization, at all. Normalization is a formal process, and if this were de-normalization, you could find a specific rule that is violated by this approach. Look here: http://en.wikipedia.org/wiki/Third_normal_form If you go to higher normal forms, you will not find any violations there, either. There is nothing about normalization that requires the use of surrogate keys. The approach suggested by Scott Marlowe is normalized as well as being quite natural and simple. I think often this is overlooked as being "too simple", but it's a quite good design in many cases. Regards, Jeff Davis
On Tue, Apr 22, 2008 at 6:11 PM, Jeff Davis <pgsql@j-davis.com> wrote: > If you store an integer reference instead, joins are not necessarily > expensive. If the number of distinct values is small (which is the > normal use case for ENUM), I would expect the joins to be quite cheap. > Beware of running into bad plans however, or making the optimizer work > too hard (if you have a lot of other joins, too). Necessarily being the operative word here. Think about an enum as part of a composite key for example. It's a lot nicer to rely on enum for natural ordering than doing something like a functional index. Anyways, it's pretty easy to extend an enum...you can manually insert an entry into pg_enum (see the relevent docs). Just watch out for oid overlap. One thing currently that is very difficult currently to do is to alter the order of the enum elements. The current state of things is pretty workable though. Scott's color/mystuff example is generally preferred for a lot of cases. I _really_ prefer this to surrogate style enums where you have color_id...this approach makes your database unreadable IMO. A decent hybrid approach which I have been using lately is "char" (not char) where the choices set is reasonably small, well represented by a single character, and the intrinsic ordering property is not too important (where an enum might be better). In many cases though, the pure natural approach is simply the best. The enum though with is intrinsic ordering and more efficient indexing has an important niche however. merlin
On Tue, 22 Apr 2008 15:45:39 -0500 D. Dante Lorenso wrote: > Andreas 'ads' Scherbaum wrote: > > So, the advice here is "don't use ENUM"? Yes. You should/can use ENUM for something like 'gender': male, female, unknown. You don't need to add other values ever (yeah, i skipped some special cases). But if you use ENUM for color names (as example), it's easy to imagine that someone comes around and requests a new color to be added to the list. Here you should use a lookup table and a foreign key instead. It always depends on the situation. The real problem is that people start creating the table with ENUM and "oh, nice, this makes it easy" in mind. But usually things are not as easy as it looks on first sight, so they later run into trouble. Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL User Group
On Wed, Apr 23, 2008 at 12:38:48PM +0200, Andreas 'ads' Scherbaum wrote: > Yes. You should/can use ENUM for something like 'gender': > male, female, unknown. You don't need to add other values ever (yeah, i > skipped some special cases). I was gonna say ! :-) Add hermaphrodite transgender with female phenotype transgender with male phenotype and you should be set from current medical science's point of view ;-) Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Wednesday 23 April 2008 14:10, Karsten Hilbert wrote: > On Wed, Apr 23, 2008 at 12:38:48PM +0200, Andreas 'ads' Scherbaum wrote: > > Yes. You should/can use ENUM for something like 'gender': > > male, female, unknown. You don't need to add other values ever (yeah, i > > skipped some special cases). > > I was gonna say ! :-) > > Add > > hermaphrodite > transgender with female phenotype > transgender with male phenotype > > and you should be set from current medical science's point > of view ;-) > The standard is unknown, male, female, and n/a. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Wed, Apr 23, 2008 at 2:51 PM, Robert Treat <xzilla@users.sourceforge.net> wrote: > On Wednesday 23 April 2008 14:10, Karsten Hilbert wrote: > > On Wed, Apr 23, 2008 at 12:38:48PM +0200, Andreas 'ads' Scherbaum wrote: > > > Yes. You should/can use ENUM for something like 'gender': > > > male, female, unknown. You don't need to add other values ever (yeah, i > > > skipped some special cases). > > > > I was gonna say ! :-) > > > > Add > > > > hermaphrodite > > transgender with female phenotype > > transgender with male phenotype > > > > and you should be set from current medical science's point > > of view ;-) > > > > The standard is unknown, male, female, and n/a. Both unknown and n/a sounds like NULL to me. -- Rob Wultsch wultsch@gmail.com wultsch (aim)
On Wed, Apr 23, 2008 at 12:10 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > On Wed, Apr 23, 2008 at 12:38:48PM +0200, Andreas 'ads' Scherbaum wrote: > > > Yes. You should/can use ENUM for something like 'gender': > > male, female, unknown. You don't need to add other values ever (yeah, i > > skipped some special cases). > I was gonna say ! :-) > > Add > > hermaphrodite > transgender with female phenotype > transgender with male phenotype > > and you should be set from current medical science's point > of view ;-) Actually, hermaphrodite specifies complete male and female genitalia, which is impossible in humans. While various forms of address are available for people born with parts of both male and female genitalia, the most common and easiest is intersex. The folks here: http://www.isna.org/faq/ have a lot more to say about it, and seeing as how many of them ARE intersex, I'd leave it up to them. While most transgender folks prefer to be referred to as their assumed gender, there are some gender queer folks who prefer other forms of address. I would put it that gender is not so easily defined, which makes it a poor choice for enum.
Scott Marlowe wrote: > On Wed, Apr 23, 2008 at 12:10 PM, Karsten Hilbert > <Karsten.Hilbert@gmx.net> wrote: >> On Wed, Apr 23, 2008 at 12:38:48PM +0200, Andreas 'ads' Scherbaum wrote: >> >> > Yes. You should/can use ENUM for something like 'gender': >> > male, female, unknown. You don't need to add other values ever (yeah, i >> > skipped some special cases). >> I was gonna say ! :-) >> >> Add >> >> hermaphrodite >> transgender with female phenotype >> transgender with male phenotype >> >> and you should be set from current medical science's point >> of view ;-) > > Actually, hermaphrodite specifies complete male and female genitalia, > which is impossible in humans. While various forms of address are > available for people born with parts of both male and female > genitalia, the most common and easiest is intersex. > > The folks here: http://www.isna.org/faq/ have a lot more to say about > it, and seeing as how many of them ARE intersex, I'd leave it up to > them. Scott, there's absolutely nothing in that FAQ about their database, let alone whether or not they use ENUMs. (heh) > While most transgender folks prefer to be referred to as their assumed > gender, there are some gender queer folks who prefer other forms of > address. > > I would put it that gender is not so easily defined, which makes it a > poor choice for enum. > Absolutely true. Which is odd, because this example is trotted out whenever there's a thread about ENUMs.
On Wed, 23 Apr 2008 17:18:12 -0600 Scott Marlowe wrote: > I would put it that gender is not so easily defined, which makes it a > poor choice for enum. That's why my original statement had the additional note about special cases. If you write an address book you normally don't want to add information like transgender - just 'male', 'female', 'unknown' seems enough here. If you define a medical database, you have to extend the ENUM values a bit, but even then you know the possible values in advance before creating the ENUM - just that you have some more choices than in your address book. It all depends on the use case, just like ENUM itself. Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL User Group
On Wed, Apr 23, 2008 at 07:46:07PM -0400, brian wrote: > > Absolutely true. Which is odd, because this example is trotted out > whenever there's a thread about ENUMs. I don't think it's odd at all. In my view, the people who think enums are a good datatype for databases are exactly the sorts who'd think that their data is as static as this poor understanding of the vagaries of individuals' sex (gender is a different problem, given its association with social roles) would suggest. The world moves around in unexpected ways. Your data model needs to accommodate itself to the world, because the alternative is not going to happen. A
On Thu, Apr 24, 2008 at 9:39 AM, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > On Wed, Apr 23, 2008 at 07:46:07PM -0400, brian wrote: > > > > Absolutely true. Which is odd, because this example is trotted out > > whenever there's a thread about ENUMs. > > I don't think it's odd at all. In my view, the people who think enums are a > good datatype for databases are exactly the sorts who'd think that their > data is as static as this poor understanding of the vagaries of individuals' > sex (gender is a different problem, given its association with social roles) > would suggest. > > The world moves around in unexpected ways. Your data model needs to > accommodate itself to the world, because the alternative is not going to > happen. Mostly true, but there are systems where certain parts really do tend to be static over very long periods, and for those, I might admit to ENUM being an answer. for instance, in the USAF, all units being worked on in our avionics shop were in one of three categories, Awaiting Maintenance, Awaiting Parts, or In Work. They were abbreviated AWM, AWP, and INW. That was back in the 1980s. I'm willing to bet it hasn't changed since then. For those types of problems, enums make a certain amount of sense, especially if you're tracking thousands of line units being worked on every hour of every day across the US. the small saving in space adds up fast. But that's an artificially constructed set of choices.
Andrew Sullivan wrote: > On Wed, Apr 23, 2008 at 07:46:07PM -0400, brian wrote: >> Absolutely true. Which is odd, because this example is trotted out >> whenever there's a thread about ENUMs. > I don't think it's odd at all. In my view, the people who think enums are a > good datatype for databases are exactly the sorts who'd think that their > data is as static as this poor understanding of the vagaries of individuals' > sex (gender is a different problem, given its association with social roles) > would suggest. Or, here's another way to look at it ... make it easier to modify ENUM datatypes because we all know that you will eventually need that feature whether you males, females, and unknowns think so or not. -- Dante
D. Dante Lorenso wrote: > Andrew Sullivan wrote: >> I don't think it's odd at all. In my view, the people who think >> enums are a >> good datatype for databases are exactly the sorts who'd think that their >> data is as static as this poor understanding of the vagaries of >> individuals' >> sex (gender is a different problem, given its association with social >> roles) >> would suggest. > > Or, here's another way to look at it ... make it easier to modify ENUM > datatypes because we all know that you will eventually need that > feature whether you males, females, and unknowns think so or not. +1
D. Dante Lorenso wrote: > Or, here's another way to look at it ... make it easier to modify ENUM > datatypes because we all know that you will eventually need that feature > whether you males, females, and unknowns think so or not. Agreed. Let's keep in mind that the current ENUM implementation is completely new. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Matthew T. O'Connor wrote: > D. Dante Lorenso wrote: > > Andrew Sullivan wrote: > >> I don't think it's odd at all. In my view, the people who think > >> enums are a > >> good datatype for databases are exactly the sorts who'd think that their > >> data is as static as this poor understanding of the vagaries of > >> individuals' > >> sex (gender is a different problem, given its association with social > >> roles) > >> would suggest. > > > > Or, here's another way to look at it ... make it easier to modify ENUM > > datatypes because we all know that you will eventually need that > > feature whether you males, females, and unknowns think so or not. > > +1 Added to TODO: * Allow adding enumerated values to an existing enumerated data type -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, Apr 24, 2008 at 11:04:10AM -0500, D. Dante Lorenso wrote: > > Or, here's another way to look at it ... make it easier to modify ENUM > datatypes because we all know that you will eventually need that feature > whether you males, females, and unknowns think so or not. Well, heck, why don't you just store your data in one infinitely wide table with 10,000 nullable fields? Then you can complain about the storage or performance implications. If you are devoted to an enum datatype, then you don't want to normalise your data further. With an incredibly small number of exceptions, I'd argue that means one of two things: either it doesn't need normalisation, in which case a relational database is the wrong tool for this job; or else it does need normalisation, but you don't understand how relational databases work well enough to do it properly. [Later: on re-reading this paragraph, I thought of a third possibility: that you're violating the first rule of optimisation.] The first time I encountered them, I thought enums were a filthy, ill-conceived answer to a problem that didn't exist, implemented by people who didn't understand relational databases. With considerably more experience under my belt than then, I say now that my original estimation was too kind. I do see what someone says upthread: if you have a special case where the storage saving adequately offsets the maintenance cost, this might be worth it. But if you've really analysed your case that carefully, and understand the costs (i.e. you won't carp later when changing is expensive) you already know what you're doing. Otherwise, don't use enums. They should be marked (like char(), IMO) in the manual as, "Warning: you probably don't want to use this datatype. Go think some more." A
On Thu, Apr 24, 2008 at 10:22 AM, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > > Otherwise, don't use enums. They should be marked (like char(), IMO) in the > manual as, "Warning: you probably don't want to use this datatype. Go think > some more." Good point. I think enums are kind like arrays. Most of the time they're a bad idea. sometimes they're an ok idea, and very very rarely, they are exactly what you need to solve a certain problem.
On Thu, Apr 24, 2008 at 12:22 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > The first time I encountered them, I thought enums were a filthy, > ill-conceived answer to a problem that didn't exist, implemented by people > who didn't understand relational databases. With considerably more > experience under my belt than then, I say now that my original estimation > was too kind. I think you're being a little too hard on enums here. I was actually in the anti-enum camp until it was demonstrated to me (and in my own testing) that using enum for natural ordering vs. fielding the ordering of the type out to a join is can be a huge win in such cases where it is important. Relational theory is all well and good, but in practical terms things like record size, index size, and query performance are important. I'll admit that if computers were infinitely fast, I'd probably use enums less, although I still like them for things like male/female. I think they have a place. The difficulty of manipulating enums is simply a shortcoming of the implementation which could presumably be solved at some later point. merlin
Merlin Moncure wrote: > On Thu, Apr 24, 2008 at 12:22 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote: >> The first time I encountered them, I thought enums were a filthy, >> ill-conceived answer to a problem that didn't exist, implemented by people >> who didn't understand relational databases. With considerably more >> experience under my belt than then, I say now that my original estimation >> was too kind. > > I think you're being a little too hard on enums here. I was actually > in the anti-enum camp until it was demonstrated to me (and in my own > testing) that using enum for natural ordering vs. fielding the > ordering of the type out to a join is can be a huge win in such cases > where it is important. Relational theory is all well and good, but in > practical terms things like record size, index size, and query > performance are important. Uhm. Sorry what? Can you demonstrate this particular use? When I first saw discussion about enumns I kinda hoped they will be implemented as kind of macro to really map to a table. But here you go. I'm still looking for a good example to demonstrate the usefullness of enums (same for arrays for that matter) Cheers Tino
ajs@crankycanuck.ca (Andrew Sullivan) writes: > On Wed, Apr 23, 2008 at 07:46:07PM -0400, brian wrote: >> >> Absolutely true. Which is odd, because this example is trotted out >> whenever there's a thread about ENUMs. > > I don't think it's odd at all. In my view, the people who think enums are a > good datatype for databases are exactly the sorts who'd think that their > data is as static as this poor understanding of the vagaries of individuals' > sex (gender is a different problem, given its association with social roles) > would suggest. > > The world moves around in unexpected ways. Your data model needs to > accommodate itself to the world, because the alternative is not going to > happen. By the same token, a limited model, at least in this area, frequently is sufficient to cover the set of ways in which legal systems recognize and consider gender when constructing legislation. And it's not evident that the simplification is a dramatic oversimplification that causes a great deal of legal failures to the extent to which it mandates that every system *needs* to track sex in a more detailed fashion than (male, female, unknown, n/a). I'm not sure, for instance, that I actually know what the word "phenotype" means, and there's reason to imagine I might be somewhat more "generally literate" than average. If I'm not sure, there seems little reason to expect that people with varying levels of comprehension necessarily be able to choose from a more elaborate set of options with accuracy. After spending literal billions of dollars on security efforts, it doesn't appear that security infrastructures that have *enormous* incentive have had much luck successfully identifying who is a terrorist and who is not; they consistently have extraordinary levels of "Type II" (false positive) reporting errors. -- let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;; http://cbbrowne.com/info/linuxdistributions.html "There is no reason anyone would want a computer in their home". -- Ken Olson, Pres. and founder of Digital Equipment Corp. 1977
Bruce Momjian wrote: > Matthew T. O'Connor wrote: >> D. Dante Lorenso wrote: >>> Or, here's another way to look at it ... make it easier to modify ENUM >>> datatypes because we all know that you will eventually need that >>> feature whether you males, females, and unknowns think so or not. >> +1 > > Added to TODO: > > * Allow adding enumerated values to an existing enumerated data > type And removing values if possible (fail if values are in use?).
Alvaro Herrera wrote: > D. Dante Lorenso wrote: >> Or, here's another way to look at it ... make it easier to modify ENUM >> datatypes because we all know that you will eventually need that feature >> whether you males, females, and unknowns think so or not. > Agreed. Let's keep in mind that the current ENUM implementation is > completely new. Here's a use case that I doubt too many are aware of ... what's unique about ENUM is that the data for the ENUM becomes part of the schema of a database. This is actually something I like very much. I have systems where the DEV and PROD databases are separate and must be synchronized when pushing out new features. I currently use non-free tools to synchronize just the schemas for both databases. Often if a new row is added to a lookup table, that row doesn't make it to the QA or PROD databases because it's part of the data of a table and is not part of the schema. For data (like ENUM data) that should be consistent across databases, it helps if it gets deployed with the schema so that lookups will succeed properly. -- Dante
Matthew T. O'Connor wrote: > Bruce Momjian wrote: > > Matthew T. O'Connor wrote: > >> D. Dante Lorenso wrote: > >>> Or, here's another way to look at it ... make it easier to modify ENUM > >>> datatypes because we all know that you will eventually need that > >>> feature whether you males, females, and unknowns think so or not. > >> +1 > > > > Added to TODO: > > > > * Allow adding enumerated values to an existing enumerated data > > type > > And removing values if possible (fail if values are in use?). Yep, updated: * Allow adding/removing enumerated values to an existing enumerated data -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Yep, updated: > * Allow adding/removing enumerated values to an existing enumerated > data Renaming an existing value might be interesting too (and would be far easier than either of the above). regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Yep, updated: > > * Allow adding/removing enumerated values to an existing enumerated > > data > > Renaming an existing value might be interesting too (and would be far > easier than either of the above). TODO updated: * Allow adding/renaming/removing enumerated values to an existing enumerated data type -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, 24 Apr 2008 14:02:07 -0400 Merlin Moncure wrote: > On Thu, Apr 24, 2008 at 12:22 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > > The first time I encountered them, I thought enums were a filthy, > > ill-conceived answer to a problem that didn't exist, implemented by people > > who didn't understand relational databases. With considerably more > > experience under my belt than then, I say now that my original estimation > > was too kind. > > I think you're being a little too hard on enums here. No, i don't think, Andrew is too hard here. As said before: in some special cases enum makes sense, like you said: > I was actually in the anti-enum camp until it was demonstrated to me > (and in my own testing) that using enum for natural ordering vs. fielding the > ordering of the type out to a join is can be a huge win in such cases > where it is important. But the problem are really not this special cases, the real problem are all the people who don't understand about enum and just try to use it because "it seems to fit best" - unless they run into problems. So we have a good tool in your hands, we just have to tell/teach the people, how to use it and especially when not to use it. Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL User Group
On Wed, Apr 23, 2008 at 05:51:48PM -0400, Robert Treat wrote: > > Add > > > > hermaphrodite > > transgender with female phenotype > > transgender with male phenotype > > > > and you should be set from current medical science's point > > of view ;-) > > > > The standard is unknown, male, female, and n/a. Apparently the standard doesn't care about reality. But it all depends on the circumstantial needs. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Wed, Apr 23, 2008 at 05:18:12PM -0600, Scott Marlowe wrote: > > hermaphrodite > > transgender with female phenotype > > transgender with male phenotype > the most common and easiest is intersex. The political correctness of any one term changes over time. The above list should close the circle if phenotype and genotype are taken into account: male (g: m, p: m) female (g: f, p: m) hermaphrodite or intersex (g: f+m, p: f+m) (note this also includes mosaic -- extremely rare) transgender f pheno (g: f+m, p: f) transgender m pheno (g: f+m, p: f) > I would put it that gender is not so easily defined, which makes it a > poor choice for enum. That surely is the last word on it also from my point of view. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Wed, Apr 23, 2008 at 07:46:07PM -0400, brian wrote: >> I would put it that gender is not so easily defined, which makes it a >> poor choice for enum. > > Absolutely true. Which is odd, because this example is trotted out > whenever there's a thread about ENUMs. So it's good we got it in the archives now :-) Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Thu, Apr 24, 2008 at 10:25:44AM +0200, Andreas 'ads' Scherbaum wrote: > If you define a medical database, you have to extend the ENUM values a > bit, but even then you know the possible values in advance Considering scary genetic experiments I wouldn't even be sure about that. > It all depends on the use case, just like ENUM itself. Absolutely. Maybe another outcome of this discussion might be that PostgreSQL documentation - as thorough as the whole project is set up - should never use gender as an example for ENUM (it doesn't now in 8.3 docs btw). Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 24/apr/08, at 22:15, Matthew T. O'Connor wrote: > Bruce Momjian wrote: >> Matthew T. O'Connor wrote: >>> D. Dante Lorenso wrote: >>>> Or, here's another way to look at it ... make it easier to >>>> modify ENUM datatypes because we all know that you will >>>> eventually need that feature whether you males, females, and >>>> unknowns think so or not. >>> +1 >> Added to TODO: >> * Allow adding enumerated values to an existing enumerated data >> type > > And removing values if possible (fail if values are in use?). And reorder them, too. -- Giorgio Valoti
D. Dante Lorenso wrote: > Alvaro Herrera wrote: >> D. Dante Lorenso wrote: >>> Or, here's another way to look at it ... make it easier to modify >>> ENUM datatypes because we all know that you will eventually need >>> that feature whether you males, females, and unknowns think so or not. >> Agreed. Let's keep in mind that the current ENUM implementation is >> completely new. > > Here's a use case that I doubt too many are aware of ... what's unique > about ENUM is that the data for the ENUM becomes part of the schema of a > database. > > This is actually something I like very much. I have systems where the > DEV and PROD databases are separate and must be synchronized when > pushing out new features. I currently use non-free tools to synchronize > just the schemas for both databases. > > Often if a new row is added to a lookup table, that row doesn't make it > to the QA or PROD databases because it's part of the data of a table and > is not part of the schema. For data (like ENUM data) that should be > consistent across databases, it helps if it gets deployed with the > schema so that lookups will succeed properly. Well since its configuration and not payload its nothing wrong with just having the data in your repository as well and load it every time when you roll out a new release. Cheers Tino
On Apr 25, 2008, at 4:49 AM, Giorgio Valoti wrote: > And reorder them, too. Why would you want to reorder an enum? It seems to me the point of them is to hold a small list of valid values. The order the list is described in surely should be irrelevant?
On Apr 25, 2008, at 8:19 AM, Ben Chobot wrote: > > On Apr 25, 2008, at 4:49 AM, Giorgio Valoti wrote: > >> And reorder them, too. > > Why would you want to reorder an enum? It seems to me the point of > them is to hold a small list of valid values. The order the list is > described in surely should be irrelevant? One of the main charms of using an enum for some things is that it has a natural ordering. Critical > Urgent > Normal > Low > Marginal It's not the only reason to use an enum, but for me it'd be the most compelling. Cheers, Steve
Sync some database tables, but not others ... WAS Re: How to modify ENUM datatypes?
From
"D. Dante Lorenso"
Date:
Tino Wildenhain wrote: > D. Dante Lorenso wrote: >> Alvaro Herrera wrote: >>> D. Dante Lorenso wrote: >>>> Or, here's another way to look at it ... make it easier to modify >>>> ENUM datatypes because we all know that you will eventually need >>>> that feature whether you males, females, and unknowns think so or not. >>> Agreed. Let's keep in mind that the current ENUM implementation is >>> completely new. >> >> Here's a use case that I doubt too many are aware of ... what's unique >> about ENUM is that the data for the ENUM becomes part of the schema of >> a database. >> >> This is actually something I like very much. I have systems where the >> DEV and PROD databases are separate and must be synchronized when >> pushing out new features. I currently use non-free tools to >> synchronize just the schemas for both databases. >> >> Often if a new row is added to a lookup table, that row doesn't make >> it to the QA or PROD databases because it's part of the data of a >> table and is not part of the schema. For data (like ENUM data) that >> should be consistent across databases, it helps if it gets deployed >> with the schema so that lookups will succeed properly. > > Well since its configuration and not payload its nothing wrong with just > having the data in your repository as well and load it every time when > you roll out a new release. I have a convenient 3rd party tool that will "sync this database schema with that database schema". I just run the tool, accept the discovered changes and voila, I've deployed the database changes to the next environment. I haven't written any custom scripts to import content into specific tables. As I see it, that would be a little complex also since you'd have to find the difference between dev and prod and only push the changes across (additions or deletes). For potentially hundreds of small ENUM-like lookup tables, this seems quite tedious ... like the kind of thing a DB sync tool should handle for you ;-) Perhaps there is a 3rd party tool that would not only sync the DB schema, but could add a list of tables which must also have their data synced? Something that could remember that these 50 tables are "constant/deployable" and must be mirrored as-is to the other database while these other tables store environment-specific data and should not be synced. Anyone know of such a tool? -- Dante > > Cheers > Tino >
On Thu, Apr 24, 2008 at 3:01 PM, Tino Wildenhain <tino@wildenhain.de> wrote: > Merlin Moncure wrote: > > I think you're being a little too hard on enums here. I was actually > > in the anti-enum camp until it was demonstrated to me (and in my own > > testing) that using enum for natural ordering vs. fielding the > > ordering of the type out to a join is can be a huge win in such cases > > where it is important. Relational theory is all well and good, but in > > practical terms things like record size, index size, and query > > performance are important. > > > > Uhm. Sorry what? Can you demonstrate this particular use? > When I first saw discussion about enumns I kinda hoped they > will be implemented as kind of macro to really map to a table. > But here you go. I'm still looking for a good example to > demonstrate the usefullness of enums (same for arrays for that > matter) You must not be aware that enums are naturally ordered to make that statement. Suppose your application needs to order a large table by a,b,c where b is the an 'enum' type of data. With an enum, the order is inlined into the key order, otherwise it's out of line, meaning your you key is larger (enum is 4 bytes, varchar is guaranteed to be larger), and you need to join out to get the ordering position, use a functional index, or cache it in the main table. I agree with disagree with you on arrays. I think they are generally a bad idea in terms of using them as a column type. However they are useful passing data to/from functions and back/forth from the client. merlin
On Friday 25 April 2008 14:56, Merlin Moncure wrote: > On Thu, Apr 24, 2008 at 3:01 PM, Tino Wildenhain <tino@wildenhain.de> wrote: > > Merlin Moncure wrote: > > > I think you're being a little too hard on enums here. I was actually > > > in the anti-enum camp until it was demonstrated to me (and in my own > > > testing) that using enum for natural ordering vs. fielding the > > > ordering of the type out to a join is can be a huge win in such cases > > > where it is important. Relational theory is all well and good, but in > > > practical terms things like record size, index size, and query > > > performance are important. > > > > Uhm. Sorry what? Can you demonstrate this particular use? > > When I first saw discussion about enumns I kinda hoped they > > will be implemented as kind of macro to really map to a table. > > But here you go. I'm still looking for a good example to > > demonstrate the usefullness of enums (same for arrays for that > > matter) > > You must not be aware that enums are naturally ordered to make that > statement. Suppose your application needs to order a large table by > a,b,c where b is the an 'enum' type of data. With an enum, the order > is inlined into the key order, otherwise it's out of line, meaning > your you key is larger (enum is 4 bytes, varchar is guaranteed to be > larger), and you need to join out to get the ordering position, use a > functional index, or cache it in the main table. > I think one of the best examples of this is the movie rating system (which I blogged about at http://people.planetpostgresql.org/xzilla/index.php?/archives/320-PostgreSQL-8.3-Features-Enum-Datatype.html ) It's a good example of setting pre-defined values that really can leverage the enum types custom ordering. It also showcases the idea of data definitions that "should never change", but that do changes every half dozen years or so. Now you can argue that since it is expected that the ratings might change in some way every few years that an enum type is not a good choice for this, but I feel like some type of counter-argument is that this is probably longer than one would expect thier database software to last. :-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Sat, Apr 26, 2008 at 08:33:28PM -0400, Robert Treat wrote: > enum types custom ordering. It also showcases the idea of data definitions > that "should never change", but that do changes every half dozen years or so. > Now you can argue that since it is expected that the ratings might change in > some way every few years that an enum type is not a good choice for this, but > I feel like some type of counter-argument is that this is probably longer > than one would expect thier database software to last. :-) I think that if you are building software on the premise that it's only going to last five years, you oughta have a look around on the Internet again. Or think about why banks spent the money they did a few years back poring over ancient code making sure that two-digit year representations weren't in use. You can _of course_ make this sort of trade-off: the cost of the upgrade might be worth the natural ordering and boost in performance. But that was part of my point when noting that enums oughta come with a warning (and why I compared them to char()). I'm not arguing that they're completely useless; just that, like any oddly specialised tool, they require careful use. A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
xzilla@users.sourceforge.net (Robert Treat) writes: > I feel like some type of counter-argument is that this is probably longer > than one would expect thier database software to last. :-) That has the counterargument that if the database software works, it's likely to get used for longer than one would expect. I don't think I have ever seen a case where DB-based software got replaced *earlier* than planned. -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://cbbrowne.com/info/emacs.html Rules of the Evil Overlord #172. "I will allow guards to operate under a flexible work schedule. That way if one is feeling sleepy, he can call for a replacement, punch out, take a nap, and come back refreshed and alert to finish out his shift. <http://www.eviloverlord.com/>
Chris Browne wrote: > xzilla@users.sourceforge.net (Robert Treat) writes: > >> I feel like some type of counter-argument is that this is probably longer >> than one would expect thier database software to last. :-) >> > > That has the counterargument that if the database software works, it's > likely to get used for longer than one would expect. > > I don't think I have ever seen a case where DB-based software got > replaced *earlier* than planned. > I have - but only where it only worked if you adopt a very limited definition of "worked". When management's paid for something and they want to deploy it despite warnings that it's just not going to do the job these things can happen, and a rush project to replace the system can arise when it becomes clear (to management) just how broken the system really is. I've been lucky enough not to have to directly experience such problems, but I know a couple of people who've been saddled with implementing, then rapidly replacing, monster-from-the-deep database systems. "VB6 and IBM UniVerse? Why not? ..." I've hacked together small web apps in a couple of days that've gone on to see six years (and ongoing) of heavy use at my work - because they do the job, and because I haven't had the time or cause to replace them with something cleaner. Every couple of years I have to go back and fix something when a mistake in the code bites me, but all in all they've worked amazingly well. Given that I'd have to agree that it's a good idea to assume your database software will live on well beyond your expectations, and it's worth considering how you'll feel when someone calls you in 5 years with an issue with some code you haven't touched since you wrote it as a throwaway tool. The call will, of course, be urgent, and involve either breakage that must be fixed in 10 minutes for some critical business process (that you didn't know they even used your tool for) to continue, or will be to inform you that you have one day to make major changes they could've told you about six weeks ago if they felt like it. So ... it's well worth considering the long term now. Unfortunately I speak from recent experience - and my beginner perl+MySQL code was NOT designed for long term flexibility and robustness. *shudder*. -- Craig Ringer
Hi, Merlin Moncure wrote: > On Thu, Apr 24, 2008 at 3:01 PM, Tino Wildenhain <tino@wildenhain.de> wrote: >> Merlin Moncure wrote: >>> I think you're being a little too hard on enums here. I was actually >>> in the anti-enum camp until it was demonstrated to me (and in my own >>> testing) that using enum for natural ordering vs. fielding the >>> ordering of the type out to a join is can be a huge win in such cases >>> where it is important. Relational theory is all well and good, but in >>> practical terms things like record size, index size, and query >>> performance are important. >>> >> Uhm. Sorry what? Can you demonstrate this particular use? >> When I first saw discussion about enumns I kinda hoped they >> will be implemented as kind of macro to really map to a table. >> But here you go. I'm still looking for a good example to >> demonstrate the usefullness of enums (same for arrays for that >> matter) > > You must not be aware that enums are naturally ordered to make that > statement. Suppose your application needs to order a large table by > a,b,c where b is the an 'enum' type of data. With an enum, the order > is inlined into the key order, otherwise it's out of line, meaning > your you key is larger (enum is 4 bytes, varchar is guaranteed to be > larger), and you need to join out to get the ordering position, use a > functional index, or cache it in the main table. I see, but couldn't you just use int in this case? And map only when you need the values for display (usually you want it localized anyway) > I agree with disagree with you on arrays. I think they are generally > a bad idea in terms of using them as a column type. However they are > useful passing data to/from functions and back/forth from the client. Yes of course, I thought of that (wondering why we can't use value expressions everywhere) Tino
On Sat, 2008-04-26 at 20:33 -0400, Robert Treat wrote: > I think one of the best examples of this is the movie rating system (which I > blogged about at > http://people.planetpostgresql.org/xzilla/index.php?/archives/320-PostgreSQL-8.3-Features-Enum-Datatype.html > ) > > It's a good example of setting pre-defined values that really can leverage the > enum types custom ordering. It also showcases the idea of data definitions > that "should never change", but that do changes every half dozen years or so. > Now you can argue that since it is expected that the ratings might change in > some way every few years that an enum type is not a good choice for this, but > I feel like some type of counter-argument is that this is probably longer > than one would expect thier database software to last. :-) > Let's say you have ratings A, B, and D for 5 years, and then you add rating C between B and D. If you have a constant stream of movies that must be reviewed, then the addition of a new rating will necessarily take some fraction of the movies away from at least one of the old ratings. In that case, is an old B really equal to a new B? Similar concerns apply to other changes in ENUMs, and for that matter, they apply to the FK design, as well. I would say the *actual* rating is the combination of the rating name, and the version of the standards under which it was rated. Regards, Jeff Davis
On Monday 28 April 2008 17:35, Jeff Davis wrote: > On Sat, 2008-04-26 at 20:33 -0400, Robert Treat wrote: > > I think one of the best examples of this is the movie rating system > > (which I blogged about at > > http://people.planetpostgresql.org/xzilla/index.php?/archives/320-Postgre > >SQL-8.3-Features-Enum-Datatype.html ) > > > > It's a good example of setting pre-defined values that really can > > leverage the enum types custom ordering. It also showcases the idea of > > data definitions that "should never change", but that do changes every > > half dozen years or so. Now you can argue that since it is expected that > > the ratings might change in some way every few years that an enum type is > > not a good choice for this, but I feel like some type of counter-argument > > is that this is probably longer than one would expect thier database > > software to last. :-) > > Let's say you have ratings A, B, and D for 5 years, and then you add > rating C between B and D. > > If you have a constant stream of movies that must be reviewed, then the > addition of a new rating will necessarily take some fraction of the > movies away from at least one of the old ratings. In that case, is an > old B really equal to a new B? > > Similar concerns apply to other changes in ENUMs, and for that matter, > they apply to the FK design, as well. > > I would say the *actual* rating is the combination of the rating name, > and the version of the standards under which it was rated. > *You* would say that, but typically movie ratings are not adjusted when a new rating comes out. For good examples of this, go back and look at 70's era movies (cowboy movies, war movies, etc...) that are G rated, but have a lot of people being shot/killed on-screen, something which would give you an automatic PG rating today. (There are similar issues with PG/R movies in the 80's, typically focused on violence and drug use, before the PG-13 rating came out). -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Monday 28 April 2008 10:28, Andrew Sullivan wrote: > On Sat, Apr 26, 2008 at 08:33:28PM -0400, Robert Treat wrote: > > enum types custom ordering. It also showcases the idea of data > > definitions that "should never change", but that do changes every half > > dozen years or so. Now you can argue that since it is expected that the > > ratings might change in some way every few years that an enum type is not > > a good choice for this, but I feel like some type of counter-argument is > > that this is probably longer than one would expect thier database > > software to last. :-) > > I think that if you are building software on the premise that it's > only going to last five years, you oughta have a look around on the > Internet again. Or think about why banks spent the money they did a > few years back poring over ancient code making sure that two-digit > year representations weren't in use. > If one were to have built something on postgresql 5 years ago, they would have had to do it on 7.3. Whenever anyone posts a problem on 7.3, the first thing people do now days is jump up and down waving thier arms about while exclaiming how quickly they should upgrade. While I am certain there are even older versions of postgres still running in production out there, I'd have to say that the core developers for this project do not release software with the expectation that you will use if for more than 5 years. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat wrote: > If one were to have built something on postgresql 5 years ago, they would have > had to do it on 7.3. Whenever anyone posts a problem on 7.3, the first thing > people do now days is jump up and down waving thier arms about while > exclaiming how quickly they should upgrade. [snip] > I'd have to > say that the core developers for this project do not release software > with the expectation that you will use if for more than 5 years. From what I've seen on the list so far upgrades are advised precisely because somebody has had to ask for help with a problem - one that's usually resolved in newer versions. Doubly so because the issues raised are usually performance related or are caused by limitations in 7.3 (or whatever they're running). If upgrading solves a problem that somebody is asking for help with, then advising the user to upgrade makes sense. Consider operating systems. An OS vendor might expect a given version to be used for five years or more, but if you ask them for help because you're encountering awful performance with 2TB disks or high CPU load on 10GigE networks they're quite likely to advise you to upgrade the OS to a version with enhancements that solve your problems. That says nothing about the people out there still using 7.3 and similar without problems, running well within its capabilities and happy with what it's doing. I doubt many people would advise them to upgrade - at least not in a hurry and not with any jumping and hand-waving. I often see responses along the lines of "if you're using 8.3 then just do <blah> otherwise you'll need to...." - so there doesn't appear to be any assumption that the first step must be to upgrade to the latest version. My impression from using PostgreSQL is that people using old versions are taken seriously. Data corruption, crash and security bug fixes get applied to very old versions. For example, 7.3.21 was released on Jan 2008, and includes several fixes: http://www.postgresql.org/docs/current/static/release-7-3-21.html Given that 7.3 was released in late 2002: http://www.postgresql.org/docs/current/static/release-7-3.html I think that's pretty good myself. Given that PostgreSQL upgrades aren't trivial, it's important to have these ongoing releases for older versions. It's great to see that need so well met.
On Wed, Apr 30, 2008 at 10:19:25AM -0400, Robert Treat wrote: > exclaiming how quickly they should upgrade. While I am certain there are even > older versions of postgres still running in production out there, I'd have to > say that the core developers for this project do not release software with > the expectation that you will use if for more than 5 years. You're equivocating here. The PostgreSQL project is under active development, and there are supported and unsupported versions. Moreover, it's not really just an application; it's more like application infrastructure. The database engine one uses for an application is more like a shared library, from the application's point of view, than it is like another application. Most application software doesn't remain under active development once delivered. You write it to the point where it works, and then when it is doing the job expected, it starts to be used. There is no maintenance, and there are no opportunities for new nifty features. This is _especially_ true of bespoke software, which is why most of it is so awful when you look at it: there's no incentive at all to make the code maintainable, because it will never get any maintenance anyway. Think of the difference between making durable goods like washing machines or automobiles, as compared to more or less disposable goods like kitchen gadgets or, these days, toasters and blenders. The former either come with or have available for them spare parts, schematic documents, and long warranties. The latter come with, if you're lucky, a warranty of a year and warnings that there are no servicable parts inside. Yet people expect to use their toasters and kitchen gadgets for years, and are annoyed if they fail after just a year. A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
On Wed, 30 Apr 2008, Robert Treat wrote: > Whenever anyone posts a problem on 7.3, the first thing people do now > days is jump up and down waving thier arms about while exclaiming how > quickly they should upgrade. While I am certain there are even older > versions of postgres still running in production out there, I'd have to > say that the core developers for this project do not release software > with the expectation that you will use if for more than 5 years. You could easily make a case that 7.3 wasn't quite mature enough overall to be useful for 5 years. There's little reason to keep pumping support effort into something with unfixable flaws. I know when I was using 7.4 heavily, I never felt like that was something I could keep going for that long; the VACUUM issues in particular really stuck out as something I wouldn't be likely to handle on future hardware having larger databases. 8.1, on the other hand, is the first release I thought you could base a long-term effort on, and 8.2 and 8.3 have moved further in that direction. 8.1 has been out for 2.5 years now, and it seems like it's got plenty of useful left in it still (except on Windows). The improvements in 8.2 and 8.3 are significant but not hugely important unless you're suffering performance issues. Compare with 7.3, which came out at the end of 2002. By 2.5 years after that, the project was well into 8.0, which was clearly a huge leap. PITR, tablespaces, whole new buffer strategy, these are really fundamental and compelling rather than the more incremental improvements coming out nowadays. (Obligatory Oracle comparison: for customers with standard support levels, Oracle 8.1 was EOL'd after slightly more than 4 years. It wasn't until V9 that they pushed that to 5 years) -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Wednesday 30 April 2008 11:00, Craig Ringer wrote: > Robert Treat wrote: > > If one were to have built something on postgresql 5 years ago, they would > > have had to do it on 7.3. Whenever anyone posts a problem on 7.3, the > > first thing people do now days is jump up and down waving thier arms > > about while exclaiming how quickly they should upgrade. > > [snip] > > > I'd have to > > say that the core developers for this project do not release software > > with the expectation that you will use if for more than 5 years. > <snip> > That says nothing about the people out there still using 7.3 and similar > without problems, running well within its capabilities and happy with > what it's doing. I doubt many people would advise them to upgrade - at > least not in a hurry and not with any jumping and hand-waving. > <snip> > My impression from using PostgreSQL is that people using old versions > are taken seriously. Data corruption, crash and security bug fixes get > applied to very old versions. For example, 7.3.21 was released on Jan > 2008, and includes several fixes: > > http://www.postgresql.org/docs/current/static/release-7-3-21.html > from those very release notes "This is expected to be the last PostgreSQL release in the 7.3.X series. Users are encouraged to update to a newer release branch soon." If you are on any version of 7.3, the official response is "you need to upgrade to a newer major version" regardless of your problems. You're overlooking data-loss level bugs that can bite people even if they aren't currently suffering from any issues. And again, if you do the math, any install before 2008-11-17 would have been on 7.3, which is less than 5 years. Or, looking forward, I'm not expecting 7.4 will be supported beyond 2010 (there have already been calls to stop supporting it for some time) which is what would be required if we really have an expectation of support for more than 5 years. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Thursday 01 May 2008 01:30, Greg Smith wrote: > On Wed, 30 Apr 2008, Robert Treat wrote: > > Whenever anyone posts a problem on 7.3, the first thing people do now > > days is jump up and down waving thier arms about while exclaiming how > > quickly they should upgrade. While I am certain there are even older > > versions of postgres still running in production out there, I'd have to > > say that the core developers for this project do not release software > > with the expectation that you will use if for more than 5 years. > > You could easily make a case that 7.3 wasn't quite mature enough overall > to be useful for 5 years. There's little reason to keep pumping support > effort into something with unfixable flaws. I know when I was using 7.4 > heavily, I never felt like that was something I could keep going for that > long; the VACUUM issues in particular really stuck out as something I > wouldn't be likely to handle on future hardware having larger databases. > > 8.1, on the other hand, is the first release I thought you could base a > long-term effort on, and 8.2 and 8.3 have moved further in that direction. > 8.1 has been out for 2.5 years now, and it seems like it's got plenty of > useful left in it still (except on Windows). The improvements in 8.2 and > 8.3 are significant but not hugely important unless you're suffering > performance issues. > > Compare with 7.3, which came out at the end of 2002. By 2.5 years after > that, the project was well into 8.0, which was clearly a huge leap. > PITR, tablespaces, whole new buffer strategy, these are really fundamental > and compelling rather than the more incremental improvements coming out > nowadays. > This all sounds nice, but I don't see any movement from the project to increase community commitment to 5 years for any release, so I think it's all moot. > (Obligatory Oracle comparison: for customers with standard support > levels, Oracle 8.1 was EOL'd after slightly more than 4 years. It wasn't > until V9 that they pushed that to 5 years) > And even that isn't full support. IIRC Oracle certified applications can only be done within the first 3 years of the product. I think there are other scenarios under 5 years as well. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes: > This all sounds nice, but I don't see any movement from the project to > increase community commitment to 5 years for any release, so I think it's all > moot. "Movement"? We did in fact support 7.3 for five years. Other than the special case of deciding to obsolete pre-8.2 Windows ports, I don't see anything on the horizon that would cause us to obsolete the current releases earlier. If anything, I foresee pressure to support the latest releases longer than that, since as Greg said, they are more credible long-term prospects than 7.x ever was. regards, tom lane
Robert Treat <xzilla@users.sourceforge.net> writes: > And again, if you do the math, any install before 2008-11-17 would have been > on 7.3, which is less than 5 years. I'm not sure how you're doing the math, but my copy of the release notes dates 7.3 as 2002-11-27 and 7.3.21 as 2008-01-07, which makes it five years plus that we provided bug-fix releases for 7.3. > Or, looking forward, I'm not expecting > 7.4 will be supported beyond 2010 (there have already been calls to stop > supporting it for some time) which is what would be required if we really > have an expectation of support for more than 5 years. 7.4 was released 2003-11-17, so I think that it will very likely get obsoleted at the end of 2008. regards, tom lane
On Thursday 01 May 2008 13:40, Tom Lane wrote: > Robert Treat <xzilla@users.sourceforge.net> writes: > > And again, if you do the math, any install before 2008-11-17 would have > > been on 7.3, which is less than 5 years. > > I'm not sure how you're doing the math, but my copy of the release notes > dates 7.3 as 2002-11-27 and 7.3.21 as 2008-01-07, which makes it five > years plus that we provided bug-fix releases for 7.3. > The whole thing started with "If I were to have installed postgres 5 years ago", which would be 2003-05-01, then I would not have gotten 5 years of support from that system. Essentially that statement is true of any install up to the 7.4 release. > > Or, looking forward, I'm not expecting > > 7.4 will be supported beyond 2010 (there have already been calls to stop > > supporting it for some time) which is what would be required if we really > > have an expectation of support for more than 5 years. > > 7.4 was released 2003-11-17, so I think that it will very likely get > obsoleted at the end of 2008. > If that's the case, it'd be nice to get an official statement of that now. :-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes: > On Thursday 01 May 2008 13:40, Tom Lane wrote: >> I'm not sure how you're doing the math, but my copy of the release notes >> dates 7.3 as 2002-11-27 and 7.3.21 as 2008-01-07, which makes it five >> years plus that we provided bug-fix releases for 7.3. > The whole thing started with "If I were to have installed postgres 5 years > ago", which would be 2003-05-01, then I would not have gotten 5 years of > support from that system. Essentially that statement is true of any install > up to the 7.4 release. I have never heard of anyone measuring product support lifespans from any point other than the original release date. If you want to define it in some random other fashion, that's your privilege, but it doesn't change how I'm going to think about it. regards, tom lane
xzilla@users.sourceforge.net (Robert Treat) writes: > On Thursday 01 May 2008 13:40, Tom Lane wrote: >> 7.4 was released 2003-11-17, so I think that it will very likely get >> obsoleted at the end of 2008. >> > If that's the case, it'd be nice to get an official statement of that now. :-) People have been making noises suggesting the idea already; I expect that the flurry of counterarguments will start diminishing at that point. -- let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" [name;tld];; http://www3.sympatico.ca/cbbrowne/sap.html "My dog appears to require more PM than my car, although he also seems to be cheaper to service." -- GSB
On Thu, May 1, 2008 at 3:57 PM, Chris Browne <cbbrowne@acm.org> wrote: > xzilla@users.sourceforge.net (Robert Treat) writes: > > On Thursday 01 May 2008 13:40, Tom Lane wrote: > > >> 7.4 was released 2003-11-17, so I think that it will very likely get > >> obsoleted at the end of 2008. > >> > > If that's the case, it'd be nice to get an official statement of that now. :-) > > People have been making noises suggesting the idea already; I expect > that the flurry of counterarguments will start diminishing at that > point. Maybe I'm some crazy, radical DBA, but I've never had a version of pgsql get EOLed out from underneath me. I migrated from 7.4 to 8.1 right around the time 8.2 came out then upgraded to 8.2 around 6 months later. Where I work now we are looking at migrating from 8.1 to 8.2 or 8.3 (depending on whether or not we have the man power to fix a few issues with type coercion, our app, and 8.3) These aren't "the DBA got a wild hair and just had to upgrade" upgrades. Each time I've migrated it's been because there were performance or maintenance issues that were solved by upgrading. OTOH, a db I set up YEARS ago on 7.2 was still running last year I believe. they dump, initdb and reload it every year or two and it still works for what they designed the app on top of it to do.
> Maybe I'm some crazy, radical DBA, but I've never had a version of > pgsql get EOLed out from underneath me. I migrated from 7.4 to 8.1 > right around the time 8.2 came out then upgraded to 8.2 around 6 > months later. > > Where I work now we are looking at migrating from 8.1 to 8.2 or 8.3 > (depending on whether or not we have the man power to fix a few issues > with type coercion, our app, and 8.3) These aren't "the DBA got a > wild hair and just had to upgrade" upgrades. Each time I've migrated > it's been because there were performance or maintenance issues that > were solved by upgrading. Perhaps I'm in a unique situation as well, but as the DBA of a data-tank style DB, I haven't had a problem at all finding opportunities to upgrade to later versions of postgresql. My schema isn't all that complicated; it's just a very large amount of data and some very complex queries on that data- but the queries have been kept to extremely standard SQL specifically for migration and cross-platform reasons. It's definitely been annoying on occasion to find that I need to do a dump and restore to move to a new version, but at the same time cheap, large storage is extremely inexpensive when compared to the sort of storage acceptable for day-to-day use, so size isn't generally a problem- just dump to a big, cheap disk and then restore. I'm probably lucky in that I manage a shop that can tolerate a day's downtime for such a situation, but at the same time, we also demand the most from database performance for complex queries, so a day's downtime here could easily save many days' worth of query time down the line. 8.3, FWIW, was particularly attractive in this regard. I couldn't quite justify upgrading to the release candidates, but the performance improvements were pretty tempting. -- - David T. Wilson david.t.wilson@gmail.com
>> Maybe I'm some crazy, radical DBA, but I've never had a version of >> pgsql get EOLed out from underneath me. Just for fun, I did a bit of digging in the release notes http://developer.postgresql.org/pgdocs/postgres/release.html and came up with this table about PG major releases and their follow-on bug fix/minor releases: Version Release date # updates Days till final update Days till next major 6.0 1997-01-29 0 0 130 6.1 1997-06-08 1 44 116 6.2 1997-10-02 1 15 150 6.3 1998-03-01 2 37 243 6.4 1998-10-30 2 51 222 6.5 1999-06-09 3 126 334 7.0 2000-05-08 3 187 340 7.1 2001-04-13 3 124 297 7.2 2002-02-04 8 1190 296 7.3 2002-11-27 21 1867 355 7.4 2003-11-17 19+ ? 429 8.0 2005-01-19 15+ ? 293 8.1 2005-11-08 11+ ? 392 8.2 2006-12-05 7+ ? 426 8.3 2008-02-04 1+ ? ? It's pretty clear that there was a sea-change around 7.2/7.3 --- before that, nobody thought that PG releases were anything that might be long-lived. And there's nothing in this table that suggests we've really settled on a new lifespan ... other than that we're still putting out new majors at a constant rate, and the community hasn't got the resources or interest to maintain an ever-increasing number of back branches. regards, tom lane
On Thu, May 1, 2008 at 10:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Maybe I'm some crazy, radical DBA, but I've never had a version of > >> pgsql get EOLed out from underneath me. > > Just for fun, I did a bit of digging in the release notes > http://developer.postgresql.org/pgdocs/postgres/release.html > and came up with this table about PG major releases and their > follow-on bug fix/minor releases: > > Version Release date # updates Days till final update Days till next major > > 6.0 1997-01-29 0 0 130 > 6.1 1997-06-08 1 44 116 > 6.2 1997-10-02 1 15 150 > 6.3 1998-03-01 2 37 243 > 6.4 1998-10-30 2 51 222 > 6.5 1999-06-09 3 126 334 > 7.0 2000-05-08 3 187 340 > 7.1 2001-04-13 3 124 297 > 7.2 2002-02-04 8 1190 296 > 7.3 2002-11-27 21 1867 355 > 7.4 2003-11-17 19+ ? 429 > 8.0 2005-01-19 15+ ? 293 > 8.1 2005-11-08 11+ ? 392 > 8.2 2006-12-05 7+ ? 426 > 8.3 2008-02-04 1+ ? ? > > It's pretty clear that there was a sea-change around 7.2/7.3 --- > before that, nobody thought that PG releases were anything that > might be long-lived. And there's nothing in this table that > suggests we've really settled on a new lifespan ... other than that > we're still putting out new majors at a constant rate, and the community > hasn't got the resources or interest to maintain an ever-increasing > number of back branches. > > regards, tom lane Not really Postgres's problem, but for whatever its worth if I do the following on Debian stable: $apt-get install postgresql I get 7.4 . When I install Debian I generally expect the software to be supported for a long time. Perhaps it might make sense to declare it dead except for security issues? -- Rob Wultsch wultsch@gmail.com wultsch (aim)
On Fri, May 02, 2008 at 09:18:39AM -0700, Rob Wultsch wrote: > Not really Postgres's problem, but for whatever its worth if I do the > following on Debian stable: > $apt-get install postgresql > > I get 7.4 . When I install Debian I generally expect the software to > be supported for a long time. Perhaps it might make sense to declare > it dead except for security issues? I suspect this really is the Debian package maintainer's problem. Debian itself has a very high bar for changes after feature freeze. This is great from the point of view of stability, but I hope they (or their users) aren't expecting the coommunities producing the software to do their maintenance of old releases for them. Postgres is way better in that respect than some of the included software. (I confess I am a little astonished that the most recent stable release ended up with 7.4. I thought that at _least_ 8.0 was out prior to freeze. What happened?) A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
Andrew Sullivan wrote: > On Fri, May 02, 2008 at 09:18:39AM -0700, Rob Wultsch wrote: >> Not really Postgres's problem, but for whatever its worth if I do the >> following on Debian stable: >> $apt-get install postgresql >> >> I get 7.4 . When I install Debian I generally expect the software to >> be supported for a long time. Perhaps it might make sense to declare >> it dead except for security issues? > ... > > (I confess I am a little astonished that the most recent stable > release ended up with 7.4. I thought that at _least_ 8.0 was out > prior to freeze. What happened?) apt-get install postgresql-8.1 gives you 8.1 in etch (stable) in addition to 7.4. Cheers Tino
Attachment
On Fri, May 02, 2008 at 09:57:58PM +0200, Tino Wildenhain wrote: > apt-get install postgresql-8.1 gives you 8.1 in etch (stable) > in addition to 7.4. Huh. Debian gets more inscrutable every time I fail to look. A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
On Fri, 2 May 2008 15:48:13 -0400 Andrew Sullivan wrote: > On Fri, May 02, 2008 at 09:18:39AM -0700, Rob Wultsch wrote: > > Not really Postgres's problem, but for whatever its worth if I do the > > following on Debian stable: > > $apt-get install postgresql > > > > I get 7.4 . When I install Debian I generally expect the software to > > be supported for a long time. Perhaps it might make sense to declare > > it dead except for security issues? > > I suspect this really is the Debian package maintainer's problem. > Debian itself has a very high bar for changes after feature freeze. > This is great from the point of view of stability, but I hope they (or > their users) aren't expecting the coommunities producing the software > to do their maintenance of old releases for them. Postgres is way > better in that respect than some of the included software. That's true. I heard some stories about updating Firefox (or was it still Mozilla?) and changing version numbers in Debian. That was because the Mozilla guys seem to stop maintaining older versions the time a new minor version is released. With an upstream like that, you have a really hard time to support a package for the distribution lifetime - and things get worse if newer releases contain bugfixes which cannot ported back to the version used in the distribution. > (I confess I am a little astonished that the most recent stable > release ended up with 7.4. I thought that at _least_ 8.0 was out > prior to freeze. What happened?) It was, i think ... but after feature freeze for Debian 3.1 it took several month for the final release. Maybe the just released 8.0 was "too new" for the feature freeze. Finally they ended up with 7.4 for a very long time (before Debian 4.0 was released). Bye -- Andreas 'ads' Scherbaum German PostgreSQL User Group
On Fri, May 02, 2008 at 11:06:15PM +0200, Andreas 'ads' Scherbaum wrote: > > (I confess I am a little astonished that the most recent stable > > release ended up with 7.4. I thought that at _least_ 8.0 was out > > prior to freeze. What happened?) > > It was, i think ... but after feature freeze for Debian 3.1 it took > several month for the final release. Maybe the just released 8.0 was > "too new" for the feature freeze. Finally they ended up with 7.4 for > a very long time (before Debian 4.0 was released). The most recent stable release does include 8.1, but you have to explicitly ask for it (Debian supports parallel installation of multiple versions). That release was the first stable release to support that feature. I beleive the issue was that people who installed "postgresql" from the old release shouldn't be automatically be upgraded (and hence make their database unusable). After the last release "postgresql" became a virtual package which will always install the latest version available. But the long debian release cycle makes it a little odder than you might expect. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.