Thread: Re: ENUM type
Jim C Nasby: > OK, but compare the amount of work you just described to the simplicity > of using an enum. Enum is much easier and simpler for a developer. Of > course in most cases the MySQL way of doing it is (as has been > mentioned) stupid, but done in the normal, normalized way it would > remove a fair amount of additional work on the part of a developer: > > - no need to manually define seperate table create table color (color text); insert into color values ('red'); -- etc > - no need to define RI references color > - no need to manually map between ID and real values (though of course if you use color as p-key this is unnecessary > we should make it easy to get the ID too) again, in this case the color should be the id and using a surrogate key is poor design. If that's too much typing you have the option of a check constraint which is superior to enum in just about every way, particularly if encapsulated into a domain (although inferior to RI solution, IMO). Merlin
On Tue, Jul 26, 2005 at 05:30:49PM -0400, Merlin Moncure wrote: > Jim C Nasby: > > OK, but compare the amount of work you just described to the > simplicity > > of using an enum. Enum is much easier and simpler for a developer. Of > > course in most cases the MySQL way of doing it is (as has been > > mentioned) stupid, but done in the normal, normalized way it would > > remove a fair amount of additional work on the part of a developer: > > > > - no need to manually define seperate table > create table color (color text); > insert into color values ('red'); -- etc > > > - no need to define RI > references color > > > - no need to manually map between ID and real values (though of course > if you use color as p-key this is unnecessary > > > we should make it easy to get the ID too) > again, in this case the color should be the id and using a surrogate key > is poor design. If that's too much typing you have the option of a > check constraint which is superior to enum in just about every way, > particularly if encapsulated into a domain (although inferior to RI > solution, IMO). Please go back through the original thread. The original comment was regarding adding an enum type that actually made sense; ie: it stored an int in the base table that referenced a parent table. The original poster was arguing that enum didn't really buy anything over doing that the manual method (create seperate table; fill that table with values; create base table with RI, and finally, make sure that everything that touches the base table can do mapping, or add a bunch of other stuff to do the mapping automatically). My argument is that simply specifying fieldname ENUM(...) is a heck of a lot easier on developers than doing all those manual steps, yet it still does things in the correct, normalized manner (unlike MySQL). -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
> > again, in this case the color should be the id and using a surrogate key > > is poor design. If that's too much typing you have the option of a > > check constraint which is superior to enum in just about every way, > > particularly if encapsulated into a domain (although inferior to RI > > solution, IMO). > > Please go back through the original thread. The original comment was > regarding adding an enum type that actually made sense; ie: it stored an > int in the base table that referenced a parent table. The original > poster was arguing that enum didn't really buy anything over doing that > the manual method (create seperate table; fill that table with values; > create base table with RI, and finally, make sure that everything that > touches the base table can do mapping, or add a bunch of other stuff to > do the mapping automatically). My argument is that simply specifying You're right, I did jump on this (mostly off-topic) thread. Also, I was not aware that the enum did an id-id relationship internally. Regardless, on the surface enum is essentially a check constraint. Check constraints offer the same functionality and while marginally more verbose they are much more powerful and more standard to boot. Putting my advocacy hat on I'd rather highlight PostgreSQL's very well implemented constraint system. However, if enum was implemented as syntax sugar for a constraint in a similar way that serial is syntax sugar for the default value then I suppose it would not be a bad idea. Merlin
Hi, Jim C. Nasby wrote: > Please go back through the original thread. The original comment was > regarding adding an enum type that actually made sense; ie: it stored an > int in the base table that referenced a parent table. The original > poster was arguing that enum didn't really buy anything over doing that > the manual method (create seperate table; fill that table with values; > create base table with RI, and finally, make sure that everything that > touches the base table can do mapping, or add a bunch of other stuff to > do the mapping automatically). My argument is that simply specifying > > fieldname ENUM(...) > > is a heck of a lot easier on developers than doing all those manual > steps, yet it still does things in the correct, normalized manner > (unlike MySQL). I don't want to go through the original thread, but you are probably missing a point, MySQL manual states [1]: ========== If you want to determine all possible values for an ENUM column, use SHOW COLUMNS FROM tbl_name LIKE enum_col and parse the ENUM definition in the second column of the output. ========== Are you actually going to also implement SHOW COLUMNS blah-blah-blah? If not, poor MySQL users will still need to rewrite their applications. They will also need to know the name of the automagically generated table from somewhere. [1] http://dev.mysql.com/doc/mysql/en/enum.html
On Wed, Jul 27, 2005 at 08:45:53AM -0400, Merlin Moncure wrote: > > > again, in this case the color should be the id and using a surrogate > key > > > is poor design. If that's too much typing you have the option of a > > > check constraint which is superior to enum in just about every way, > > > particularly if encapsulated into a domain (although inferior to RI > > > solution, IMO). > > > > Please go back through the original thread. The original comment was > > regarding adding an enum type that actually made sense; ie: it stored > an > > int in the base table that referenced a parent table. The original > > poster was arguing that enum didn't really buy anything over doing > that > > the manual method (create seperate table; fill that table with values; > > create base table with RI, and finally, make sure that everything that > > touches the base table can do mapping, or add a bunch of other stuff > to > > do the mapping automatically). My argument is that simply specifying > > You're right, I did jump on this (mostly off-topic) thread. Also, I was > not aware that the enum did an id-id relationship internally. > Regardless, on the surface enum is essentially a check constraint. > Check constraints offer the same functionality and while marginally more > verbose they are much more powerful and more standard to boot. > > Putting my advocacy hat on I'd rather highlight PostgreSQL's very well > implemented constraint system. However, if enum was implemented as > syntax sugar for a constraint in a similar way that serial is syntax > sugar for the default value then I suppose it would not be a bad idea. Keep in mind that my original point was to go after items that were the biggest headache for migration from mysql to postgresql. Enum was just an example I picked out of thin air. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
So, it seems to me that there is a fair bit of work to be done on helping people migrate MySQL to PostgreSQL. So far, the checklist I can see includes: * Maintaining conversion scripts * Reviewing pain points and looking at ways of mitigating them. * Building solid migration documentation * Providing porting frameworks This last item could have some potentials. For example, a source-code compatible library to map MySQL client lib system calls to libpq operations for the C API, PHP wrapper scripts, etc. My company will be providing a PHP wrapper script, and will be working on some documentation. Most of the documentation will be BSD-style licensed with the exception of the references to my company's services (which must be stripped out of non-verbatim reproductions). If anyone else sees any other interesting areas that need work, it might be worthwhile to discuss them as well. Best Wishes, Chris Travers Metatron Technology Consulting
> So far, the checklist I can see includes: > * Maintaining conversion scripts What I think we need is a C program that dumps directly from MySQL into PostgreSQL sql. ie. Take the mysqldump source code and just modify its output. Will inherit the MySQL license though :( Chris
KL- > What I think we need is a C program that dumps directly from MySQL into > PostgreSQL sql. Why C? PerlDBI or JDBC should be able to do this readily enough. -- Josh Berkus Aglio Database Solutions San Francisco
> I think one of the more difficult areas will be to convert unsigned > fields from mysql into postgres. For smaller sizes it is possible to > convert to postgres by moving one size up and using constraints to > restrict numbers to be positive, and possibly within the mysql range > too. But, the problem is unsigned bigint in mysql to postgresql. > There's not another larger integer size that can be used that would > allow the 18446744073709551615 (is that the max value?) max value > available in mysql. Or am I missing something? You'd just issue a warning...
Josh Berkus wrote: > KL- > > >>What I think we need is a C program that dumps directly from MySQL into >>PostgreSQL sql. > > > Why C? PerlDBI or JDBC should be able to do this readily enough. Uhmmm isn't that what fetters DBI-Link thing does? -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
Christopher Kings-Lynne wrote: >> So far, the checklist I can see includes: >> * Maintaining conversion scripts > > > What I think we need is a C program that dumps directly from MySQL > into PostgreSQL sql. > > ie. Take the mysqldump source code and just modify its output. > > Will inherit the MySQL license though :( > It then occurred to me that a better answer would be something like a DBI perl script that has modules for storing important queries for various SQL database servers, and outputting the format in PostgreSQL syntax. That way one can have a general conversion toolkit that can be easily expanded and could also be used in porting Oracle, MS SQL, SQL-Lite, and other databases to PostgreSQL. In this way, you could also avoid the MySQL license even by MySQL AB's weird interpretation of derivative works. This way one could also have fairly smart logic in the conversion as well. Gee, your max unsigned bigint is over our bigint limit, lets use numeric instead or hmm... you have an ENUM type here. Lets use a VARCHAR() with a constraint for now. This logic could be easily tweeked by a decent programmer. I think that such a program should be somewhat trivial to write for at least tables and data definitions. Triggers, functions, etc. would be more difficult. But I think it is reasonable to expect that functions might have to be manually ported in most circumstances. Best Wishes, Chris Travers > Chris > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > >
Gregory Youngblood <pgcluster@netio.org> writes: > ... the problem is unsigned bigint in mysql to postgresql. > There's not another larger integer size that can be used that would > allow the 18446744073709551615 (is that the max value?) max value > available in mysql. Or am I missing something? You'd have to translate that to NUMERIC, which would work but would take a bit of a performance hit ... regards, tom lane
Gregory Youngblood <pgcluster@netio.org> writes: > On Jul 27, 2005, at 9:53 PM, Tom Lane wrote: >> You'd have to translate that to NUMERIC, which would work but would >> take a bit of a performance hit ... > The most common places I've seen unsigned bigint used have been > primary keys for tables where the counter is expected to basically > grow forever. I've also seen it used to store unique user id numbers > instead of varchar fields. [ shrug... ] So store it as plain bigint. There is not any real difference between 2^63 and 2^64 available values --- either way, we'll all be safely dead before overflow occurs. regards, tom lane
On Wed, Jul 27, 2005 at 08:34:56PM -0700, Joshua D. Drake wrote: > Josh Berkus wrote: > >KL- > > > > > >>What I think we need is a C program that dumps directly from MySQL into > >>PostgreSQL sql. > > > > > >Why C? PerlDBI or JDBC should be able to do this readily enough. > > Uhmmm isn't that what fetters DBI-Link thing does? Yikes! Another unintended use. :P Anybody interested in contributing to this? Cheers, D (who is ashamed about how long it's been since his last commit) -- David Fetter david@fetter.org http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote!
D- > Anybody interested in contributing to this? Interested, yes. Free time's a little harder to come by, as you know ... -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus wrote: > D- > > >>Anybody interested in contributing to this? > > > Interested, yes. Free time's a little harder to come by, as you know ... What is this thing "free time"? I have yet to see or hear of it. > -- Your PostgreSQL solutions provider, Command Prompt, Inc. 24x7 support - 1.800.492.2240, programming, and consulting Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit http://www.commandprompt.com / http://www.postgresql.org
On Thu, Jul 28, 2005 at 12:53:07AM -0400, Tom Lane wrote: > Gregory Youngblood <pgcluster@netio.org> writes: > > ... the problem is unsigned bigint in mysql to postgresql. > > There's not another larger integer size that can be used that would > > allow the 18446744073709551615 (is that the max value?) max value > > available in mysql. Or am I missing something? > > You'd have to translate that to NUMERIC, which would work but would > take a bit of a performance hit ... Is there any serious impediment to adding unsigned types to PostgreSQL? They should be as readily supported as signed, right? I don't think these would even have to be in core, if that's a concern. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
On Wed, Jul 27, 2005 at 09:16:04PM -0700, Chris Travers wrote: > Christopher Kings-Lynne wrote: > > >>So far, the checklist I can see includes: > >>* Maintaining conversion scripts > > > > > >What I think we need is a C program that dumps directly from MySQL > >into PostgreSQL sql. > > > >ie. Take the mysqldump source code and just modify its output. > > > >Will inherit the MySQL license though :( > > > > It then occurred to me that a better answer would be something like a > DBI perl script that has modules for storing important queries for > various SQL database servers, and outputting the format in PostgreSQL > syntax. That way one can have a general conversion toolkit that can be > easily expanded and could also be used in porting Oracle, MS SQL, > SQL-Lite, and other databases to PostgreSQL. In this way, you could > also avoid the MySQL license even by MySQL AB's weird interpretation of > derivative works. > > This way one could also have fairly smart logic in the conversion as > well. Gee, your max unsigned bigint is over our bigint limit, lets use > numeric instead or hmm... you have an ENUM type here. Lets use a > VARCHAR() with a constraint for now. This logic could be easily tweeked > by a decent programmer. I think that such a program should be somewhat > trivial to write for at least tables and data definitions. Triggers, > functions, etc. would be more difficult. But I think it is reasonable > to expect that functions might have to be manually ported in most > circumstances. The only downside I can think of to such an approach is that it will be much slower than directly reading a dump file of some kind. But, on the up-side, it's also possible that it could allow for minimal downtime migraitons to take place in the future (of course adding that support would take a lot of work, but theoretically it's possible). -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
On Fri, Jul 29, 2005 at 02:49:54PM -0500, Jim C. Nasby wrote: > On Thu, Jul 28, 2005 at 12:53:07AM -0400, Tom Lane wrote: > > Gregory Youngblood <pgcluster@netio.org> writes: > > > ... the problem is unsigned bigint in mysql to postgresql. > > > There's not another larger integer size that can be used that would > > > allow the 18446744073709551615 (is that the max value?) max value > > > available in mysql. Or am I missing something? > > > > You'd have to translate that to NUMERIC, which would work but would > > take a bit of a performance hit ... > > Is there any serious impediment to adding unsigned types to PostgreSQL? > They should be as readily supported as signed, right? I don't think > these would even have to be in core, if that's a concern. Tom mentioned a couple of days ago that the worst problems that would be created by adding more numeric types are solved as of 8.1. -- Alvaro Herrera (<alvherre[a]alvh.no-ip.org>) "Before you were born your parents weren't as boring as they are now. They got that way paying your bills, cleaning up your room and listening to you tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers
"Jim C. Nasby" <decibel@decibel.org> writes: > Is there any serious impediment to adding unsigned types to PostgreSQL? Quick, what's the datatype of 12345? And why? My guess is that transparent support for such a thing would require some tricky and fragile compromises in the numeric hierarchy's promotion rules. The handling of smallint is already pretty unsatisfactory because "12345" is considered int not smallint; adding three or four unsigned types would make that problem many times worse. > I don't think these would even have to be in core, if that's a > concern. Sure, if you don't need it to work reasonably nicely --- although the promotion rules are now embedded in pg_cast and hence modifiable, the initial determination of a datatype for a numeric literal is still hard-wired in the grammar. regards, tom lane
On Fri, Jul 29, 2005 at 03:57:48PM -0400, Tom Lane wrote: > "Jim C. Nasby" <decibel@decibel.org> writes: > > Is there any serious impediment to adding unsigned types to PostgreSQL? > > Quick, what's the datatype of 12345? And why? > > My guess is that transparent support for such a thing would require some > tricky and fragile compromises in the numeric hierarchy's promotion > rules. The handling of smallint is already pretty unsatisfactory > because "12345" is considered int not smallint; adding three or four > unsigned types would make that problem many times worse. ISTM that so long as you don't overflow or lose precision it shouldn't really matter what datatype is used to represent a number, until you use it someplace that specifies a datatype to use (such as a function or a table). I realize that function overloading gums this up a bit, but I don't think it's unreasonable to require the user to explicitly cast if they want an explicit function out of the overloaded set to be used (though throwing a warning if they don't cast might be a good idea). > > I don't think these would even have to be in core, if that's a > > concern. > > Sure, if you don't need it to work reasonably nicely --- although the > promotion rules are now embedded in pg_cast and hence modifiable, the > initial determination of a datatype for a numeric literal is still > hard-wired in the grammar. So what would those rules currently do when they find a value that exceedes a signed int? Or a signed bigint? -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"