Thread: Enticing interns to PostgreSQL
The email below about FreeBSD's involvement in Google's Summer of Code got me thinking; would there be value in trying to attract college students to working on either PostgreSQL development, or using PostgreSQL in projects? Even though we missed getting in on the summer of code this year, ISTM that we could try targeting colleges, professors, and students directly. When it comes to development, I'm sure there's any number of TODO items that would make great coursework, for all different levels of students. As for using PostgreSQL, perhaps we could get database classes together with projects that could use help.

Thoughts?

----- Forwarded message from Murray Stokely <murray@freebsdmall.com> -----

The FreeBSD Project is pleased to announce its participation in the Google "Summer of Code" program designed to introduce students to open source software development.

The FreeBSD Project received over 350 applications, amongst which 18 projects have been selected for funding. Unfortunately, due to the limited number of spots available, we were unable to fund many first rate applications. However, we encourage students to work together with us all year round. The FreeBSD Project is always willing to help mentor students learn more about operating system development through our normal community mailing lists and development forums.

Contributing to an open source software project is a valuable component of a computer science education and great preparation for a career in software development.

More information about the student projects is available from the FreeBSD Summer of Code Wiki here: http://wikitest.freebsd.org/moin.cgi/SummerOfCode2005

The Wiki will soon be updated with information about downloading the work in progress with CVSup.

We'd like to close by thanking Google for their generosity and congratulating the 18 talented students below.
- The FreeBSD Summer of Code Mentors

--
Student: Anders Persson <anders@cs.ucla.edu>
Summary: FreeBSD userland/kernel interface cleanups
Mentor: Brooks Davis <brooks@FreeBSD.org>

Student: Andrew Turner <andrew@fubar.geek.nz>
Summary: Integrate BSD Installer
Mentor: re@, ru@, jhb@

Student: Brian Wilson <polytopes@gmail.com>
Summary: UFS Journalling
Mentor: scottl@

Student: Chris Jones <chris.jones@ualberta.ca>
Summary: Gvinum 'move', 'rename', etc.
Mentor: le@FreeBSD.org, phk@FreeBSD.org

Student: Christoph Mathys <cmathys@bluewin.ch>
Summary: Rewriting CVSup in C, the Csup project
Mentor: mux@FreeBSD.org

Student: Csaba Henk <csaba.henk@creo.hu>
Summary: SSH based networking filesystem
Mentor: scottl@FreeBSD.org

Student: Dario Freni <saturnero@freesbie.org>
Summary: FreeSBIE integration
Mentor: murray@FreeBSD / re@FreeBSD.org

Student: Emiliano Mennucci <s223560@studenti.ing.unipi.it>
Summary: pluggable disk scheduler
Mentor: luigi@FreeBSD.org

Student: Ivan Voras <ivoras@gmail.com>
Summary: GEOM Journaling Layer (gjournal)
Mentor: phk@FreeBSD.org, pjd@FreeBSD.org

Student: Jason Young <dintsoft@gmail.com>
Summary: powerd
Mentor: bruno@FreeBSD.org, njl@FreeBSD.org

Student: Michael Bushkov <bushman@rsu.ru>
Summary: nsswitch / caching daemon
Mentor: brooks@FreeBSD.org, nectar@FreeBSD.org

Student: Paolo Pisati <p.pisati@oltrelinux.com>
Summary: improve libalias
Mentor: luigi@FreeBSD.org

Student: R. Tyler Ballance <tyler@tamu.edu>
Summary: Implement MacOS launchd(8) for FreeBSD
Mentor: murray@FreeBSD.org

Student: RuGang Xu <rugang@gmail.com>
Summary: K kernel meta-language project
Mentor: gnn@FreeBSD.org, phk@FreeBSD.org

Student: Samy Al Bahra <samy@kerneled.org>
Summary: MAC
Mentor: rwatson@FreeBSD.org

Student: Victor Cruceru <victor.cruceru@gmail.com>
Summary: SNMP monitoring
Mentor: harti@FreeBSD.org

Student: Yanjun Wu <yanjun03@ios.cn>
Summary: SEBSD
Mentor: rwatson@FreeBSD.org

Student: Emily Boyd <emily@emilyboyd.com>
Summary: website improvements
Mentor: murray@FreeBSD.org

_______________________________________________
freebsd-announce@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-announce
To unsubscribe, send any mail to "freebsd-announce-unsubscribe@freebsd.org"

----- End forwarded message -----

--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
On 7/19/05, Jim C. Nasby <decibel@decibel.org> wrote:
> The email below about FreeBSD's involvement in Google's Summer of Code
> got me thinking; would there be value in trying to attract college
> students to working on either PostgreSQL development, or using
> PostgreSQL in projects? Even though we missed getting in on the summer
> of code this year, ISTM that we could try targeting colleges,
> professors, and students directly. When it comes to development, I'm
> sure there's any number of TODO items that would make great coursework,
> for all different levels of students. As for using PostgreSQL, perhaps
> we could get database classes together with projects that could use
> help.
>
> Thoughts?

I have been lurking on the list with the intent of finding a project to undertake while I finish up my BS in CompSci. I will be bored out of my mind with classes and would benefit greatly from working inside pgsql.

I'd be very interested in any sort of pgsql project where I could be of help. I have quite a few sites/applications that would benefit from enhancements to pgsql (2PC+replication, clustering, master-master) so there is also that motive.

my 2c,

--
Christopher Watford
http://dorm.tunkeymicket.com
Jim C. Nasby wrote:
> The email below about FreeBSD's involvement in Google's Summer of Code
> got me thinking; would there be value in trying to attract college
> students to working on either PostgreSQL development, or using
> PostgreSQL in projects? Even though we missed getting in on the summer
> of code this year, ISTM that we could try targeting colleges,
> professors, and students directly. When it comes to development, I'm
> sure there's any number of TODO items that would make great coursework,
> for all different levels of students. As for using PostgreSQL, perhaps
> we could get database classes together with projects that could use
> help.
>
> Thoughts?

I am planning to take a database course at a major university this fall. If you have any materials that might help I'd be happy to see if the professor is interested. I don't know what the course uses currently, but I expect there will be opportunities to mention PostgreSQL.

Also, there's a PostgreSQL project that I'm planning on working on (not even much work left, really), so I'll see if the professor shows any interest in my project.

Regards,
Jeff Davis
On Wed, Jul 20, 2005 at 03:43:04PM -0700, Jeff Davis wrote:
> Jim C. Nasby wrote:
> > The email below about FreeBSD's involvement in Google's Summer of Code
> > got me thinking; would there be value in trying to attract college
> > students to working on either PostgreSQL development, or using
> > PostgreSQL in projects? Even though we missed getting in on the summer
> > of code this year, ISTM that we could try targeting colleges,
> > professors, and students directly. When it comes to development, I'm
> > sure there's any number of TODO items that would make great coursework,
> > for all different levels of students. As for using PostgreSQL, perhaps
> > we could get database classes together with projects that could use
> > help.
> >
> > Thoughts?
>
> I am planning to take a database course at a major university this fall.
> If you have any materials that might help I'd be happy to see if the
> professor is interested. I don't know what the course uses currently,
> but I expect there will be opportunities to mention PostgreSQL.
>
> Also, there's a PostgreSQL project that I'm planning on working on (not
> even much work left, really), so I'll see if the professor shows any
> interest in my project.

Well, since the typical belief is that MySQL is a good choice for things, http://sql-info.de/mysql/gotchas.html is probably a good start.

--
Jim C. Nasby, Database Consultant decibel@decibel.org
On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
> Chris Travers wrote:
> > How hard would it be to automatically create enum_ tables in the
> > background to emulate MySQL's enum type? Sort of like we do with SERIAL
> > datatypes... Part of the problem is that MySQL's enum type is so
> > braindead from a database design perspective that most of us would not
> > be interested in using it. Emulating an int foreign key for another
> > created table might make it ok, though.
>
> The thing that occurs to me is that if you really want the enum type in
> PostgreSQL (assuming that there exists a real need), a PostgreSQL person
> would create their own type. Or, if not, just create a wrapper function
> that handles the input/output display and call it explicitly.

OK, but compare the amount of work you just described to the simplicity of using an enum. Enum is much easier and simpler for a developer. Of course in most cases the MySQL way of doing it is (as has been mentioned) stupid, but done in the normal, normalized way it would remove a fair amount of additional work on the part of a developer:

- no need to manually define a separate table
- no need to define RI
- no need to manually map between ID and real values (though of course we should make it easy to get the ID too)

> So to me, the need seems very weak. However, if your goal is
> compatibility, I guess we need it. The problem is it's very difficult to
> do in a general way. We'd probably have to do it specifically for enum,
> and have it generate the types automatically on the fly. Someone would
> have to do some interesting things with the parser, too. Right now even
> the varchar() type, for instance, is kind of a hack.
>
> Ultimately to do it in a general way I think we'd need functions that
> return a type that can be used in a table definition. Aside from the
> many problems I don't know about, there are two other problems:
> (1) After the table (or column?) is dropped, we need to drop the type.
> (2) Functions currently don't support variable numbers of arguments, so
> enum still wouldn't be simple. We could do something kinda dumb-looking
> like:
> CREATE TABLE mytable (
>     color ENUM("red,green,blue,orange,purple,yellow");
> );
> And have the hypothetical ENUM function then parse the single argument
> and return a type that could be used by that table.
>
> Is this achievable with a reasonable amount of effort? Is this
> function-returning-a-type a reasonable behavior?
>
> If nothing else it would clean up the clutter of varchar() and the like,
> that currently use the hacked-in catalog entry "atttypmod" or something
> like that.

Hopefully someone on -hackers can shed light on what's required to clean up the parsing. One thing worth noting, though, is that table definition is a relatively small part of doing a migration. Generally, it's application code that causes the most issues. Because of this, I think there would still be a lot of benefit to an enum type that didn't strictly follow the MySQL naming/definition convention. In this case, it might be much easier to have an enum that doesn't allow you to define what can go into it at creation time; i.e.:

CREATE TABLE ...
    blah ENUM NOT NULL ...
    ...

ALTER TABLE SET ENUM blah ALLOWED VALUES(1, 2, 4);

--
Jim C. Nasby, Database Consultant decibel@decibel.org
On 7/26/05, Jim C. Nasby wrote:
> On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
>>
>> Ultimately to do it in a general way I think we'd need functions that
>> return a type that can be used in a table definition. Aside from the
>> many problems I don't know about, there are two other problems:
>> (1) After the table (or column?) is dropped, we need to drop the type.
>> (2) Functions currently don't support variable numbers of arguments, so
>> enum still wouldn't be simple. We could do something kinda dumb-looking
>> like:
>> CREATE TABLE mytable (
>>     color ENUM("red,green,blue,orange,purple,yellow");
>> );
>> And have the hypothetical ENUM function then parse the single argument
>> and return a type that could be used by that table.

Wouldn't the following work already:

CREATE DOMAIN colors AS TEXT CHECK (
    VALUE IN ('red', 'green', 'blue', 'orange', 'purple', 'yellow'));

CREATE TABLE mytable (
    color COLORS
);

And this has all the advantages of having a single definition for your domain in one place, while you can reuse the resulting domain in many tables. I can't remember when I last deployed a PostgreSQL app without domains for common data like email addresses, phone numbers and ZIP codes.

> In this case, it
> might be much easier to have an enum that doesn't allow you to define
> what can go into it at creation time; ie:
>
> CREATE TABLE ...
>     blah ENUM NOT NULL ...
>     ...
>
> ALTER TABLE SET ENUM blah ALLOWED VALUES(1, 2, 4);

What you are proposing is something PostgreSQL already has:

CREATE TABLE ...
    blah TEXT NOT NULL ...
    ...;

ALTER TABLE ... ADD CHECK (blah IN ('1', '2', '4'));

ENUM is a braindead idea implemented because MySQL lacked the infrastructure to let its users do the right thing. (Let's face it: what percentage of the use of ENUM in MySQL would simply evaporate if MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the infrastructure to allow its users to do the right thing.

Working around ENUMs belongs in a migration guide and maybe in a migration tool with examples of using a lookup table, a check constraint and a domain. Working around ENUMs does not belong in the source.

Jochem
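
A minimal sketch of the check-constraint approach described above, for anyone who wants to try it without a PostgreSQL server at hand. It uses Python's built-in sqlite3 module purely as a runnable stand-in: SQLite has no CREATE DOMAIN, so the constraint is attached to the column itself rather than to a reusable domain. The names ALLOWED_COLORS, make_table and try_insert are illustrative assumptions, not anything from the thread.

```python
import sqlite3

# Illustrative value list, taken from the color example in the thread.
ALLOWED_COLORS = ("red", "green", "blue", "orange", "purple", "yellow")


def make_table(conn):
    """Create mytable with a CHECK constraint listing the allowed values."""
    # The values are fixed literals defined above, not user input,
    # so building the IN (...) list by string formatting is acceptable here.
    allowed = ", ".join(f"'{c}'" for c in ALLOWED_COLORS)
    conn.execute(
        f"CREATE TABLE mytable (color TEXT NOT NULL CHECK (color IN ({allowed})))"
    )


def try_insert(conn, color):
    """Return True if the row was accepted, False if the CHECK rejected it."""
    try:
        conn.execute("INSERT INTO mytable (color) VALUES (?)", (color,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
make_table(conn)
print(try_insert(conn, "red"))      # accepted
print(try_insert(conn, "magenta"))  # rejected by the CHECK constraint
```

In PostgreSQL itself the same value list would live in the domain definition, so every table using the domain shares one definition.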
On Wed, Jul 27, 2005 at 12:11:47AM +0200, Jochem van Dieten wrote:
> On 7/26/05, Jim C. Nasby wrote:
> > On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
> >>
> >> Ultimately to do it in a general way I think we'd need functions that
> >> return a type that can be used in a table definition. Aside from the
> >> many problems I don't know about, there are two other problems:
> >> (1) After the table (or column?) is dropped, we need to drop the type.
> >> (2) Functions currently don't support variable numbers of arguments, so
> >> enum still wouldn't be simple. We could do something kinda dumb-looking
> >> like:
> >> CREATE TABLE mytable (
> >>     color ENUM("red,green,blue,orange,purple,yellow");
> >> );
> >> And have the hypothetical ENUM function then parse the single argument
> >> and return a type that could be used by that table.
>
> ENUM is a braindead idea implemented because MySQL lacked the
> infrastructure to let its users do the right thing. (Let's face it:
> what percentage of the use of ENUM in MySQL would simply evaporate if
> MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the
> infrastructure to allow its users to do the right thing.

Sorry, I should have been more clear. There is the MySQL migration issue with their braindead enum, but what I was wondering about is creating a 'type' that is a rollup for:

- create parent table with int id field and text and indexes
- add RI to base table
- add triggers/views/rules/other glue to make the id field hidden and transparent to users in normal uses

In other words, for the common use case of a table that has a field that can contain a relatively limited number of values, provide an easy means to normalize those values out into a separate table and allow applications to use the text values as if the table was de-normalized.

The reason I cross-posted to hackers was to get an answer to the question of how difficult it would be to allow the database to deal with a type definition that involves some arbitrary number of variables, as shown above in the color example.

Also, are there any external hooks for DDL? If there were then it should be possible to add support for an enum type that creates the required tables, views/rules, etc. without modifying the backend.

--
Jim C. Nasby, Database Consultant decibel@decibel.org
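
The three-step rollup described above (parent table, RI, glue that hides the id) can be written out by hand as a rough sketch. This is not a proposal for how the backend would do it. It uses Python's sqlite3 module as a runnable stand-in, with an INSTEAD OF trigger on a view playing the part of the "glue"; in PostgreSQL of this era the glue would more likely be rules. All table, view and trigger names (color, mytable_base, mytable, mytable_insert) and the helper open_db are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
-- Parent (lookup) table: integer id plus the text value.
CREATE TABLE color (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);

-- Base table stores only the id, with referential integrity.
CREATE TABLE mytable_base (
    item     TEXT NOT NULL,
    color_id INTEGER NOT NULL REFERENCES color (id)
);

-- View that hides the id and exposes the text value instead.
CREATE VIEW mytable AS
    SELECT b.item, c.name AS color
    FROM mytable_base b JOIN color c ON c.id = b.color_id;

-- Glue: an insert through the view is translated to the matching id.
-- An unknown color yields NULL and is rejected by the NOT NULL constraint.
CREATE TRIGGER mytable_insert INSTEAD OF INSERT ON mytable
BEGIN
    INSERT INTO mytable_base (item, color_id)
    VALUES (NEW.item, (SELECT id FROM color WHERE name = NEW.color));
END;
"""


def open_db():
    """Create an in-memory database with the schema and three sample colors."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO color (name) VALUES (?)", [("red",), ("green",), ("blue",)]
    )
    return conn


conn = open_db()
# The application only ever sees text values.
conn.execute("INSERT INTO mytable (item, color) VALUES (?, ?)", ("apple", "red"))
print(conn.execute("SELECT item, color FROM mytable").fetchall())
# Underneath, only the integer id is stored.
print(conn.execute("SELECT item, color_id FROM mytable_base").fetchall())
```

Even this small version needs four schema objects per enumerated column, which is the amount of manual work the proposed rollup would hide.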
Jim C. Nasby wrote:

>On Wed, Jul 27, 2005 at 12:11:47AM +0200, Jochem van Dieten wrote:
>>On 7/26/05, Jim C. Nasby wrote:
>>>On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
>>>>Ultimately to do it in a general way I think we'd need functions that
>>>>return a type that can be used in a table definition. Aside from the
>>>>many problems I don't know about, there are two other problems:
>>>>(1) After the table (or column?) is dropped, we need to drop the type.
>>>>(2) Functions currently don't support variable numbers of arguments, so
>>>>enum still wouldn't be simple. We could do something kinda dumb-looking
>>>>like:
>>>>CREATE TABLE mytable (
>>>>    color ENUM("red,green,blue,orange,purple,yellow");
>>>>);
>>>>And have the hypothetical ENUM function then parse the single argument
>>>>and return a type that could be used by that table.
>>
>>ENUM is a braindead idea implemented because MySQL lacked the
>>infrastructure to let its users do the right thing. (Let's face it:
>>what percentage of the use of ENUM in MySQL would simply evaporate if
>>MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the
>>infrastructure to allow its users to do the right thing.
>
>Sorry, I should have been more clear. There is the MySQL migration issue
>with their braindead enum, but what I was wondering about is creating a
>'type' that is a rollup for:
>
>- create parent table with int id field and text and indexes
>- add RI to base table
>- add triggers/views/rules/other glue to make the id field hidden and
>  transparent to users in normal uses
>
>In other words, for the common use case of a table that has a field that
>can contain a relatively limited number of values, provide an easy means
>to normalize those values out into a separate table and allow
>applications to use the text values as if the table was de-normalized.
>
>The reason I cross-posted to hackers was to get an answer to the
>question of how difficult it would be to allow the database to deal with
>a type definition that involves some arbitrary number of variables, as
>shown above in the color example.
>
>Also, are there any external hooks for DDL? If there were then it should
>be possible to add support for an enum type that creates the required
>tables, views/rules, etc without modifying the backend.

Your question assumes an implementation.

My thought for enums instead was that it might be nice to provide support for dynamically created input/output functions for an enum type (written in, say, plperl or plpgsql). I have no idea how feasible this is either, but it could be quite nice.

cheers

andrew
"Jim C. Nasby" <decibel@decibel.org> writes:
> ... what I was wondering about is creating a
> 'type' that is a rollup for:

> - create parent table with int id field and text and indexes
> - add RI to base table
> - add triggers/views/rules/other glue to make the id field hidden and
>   transparent to users in normal uses

Given the difficulties we've had with SERIAL columns, this seems much less than trivial :-(. I'd be interested to see a good solution --- I suspect it needs one or two ideas we haven't had yet.

In the meantime, I agree with Andrew's reply that the best stopgap is to invent a bespoke datatype for each required ENUM set, with input and output functions that have the list of values hardwired in.

regards, tom lane
Tom Lane wrote:

>"Jim C. Nasby" <decibel@decibel.org> writes:
>>... what I was wondering about is creating a
>>'type' that is a rollup for:
>>
>>- create parent table with int id field and text and indexes
>>- add RI to base table
>>- add triggers/views/rules/other glue to make the id field hidden and
>>  transparent to users in normal uses
>
>Given the difficulties we've had with SERIAL columns, this seems much
>less than trivial :-(. I'd be interested to see a good solution ---
>I suspect it needs one or two ideas we haven't had yet.
>
>In the meantime, I agree with Andrew's reply that the best stopgap is to
>invent a bespoke datatype for each required ENUM set, with input and
>output functions that have the list of values hardwired in.

:-) Can you expand a bit on how it might work? It's not totally clear to me :-)

Can we use an incomplete type as a parameter for anything except a C function?

Maybe we could do it with a single C function that would retrieve the values from the catalog (extra col in pg_type?) and build (and cache) the translation tables.

This could be an excellent intern project, BTW.

cheers

andrew
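
To make the "retrieve the values from the catalog and build (and cache) the translation tables" idea a little more concrete, here is a toy model in plain Python. It is only a sketch of the data flow, not of how a C function in the backend would look. FAKE_CATALOG stands in for whatever catalog storage would hold the labels (the extra pg_type column is only a suggestion in the message above), and the names translation_tables, enum_in and enum_out are hypothetical.

```python
from functools import lru_cache

# Hypothetical stand-in for the catalog: type name -> ordered labels.
FAKE_CATALOG = {
    "color": ("red", "green", "blue"),
}


@lru_cache(maxsize=None)
def translation_tables(type_name):
    """Build the label->ordinal and ordinal->label tables once per type."""
    labels = FAKE_CATALOG[type_name]
    to_ordinal = {label: i for i, label in enumerate(labels)}
    return to_ordinal, labels


def enum_in(type_name, text):
    """Input function: external text form -> internal ordinal."""
    to_ordinal, _ = translation_tables(type_name)
    if text not in to_ordinal:
        raise ValueError(f"invalid input value for enum {type_name}: {text!r}")
    return to_ordinal[text]


def enum_out(type_name, ordinal):
    """Output function: internal ordinal -> external text form."""
    _, labels = translation_tables(type_name)
    if not 0 <= ordinal < len(labels):
        raise ValueError(f"invalid ordinal for enum {type_name}: {ordinal}")
    return labels[ordinal]


print(enum_in("color", "green"))  # 1
print(enum_out("color", 2))       # blue
```

The hardwired variant suggested as a stopgap would be the same pair of functions with the label tuple written directly into the code instead of looked up.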