Thread: Enticing interns to PostgreSQL

Enticing interns to PostgreSQL

From
"Jim C. Nasby"
Date:
The email below about FreeBSD's involvement in Google's Summer of Code
got me thinking; would there be value in trying to attract college
students to working on either PostgreSQL development, or using
PostgreSQL in projects? Even though we missed getting in on the summer
of code this year, ISTM that we could try targeting colleges,
professors, and students directly. When it comes to development, I'm
sure there's any number of TODO items that would make great coursework,
for all different levels of students. As for using PostgreSQL, perhaps
we could get database classes together with projects that could use
help.

Thoughts?

----- Forwarded message from Murray Stokely <murray@freebsdmall.com> -----

The FreeBSD Project is pleased to announce its participation in the
Google "Summer of Code" program designed to introduce students to open
source software development.

The FreeBSD Project received over 350 applications, amongst which 18
projects have been selected for funding.

Unfortunately, due to the limited number of spots available, we were
unable to fund many first rate applications.  However, we encourage
students to work together with us all year round.  The FreeBSD Project
is always willing to mentor students who want to learn more about operating
system development through our normal community mailing lists and
development forums.  Contributing to an open source software project
is a valuable component of a computer science education and great
preparation for a career in software development.

More information about the student projects is available from the
FreeBSD Summer of Code Wiki here:

  http://wikitest.freebsd.org/moin.cgi/SummerOfCode2005

The Wiki will soon be updated with information about downloading the
work in progress with CVSup.

We'd like to close by thanking Google for their generosity and
congratulating the 18 talented students below.

- The FreeBSD Summer of Code Mentors

--

Student: Anders Persson <anders@cs.ucla.edu>
Summary: FreeBSD userland/kernel interface cleanups
Mentor:  Brooks Davis <brooks@FreeBSD.org>

Student: Andrew Turner <andrew@fubar.geek.nz>
Summary: Integrate BSD Installer
Mentor:  re@, ru@, jhb@

Student: Brian Wilson <polytopes@gmail.com>
Summary: UFS Journalling
Mentor:  scottl@

Student: Chris Jones <chris.jones@ualberta.ca>
Summary: Gvinum 'move', 'rename', etc.
Mentor:  le@FreeBSD.org, phk@FreeBSD.org

Student: Christoph Mathys <cmathys@bluewin.ch>
Summary: Rewriting CVSup in C, the Csup project
Mentor:  mux@FreeBSD.org

Student: Csaba Henk <csaba.henk@creo.hu>
Summary: SSH based networking filesystem
Mentor:  scottl@FreeBSD.org

Student: Dario Freni <saturnero@freesbie.org>
Summary: FreeSBIE integration
Mentor:  murray@FreeBSD / re@FreeBSD.org

Student: Emiliano Mennucci <s223560@studenti.ing.unipi.it>
Summary: pluggable disk scheduler
Mentor:  luigi@FreeBSD.org

Student: Ivan Voras <ivoras@gmail.com>
Summary: GEOM Journaling Layer (gjournal)
Mentor:  phk@FreeBSD.org, pjd@FreeBSD.org

Student: Jason Young <dintsoft@gmail.com>
Summary: powerd
Mentor:  bruno@FreeBSD.org, njl@FreeBSD.org

Student: Michael Bushkov <bushman@rsu.ru>
Summary: nsswitch / caching daemon
Mentor:  brooks@FreeBSD.org, nectar@FreeBSD.org

Student: Paolo Pisati <p.pisati@oltrelinux.com>
Summary: improve libalias
Mentor:  luigi@FreeBSD.org

Student: R. Tyler Ballance <tyler@tamu.edu>
Summary: Implement MacOS launchd(8) for FreeBSD
Mentor:  murray@FreeBSD.org

Student: RuGang Xu <rugang@gmail.com>
Summary: K kernel meta-language project
Mentor:  gnn@FreeBSD.org, phk@FreeBSD.org

Student: Samy Al Bahra <samy@kerneled.org>
Summary: MAC
Mentor:  rwatson@FreeBSD.org

Student: Victor Cruceru <victor.cruceru@gmail.com>
Summary: SNMP monitoring
Mentor:  harti@FreeBSD.org

Student: Yanjun Wu <yanjun03@ios.cn>
Summary: SEBSD
Mentor:  rwatson@FreeBSD.org

Student: Emily Boyd <emily@emilyboyd.com>
Summary: website improvements
Mentor:  murray@FreeBSD.org


----- End forwarded message -----

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Enticing interns to PostgreSQL

From
"Christopher A. Watford"
Date:
On 7/19/05, Jim C. Nasby <decibel@decibel.org> wrote:
> The email below about FreeBSD's involvement in Google's Summer of Code
> got me thinking; would there be value in trying to attract college
> students to working on either PostgreSQL development, or using
> PostgreSQL in projects? Even though we missed getting in on the summer
> of code this year, ISTM that we could try targeting colleges,
> professors, and students directly. When it comes to development, I'm
> sure there's any number of TODO items that would make great coursework,
> for all different levels of students. As for using PostgreSQL, perhaps
> we could get database classes together with projects that could use
> help.
>
> Thoughts?

I have been lurking on the list with the intent of finding a project
to undertake while I finish up my BS in CompSci. I will be bored out
of my mind with classes and would benefit greatly from working inside
pgsql. I'd be very interested in any sort of pgsql project where I
could be of help. I have quite a few sites/applications that would
benefit from enhancements to pgsql (2PC+replication, clustering,
master-master) so there is also that motive.

my 2c,

--
Christopher Watford
http://dorm.tunkeymicket.com


Re: Enticing interns to PostgreSQL

From
Jeff Davis
Date:
Jim C. Nasby wrote:
> The email below about FreeBSD's involvement in Google's Summer of Code
> got me thinking; would there be value in trying to attract college
> students to working on either PostgreSQL development, or using
> PostgreSQL in projects? Even though we missed getting in on the summer
> of code this year, ISTM that we could try targeting colleges,
> professors, and students directly. When it comes to development, I'm
> sure there's any number of TODO items that would make great coursework,
> for all different levels of students. As for using PostgreSQL, perhaps
> we could get database classes together with projects that could use
> help.
>
> Thoughts?
>

I am planning to take a database course at a major university this fall.
If you have any materials that might help I'd be happy to see if the
professor is interested. I don't know what the course uses currently,
but I expect there will be opportunities to mention PostgreSQL.

Also, there's a PostgreSQL project that I'm planning on working on (not
even much work left, really), so I'll see if the professor shows any
interest in my project.

Regards,
    Jeff Davis

Re: Enticing interns to PostgreSQL

From
"Jim C. Nasby"
Date:
On Wed, Jul 20, 2005 at 03:43:04PM -0700, Jeff Davis wrote:
> Jim C. Nasby wrote:
> > The email below about FreeBSD's involvement in Google's Summer of Code
> > got me thinking; would there be value in trying to attract college
> > students to working on either PostgreSQL development, or using
> > PostgreSQL in projects? Even though we missed getting in on the summer
> > of code this year, ISTM that we could try targeting colleges,
> > professors, and students directly. When it comes to development, I'm
> > sure there's any number of TODO items that would make great coursework,
> > for all different levels of students. As for using PostgreSQL, perhaps
> > we could get database classes together with projects that could use
> > help.
> >
> > Thoughts?
> >
>
> I am planning to take a database course at a major university this fall.
> If you have any materials that might help I'd be happy to see if the
> professor is interested. I don't know what the course uses currently,
> but I expect there will be opportunities to mention PostgreSQL.
>
> Also, there's a PostgreSQL project that I'm planning on working on (not
> even much work left, really), so I'll see if the professor shows any
> interest in my project.

Well, since the typical belief is that MySQL is a good choice for
things, http://sql-info.de/mysql/gotchas.html is probably a good start.
--
Jim C. Nasby, Database Consultant               decibel@decibel.org

ENUM type

From
"Jim C. Nasby"
Date:
On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
> Chris Travers wrote:
> >>
> > How hard would it be to automatically create enum_ tables in the back
> > ground to emulate MySQL's enum type?  Sort of like we do with SERIAL
> > datatypes...  Part of the problem is that MySQL's enum type is so
> > braindead from a database design perspective that most of us would not
> > be interested in using it.  Emulating an int foreign key for another
> > created table might make it ok, though.
> >
>
> The thing that occurs to me is that if you really want the enum type in
> PostgreSQL (assuming that there exists a real need), a PostgreSQL person
> would create their own type. Or, if not, just create a wrapper function
> that handles the input/output display and call it explicitly.

OK, but compare the amount of work you just described to the simplicity
of using an enum. Enum is much easier and simpler for a developer. Of
course in most cases the MySQL way of doing it is (as has been
mentioned) stupid, but done in the normal, normalized way it would
remove a fair amount of additional work on the part of a developer:

- no need to manually define a separate table
- no need to define RI
- no need to manually map between ID and real values (though of course
  we should make it easy to get the ID too)
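
For comparison, here is a minimal sketch of the manual work that list refers to. The names (color_lookup, widget, color_id) are made up for illustration and do not come from the thread.

```sql
-- Hypothetical names throughout; this is the hand-written version
-- of what an automatic enum would have to generate.

-- 1. The separate lookup table.
CREATE TABLE color_lookup (
    color_id   serial PRIMARY KEY,
    color_name text NOT NULL UNIQUE
);

INSERT INTO color_lookup (color_name) VALUES ('red');
INSERT INTO color_lookup (color_name) VALUES ('green');
INSERT INTO color_lookup (color_name) VALUES ('blue');

-- 2. Referential integrity from the base table.
CREATE TABLE widget (
    widget_id serial PRIMARY KEY,
    color_id  integer NOT NULL REFERENCES color_lookup (color_id)
);

-- 3. Manual mapping from the text value to the ID on every write...
INSERT INTO widget (color_id)
SELECT color_id FROM color_lookup WHERE color_name = 'green';

-- ...and from the ID back to the text value on every read.
SELECT w.widget_id, c.color_name
FROM widget w
JOIN color_lookup c ON c.color_id = w.color_id;
```

Every application query that touches the column has to repeat step 3 in some form, which is the overhead an enum-like shorthand would hide.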

> So to me, the need seems very weak. However, if your goal is
> compatibility, I guess we need it. The problem is it's very difficult to
> do in a general way. We'd probably have to do it specifically for enum,
> and have it generate the types automatically on the fly. Someone would
> have to do some interesting things with the parser, too. Right now even
> the varchar() type, for instance, is kind of a hack.
>
> Ultimately to do it in a general way I think we'd need functions that
> return a type that can be used in a table definition. Aside from the
> many problems I don't know about, there are two other problems:
> (1) After the table (or column?) is dropped, we need to drop the type.
> (2) Functions currently don't support variable numbers of arguments, so
> enum still wouldn't be simple. We could do something kinda dumb-looking
> like:
> CREATE TABLE mytable (
>   color ENUM("red,green,blue,orange,purple,yellow");
> );
> And have the hypothetical ENUM function then parse the single argument
> and return a type that could be used by that table.
>
> Is this achievable with a reasonable amount of effort? Is this
> function-returning-a-type a reasonable behavior?
>
> If nothing else it would clean up the clutter of varchar() and the like,
> that currently use the hacked-in catalog entry "atttypmod" or something
> like that.

Hopefully someone on -hackers can shed light on what's required to clean
up the parsing. One thing worth noting though, is that table definition
is a relatively small part of doing a migration. Generally, it's
application code that causes the most issues. Because of this, I think
there would still be a lot of benefit to an enum type that didn't
strictly follow the mysql naming/definition convention. In this case, it
might be much easier to have an enum that doesn't allow you to define
what can go into it at creation time; ie:

CREATE TABLE ...
    blah ENUM NOT NULL ...
...

ALTER TABLE SET ENUM blah ALLOWED VALUES(1, 2, 4);
--
Jim C. Nasby, Database Consultant               decibel@decibel.org

Re: ENUM type

From
Jochem van Dieten
Date:
On 7/26/05, Jim C. Nasby wrote:
> On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
>>
>> Ultimately to do it in a general way I think we'd need functions that
>> return a type that can be used in a table definition. Aside from the
>> many problems I don't know about, there are two other problems:
>> (1) After the table (or column?) is dropped, we need to drop the type.
>> (2) Functions currently don't support variable numbers of arguments, so
>> enum still wouldn't be simple. We could do something kinda dumb-looking
>> like:
>> CREATE TABLE mytable (
>>   color ENUM("red,green,blue,orange,purple,yellow");
>> );
>> And have the hypothetical ENUM function then parse the single argument
>> and return a type that could be used by that table.

Wouldn't the following work already:
CREATE DOMAIN colors AS TEXT CHECK ( VALUE IN ('red', 'green', 'blue',
'orange', 'purple', 'yellow'));

CREATE TABLE mytable ( color COLORS
);


And this has all the advantages of having a single definition for your
domain in one place, while you can reuse the resulting domain in many
tables. I can't remember when I last deployed a PostgreSQL app without
domains for common data like email addresses, phone numbers and ZIP
codes.
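
A short usage sketch of that approach follows. The shirt and paint tables and the us_zip domain are hypothetical, and the ZIP pattern is only an assumption about what such a domain might check.

```sql
-- The domain from above, repeated so the block is self-contained.
CREATE DOMAIN colors AS text
    CHECK (VALUE IN ('red', 'green', 'blue', 'orange', 'purple', 'yellow'));

-- Hypothetical tables sharing the single definition.
CREATE TABLE shirt (shirt_id serial PRIMARY KEY, color colors NOT NULL);
CREATE TABLE paint (paint_id serial PRIMARY KEY, color colors NOT NULL);

INSERT INTO shirt (color) VALUES ('red');      -- accepted
INSERT INTO shirt (color) VALUES ('magenta');  -- rejected by the domain's CHECK

-- The same idea for common data such as ZIP codes (illustrative pattern:
-- five digits, optionally followed by a hyphen and four more digits).
CREATE DOMAIN us_zip AS text
    CHECK (VALUE ~ '^[0-9]{5}(-[0-9]{4})?$');
```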


> In this case, it
> might be much easier to have an enum that doesn't allow you to define
> what can go into it at creation time; ie:
>
> CREATE TABLE ...
>     blah ENUM NOT NULL ...
> ...
>
> ALTER TABLE SET ENUM blah ALLOWED VALUES(1, 2, 4);

What you are proposing is something PostgreSQL already has:
CREATE TABLE ...   blah TEXT NOT NULL ...
...;

ALTER TABLE ... ADD CHECK (blah IN ('1', '2', '4'));


ENUM is a braindead idea implemented because MySQL lacked the
infrastructure to let its users do the right thing. (Let's face it:
what percentage of the use of ENUM in MySQL would simply evaporate if
MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the
infrastructure to allow its users to do the right thing.


Working around ENUMs belongs in a migration guide and maybe in a
migration tool with examples of using a lookup table, a check
constraint and a domain. Working around ENUMs does not belong in the
source.

Jochem


Re: ENUM type

From
"Jim C. Nasby"
Date:
On Wed, Jul 27, 2005 at 12:11:47AM +0200, Jochem van Dieten wrote:
> On 7/26/05, Jim C. Nasby wrote:
> > On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
> >>
> >> Ultimately to do it in a general way I think we'd need functions that
> >> return a type that can be used in a table definition. Aside from the
> >> many problems I don't know about, there are two other problems:
> >> (1) After the table (or column?) is dropped, we need to drop the type.
> >> (2) Functions currently don't support variable numbers of arguments, so
> >> enum still wouldn't be simple. We could do something kinda dumb-looking
> >> like:
> >> CREATE TABLE mytable (
> >>   color ENUM("red,green,blue,orange,purple,yellow");
> >> );
> >> And have the hypothetical ENUM function then parse the single argument
> >> and return a type that could be used by that table.

> ENUM is a braindead idea implemented because MySQL lacked the
> infrastructure to let its users do the right thing. (Let's face it:
> what percentage of the use of ENUM in MySQL would simply evaporate if
> MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the
> infrastructure to allow its users to do the right thing.

Sorry, I should have been more clear. There is the MySQL migration issue
with their braindead enum, but what I was wondering about is creating a
'type' that is a rollup for:

- create parent table with int id field and text and indexes
- add RI to base table
- add triggers/views/rules/other glue to make the id field hidden and transparent to users in normal uses

In other words, for the common use case of a table that has a field that
can contain a relatively limited number of values, provide an easy means
to normalize those values out into a separate table and allow
applications to use the text values as if the table was de-normalized.

The reason I cross-posted to hackers was to get an answer to the
question of how difficult it would be to allow the database to deal with
a type definition that involves some arbitrary number of variables, as
shown above in the color example.

Also, are there any external hooks for DDL? If there were then it should
be possible to add support for an enum type that creates the required
tables, views/rules, etc without modifying the backend.
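
As a rough illustration of the glue such a rollup would need to generate, here is one possible sketch. It assumes PostgreSQL 9.1 or later for INSTEAD OF triggers on views, which did not exist when this thread was written (rules were the available mechanism then). All object names are hypothetical, and only INSERT is handled.

```sql
-- Parent (lookup) table with an int id and the text value.
CREATE TABLE status_lookup (
    status_id   serial PRIMARY KEY,
    status_name text NOT NULL UNIQUE
);

-- Base table with RI to the parent table.
CREATE TABLE ticket_base (
    ticket_id serial PRIMARY KEY,
    status_id integer NOT NULL REFERENCES status_lookup (status_id)
);

-- Applications query the view and see the text value, not the id.
CREATE VIEW ticket AS
SELECT t.ticket_id, s.status_name AS status
FROM ticket_base t
JOIN status_lookup s ON s.status_id = t.status_id;

-- Translate the text value back to an id when inserting through the view.
CREATE FUNCTION ticket_insert() RETURNS trigger AS $$
DECLARE
    v_status_id integer;
BEGIN
    SELECT status_id INTO v_status_id
    FROM status_lookup
    WHERE status_name = NEW.status;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'unknown status: %', NEW.status;
    END IF;

    INSERT INTO ticket_base (status_id)
    VALUES (v_status_id)
    RETURNING ticket_id INTO NEW.ticket_id;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ticket_insert_trg
INSTEAD OF INSERT ON ticket
FOR EACH ROW EXECUTE PROCEDURE ticket_insert();

-- Usage: the application works only with text values.
INSERT INTO status_lookup (status_name) VALUES ('open');
INSERT INTO status_lookup (status_name) VALUES ('closed');
INSERT INTO ticket (status) VALUES ('open');
SELECT * FROM ticket;
```

UPDATE and DELETE through the view would need similar triggers, which gives a sense of how much an automatic version would have to create and later clean up.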
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 


Re: ENUM type

From
Andrew Dunstan
Date:

Jim C. Nasby wrote:

> On Wed, Jul 27, 2005 at 12:11:47AM +0200, Jochem van Dieten wrote:
>> On 7/26/05, Jim C. Nasby wrote:
>>> On Tue, Jul 26, 2005 at 01:09:11PM -0700, Jeff Davis wrote:
>>>> Ultimately to do it in a general way I think we'd need functions that
>>>> return a type that can be used in a table definition. Aside from the
>>>> many problems I don't know about, there are two other problems:
>>>> (1) After the table (or column?) is dropped, we need to drop the type.
>>>> (2) Functions currently don't support variable numbers of arguments, so
>>>> enum still wouldn't be simple. We could do something kinda dumb-looking
>>>> like:
>>>> CREATE TABLE mytable (
>>>>   color ENUM("red,green,blue,orange,purple,yellow");
>>>> );
>>>> And have the hypothetical ENUM function then parse the single argument
>>>> and return a type that could be used by that table.
>
>> ENUM is a braindead idea implemented because MySQL lacked the
>> infrastructure to let its users do the right thing. (Let's face it:
>> what percentage of the use of ENUM in MySQL would simply evaporate if
>> MySQL implemented a proper BOOLEAN datatype?) PostgreSQL has the
>> infrastructure to allow its users to do the right thing.
>
> Sorry, I should have been more clear. There is the MySQL migration issue
> with their braindead enum, but what I was wondering about is creating a
> 'type' that is a rollup for:
>
> - create parent table with int id field and text and indexes
> - add RI to base table
> - add triggers/views/rules/other glue to make the id field hidden and
>   transparent to users in normal uses
>
> In other words, for the common use case of a table that has a field that
> can contain a relatively limited number of values, provide an easy means
> to normalize those values out into a separate table and allow
> applications to use the text values as if the table was de-normalized.
>
> The reason I cross-posted to hackers was to get an answer to the
> question of how difficult it would be to allow the database to deal with
> a type definition that involves some arbitrary number of variables, as
> shown above in the color example.
>
> Also, are there any external hooks for DDL? If there were then it should
> be possible to add support for an enum type that creates the required
> tables, views/rules, etc without modifying the backend.


Your question assumes an implementation. My thought for enums instead 
was that it might be nice to provide support for dynamically created 
input/output functions for an enum type (written in, say, plperl or 
plpgsql). I have no idea how feasible this is either, but it could be 
quite nice.

cheers

andrew



Re: ENUM type

From
Tom Lane
Date:
"Jim C. Nasby" <decibel@decibel.org> writes:
> ... what I was wondering about is creating a
> 'type' that is a rollup for:

> - create parent table with int id field and text and indexes
> - add RI to base table
> - add triggers/views/rules/other glue to make the id field hidden and
>   transparent to users in normal uses

Given the difficulties we've had with SERIAL columns, this seems much
less than trivial :-(.  I'd be interested to see a good solution ---
I suspect it needs one or two ideas we haven't had yet.

In the meantime, I agree with Andrew's reply that the best stopgap is to
invent a bespoke datatype for each required ENUM set, with input and
output functions that have the list of values hardwired in.
        regards, tom lane


Re: ENUM type

From
Andrew Dunstan
Date:

Tom Lane wrote:

> "Jim C. Nasby" <decibel@decibel.org> writes:
>> ... what I was wondering about is creating a
>> 'type' that is a rollup for:
>>
>> - create parent table with int id field and text and indexes
>> - add RI to base table
>> - add triggers/views/rules/other glue to make the id field hidden and
>>   transparent to users in normal uses
>
> Given the difficulties we've had with SERIAL columns, this seems much
> less than trivial :-(.  I'd be interested to see a good solution ---
> I suspect it needs one or two ideas we haven't had yet.
>
> In the meantime, I agree with Andrew's reply that the best stopgap is to
> invent a bespoke datatype for each required ENUM set, with input and
> output functions that have the list of values hardwired in.

:-)

Can you expand a bit on how it might work? It's not totally clear to me 
:-)  Can we use an incomplete type as a parameter for anything except a 
C function? Maybe we could do it with a single C function that would 
retrieve the values from the catalog (extra col in pg_type?) and build 
(and cache) the translation tables.
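
For readers on later releases: PostgreSQL 8.3 and newer, which postdate this thread, ship a native enum type whose labels live in a dedicated catalog (pg_enum) rather than in an extra pg_type column. The sketch below only shows how that later feature is used, not how anything proposed here would have been implemented. The table name is reused from the earlier color example.

```sql
-- Assumes PostgreSQL 8.3 or later.
CREATE TYPE color AS ENUM ('red', 'green', 'blue', 'orange', 'purple', 'yellow');

CREATE TABLE mytable (
    id    serial PRIMARY KEY,
    color color NOT NULL
);

INSERT INTO mytable (color) VALUES ('green');
INSERT INTO mytable (color) VALUES ('red');

-- Values sort in declaration order, not alphabetically,
-- so 'red' comes before 'green' here.
SELECT color FROM mytable ORDER BY color;

-- List every label of the type, in declaration order.
SELECT enum_range(NULL::color);
```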


This could be an excellent intern project, BTW.

cheers

andrew