Thread: Reminder: only 5 days left to submit SoC applications

Reminder: only 5 days left to submit SoC applications

From
Josh Berkus
Date:
Students & Professors,

There are only 5 days left to submit your PostgreSQL Google Summer of
Code Project:
http://www.postgresql.org/developer/summerofcode.html

If you aren't a student, but know a CS student interested in databases,
testing, GUIs, or any other OSS coding, please point them to our SoC
page and encourage them to apply right away!

If you are a student, and you've been trying to perfect your
application, please go ahead and submit it ... we can't help you if you
miss the deadline, but we can help you fix an incomplete application.

--Josh Berkus

SoC Ideas for people looking for projects

From
Benjamin Arai
Date:
Hi,

If you are looking for a SoC idea, I have listed a couple below.  I
am not sure how good of an idea they are but I have ran into the
following limitations and probably other people have as well in the
past.

1. Can user based priorities be implemented as a summer project?  To
some extent it has already been implemented in research (http://
www.cs.cmu.edu/~bianca/icde04.pdf), so it is definitely possible and
scalable.

2. Distributed full-text indexing.  This one I am really not sure how
possible it is but  (TSearch2) very scalable (cannot do multi
terabyte fulltext indexes).  Maybe some sort system could be devised
to perform fulltext searches over multiple systems and merge the
ranked results at some root node.

Benjamin

On Mar 20, 2007, at 10:07 AM, Josh Berkus wrote:

> Students & Professors,
>
> There are only 5 days left to submit your PostgreSQL Google Summer
> of Code Project:
> http://www.postgresql.org/developer/summerofcode.html
>
> If you aren't a student, but know a CS student interested in
> databases, testing, GUIs, or any other OSS coding, please point
> them to our SoC page and encourage them to apply right away!
>
> If you are a student, and you've been trying to perfect your
> application, please go ahead and submit it ... we can't help you if
> you miss the deadline, but we can help you fix an incomplete
> application.
>
> --Josh Berkus
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


SoC Ideas for people looking for projects

From
Benjamin Arai
Date:
Hi,

If you are looking for a SoC idea, I have listed a couple below.  I
am not sure how good of an idea they are but I have ran into the
following limitations and probably other people have as well in the
past.

1. Can user based priorities be implemented as a summer project?  To
some extent it has already been implemented in research (http://
www.cs.cmu.edu/~bianca/icde04.pdf), so it is definitely possible and
scalable.

2. Distributed full-text indexing.  This one I am really not sure how
possible it is but  (TSearch2) very scalable (cannot do multi
terabyte fulltext indexes).  Maybe some sort system could be devised
to perform fulltext searches over multiple systems and merge the
ranked results at some root node.

Benjamin

On Mar 20, 2007, at 10:07 AM, Josh Berkus wrote:

> Students & Professors,
>
> There are only 5 days left to submit your PostgreSQL Google Summer
> of Code Project:
> http://www.postgresql.org/developer/summerofcode.html
>
> If you aren't a student, but know a CS student interested in
> databases, testing, GUIs, or any other OSS coding, please point
> them to our SoC page and encourage them to apply right away!
>
> If you are a student, and you've been trying to perfect your
> application, please go ahead and submit it ... we can't help you if
> you miss the deadline, but we can help you fix an incomplete
> application.
>
> --Josh Berkus
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


Re: SoC Ideas for people looking for projects

From
Chris Browne
Date:
me@benjaminarai.com (Benjamin Arai) writes:
> If you are looking for a SoC idea, I have listed a couple below.  I
> am not sure how good of an idea they are but I have ran into the
> following limitations and probably other people have as well in the
> past.

Actually, I have a thought on a SoC idea...

The general notion would be to try to come up with some more rational
information on setting the default column statistics width.

http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
http://www.postgresql.org/docs/8.2/interactive/planner-stats.html

Now, the default value has long been 10.  There are cases where people
find they need to set it higher; that has always been pretty
trial-and-error.

My suspicion is that:

a) The default should probably be a bit higher than 10

b) Some analysis of stats and schema on an individual table could
perhaps provide more specific values for specific columns.

 - Data type might provide guidance; there's little need for >3 values on
   a binary column, for instance.

 - If there is a NOT NULL UNIQUE constraint on a column, that might
   suggest > 10 values

 - If the column is known to have 150 unique values, that might
   suggest SET STATISTICS 150

   It might be worth looking at the *least* frequently occuring
   values, and set stats high enough to make it likely that at least
   one such value would be pulled in...

 - Some kinds of values (dates, floats) are sorta continuous in value;
   having 10 bins may be pretty OK for such

There are probably some other heuristics to be had; this is just some
ideas off the top of my head.

Nobody has gone through any sort of real analysis of this; there
likely is merit to doing so...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you  *not* want to go today?  "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>