Thread: Reminder: only 5 days left to submit SoC applications
Students & Professors,

There are only 5 days left to submit your PostgreSQL Google Summer of Code Project:
http://www.postgresql.org/developer/summerofcode.html

If you aren't a student, but know a CS student interested in databases, testing, GUIs, or any other OSS coding, please point them to our SoC page and encourage them to apply right away!

If you are a student, and you've been trying to perfect your application, please go ahead and submit it ... we can't help you if you miss the deadline, but we can help you fix an incomplete application.

--Josh Berkus
Hi,

If you are looking for a SoC idea, I have listed a couple below. I am not sure how good they are, but I have run into the following limitations, and other people probably have as well.

1. Can user-based priorities be implemented as a summer project? To some extent this has already been implemented in research (http://www.cs.cmu.edu/~bianca/icde04.pdf), so it is definitely possible and scalable.

2. Distributed full-text indexing. I am really not sure how feasible this one is, but TSearch2 is not very scalable (it cannot do multi-terabyte full-text indexes). Maybe some sort of system could be devised to perform full-text searches over multiple systems and merge the ranked results at some root node.

Benjamin

On Mar 20, 2007, at 10:07 AM, Josh Berkus wrote:
> Students & Professors,
>
> There are only 5 days left to submit your PostgreSQL Google Summer
> of Code Project:
> http://www.postgresql.org/developer/summerofcode.html
>
> If you aren't a student, but know a CS student interested in
> databases, testing, GUIs, or any other OSS coding, please point
> them to our SoC page and encourage them to apply right away!
>
> If you are a student, and you've been trying to perfect your
> application, please go ahead and submit it ... we can't help you if
> you miss the deadline, but we can help you fix an incomplete
> application.
>
> --Josh Berkus
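For the "merge the ranked results at some root node" step in idea 2, a minimal sketch of what the root node might do is below. It is not TSearch2 code: the `(score, doc_id)` tuples, the shard lists, and the function name `merge_ranked` are all hypothetical, and it assumes each shard returns its hits already sorted by descending score, with scores that are comparable across shards (which in practice would likely require shared ranking statistics).

```python
import heapq
from itertools import islice


def merge_ranked(shard_results, limit=10):
    """Merge per-shard hit lists into one global top-`limit` list.

    Assumption: every list in `shard_results` holds (score, doc_id)
    tuples already sorted by descending score, and scores from
    different shards are directly comparable.
    """
    # heapq.merge performs a lazy k-way merge; reverse=True tells it the
    # inputs are ordered from largest to smallest.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical results from two shards.
shard_a = [(0.92, "doc-17"), (0.40, "doc-3")]
shard_b = [(0.88, "doc-52"), (0.75, "doc-9"), (0.10, "doc-41")]

top = merge_ranked([shard_a, shard_b], limit=3)
print(top)
```

Because the merge is lazy, the root node only consumes as many hits from each shard as it needs to fill the requested limit.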
me@benjaminarai.com (Benjamin Arai) writes:
> If you are looking for a SoC idea, I have listed a couple below. I
> am not sure how good of an idea they are but I have ran into the
> following limitations and probably other people have as well in the
> past.

Actually, I have a thought on a SoC idea...

The general notion would be to try to come up with some more rational information on setting the default column statistics width.

http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
http://www.postgresql.org/docs/8.2/interactive/planner-stats.html

Now, the default value has long been 10. There are cases where people find they need to set it higher; that has always been pretty trial-and-error.

My suspicion is that:

a) The default should probably be a bit higher than 10

b) Some analysis of stats and schema on an individual table could perhaps provide more specific values for specific columns.

 - Data type might provide guidance; there's little need for >3 values on a binary column, for instance.

 - If there is a NOT NULL UNIQUE constraint on a column, that might suggest > 10 values

 - If the column is known to have 150 unique values, that might suggest SET STATISTICS 150

   It might be worth looking at the *least* frequently occurring values, and setting stats high enough to make it likely that at least one such value would be pulled in...

 - Some kinds of values (dates, floats) are sorta continuous in value; having 10 bins may be pretty OK for such

There are probably some other heuristics to be had; these are just some ideas off the top of my head. Nobody has gone through any sort of real analysis of this; there likely is merit to doing so...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you *not* want to go today? "Confutatis maledictis, flammis acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>)
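The heuristics in the message above could be prototyped along the following lines. This is only a sketch of the idea, not anything PostgreSQL does: the function name, its parameters, the type names it checks, the `max_target` cap, and the tenfold bump for unique columns are all assumptions chosen for illustration. The only real PostgreSQL syntax involved is the `ALTER TABLE ... ALTER COLUMN ... SET STATISTICS` statement built at the end, and the table and column names there are made up.

```python
def suggest_statistics_target(data_type, n_distinct=None, is_unique=False,
                              default=10, max_target=1000):
    """Suggest a per-column statistics target (hypothetical heuristic)."""
    # A boolean column has at most true, false, and NULL.
    if data_type == "boolean":
        return min(default, 3)
    # Roughly continuous types: the default number of bins may be fine.
    if data_type in ("date", "timestamp", "real", "double precision"):
        return default
    # A known distinct count suggests one slot per value, up to the cap.
    if n_distinct is not None and n_distinct > 0:
        return max(1, min(n_distinct, max_target))
    # NOT NULL UNIQUE suggests more than the default; 10x is arbitrary.
    if is_unique:
        return min(default * 10, max_target)
    return default


def set_statistics_sql(table, column, target):
    """Build the ALTER TABLE statement for a suggested target."""
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET STATISTICS {target};"


# Hypothetical column with 150 known distinct values.
target = suggest_statistics_target("integer", n_distinct=150)
print(set_statistics_sql("orders", "status_code", target))
```

A real study would replace the fixed rules with measurements, for example checking how often the least frequent values actually land in the sample at each candidate target.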