Reminder: only 5 days left to submit SoC applications
Students & Professors,
There are only 5 days left to submit your PostgreSQL Google Summer of
Code Project:
http://www.postgresql.org/developer/summerofcode.html
If you aren't a student, but know a CS student interested in databases,
testing, GUIs, or any other OSS coding, please point them to our SoC
page and encourage them to apply right away!
If you are a student, and you've been trying to perfect your
application, please go ahead and submit it ... we can't help you if you
miss the deadline, but we can help you fix an incomplete application.
--Josh Berkus
Hi,
If you are looking for a SoC idea, I have listed a couple below. I
am not sure how good these ideas are, but I have run into the
following limitations, and other people probably have as well.
1. Can user-based priorities be implemented as a summer project? To
some extent this has already been implemented in research
(http://www.cs.cmu.edu/~bianca/icde04.pdf), so it is definitely
possible and scalable.
2. Distributed full-text indexing. I am really not sure how feasible
this one is, but TSearch2 is not very scalable (it cannot do
multi-terabyte full-text indexes). Maybe some sort of system could be
devised to perform full-text searches over multiple systems and merge
the ranked results at some root node.
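A minimal sketch of the merge step in idea 2, assuming each system returns its hits already sorted by descending rank and that ranks are comparable across systems (in practice that would need shared ranking statistics). The function name, the `(rank, doc_id)` tuple layout, and the sample data are hypothetical and are not part of TSearch2:

```python
import heapq
from itertools import islice


def merge_ranked(shard_results, limit=10):
    """Merge per-shard hit lists into one list of the top `limit` hits.

    Each element of `shard_results` is a list of (rank, doc_id) tuples
    sorted by descending rank. Assumption: ranks from different shards
    are directly comparable.
    """
    # heapq.merge streams the inputs lazily; reverse=True tells it the
    # inputs are sorted from largest to smallest rank.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical results from two systems, as seen by the root node.
shard_a = [(0.92, "doc-17"), (0.40, "doc-03")]
shard_b = [(0.88, "doc-51"), (0.75, "doc-22"), (0.10, "doc-09")]
top = merge_ranked([shard_a, shard_b], limit=3)
```

Because the merge is lazy, the root node only needs to pull as many hits from each system as it takes to fill the requested limit.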
Benjamin
On Mar 20, 2007, at 10:07 AM, Josh Berkus wrote:
Students & Professors,
There are only 5 days left to submit your PostgreSQL Google Summer
of Code Project:
http://www.postgresql.org/developer/summerofcode.html
If you aren't a student, but know a CS student interested in
databases, testing, GUIs, or any other OSS coding, please point
them to our SoC page and encourage them to apply right away!
If you are a student, and you've been trying to perfect your
application, please go ahead and submit it ... we can't help you if
you miss the deadline, but we can help you fix an incomplete
application.
--Josh Berkus
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
me@benjaminarai.com (Benjamin Arai) writes:
If you are looking for a SoC idea, I have listed a couple below. I
am not sure how good these ideas are, but I have run into the
following limitations, and other people probably have as well.
Actually, I have a thought on a SoC idea...
The general notion would be to try to come up with some more rational
guidance on setting the default column statistics target.
http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
http://www.postgresql.org/docs/8.2/interactive/planner-stats.html
Now, the default value has long been 10. There are cases where people
find they need to set it higher; that has always been pretty
trial-and-error.
My suspicion is that:
a) The default should probably be a bit higher than 10
b) Some analysis of stats and schema on an individual table could
perhaps provide more specific values for specific columns.
- Data type might provide guidance; there's little need for >3 values on
a binary column, for instance.
- If there is a NOT NULL UNIQUE constraint on a column, that might
suggest > 10 values
- If the column is known to have 150 unique values, that might
suggest SET STATISTICS 150
It might be worth looking at the *least* frequently occurring
values, and setting stats high enough to make it likely that at least
one such value would be pulled in...
- Some kinds of values (dates, floats) are sorta continuous in value;
having 10 bins may be pretty OK for such
There are probably some other heuristics to be had; these are just some
ideas off the top of my head.
Nobody has gone through any sort of real analysis of this; there
likely is merit to doing so...
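To make the heuristics above concrete, here is a rough sketch of how they might be combined. Everything in it is an assumption for illustration: the function names, the order in which the rules are checked, the 10x multiplier for unique columns, and the `max_target` cap (check the documented limit for your version). Only the `ALTER TABLE ... ALTER COLUMN ... SET STATISTICS` statement is actual PostgreSQL syntax.

```python
def suggest_statistics_target(data_type, n_distinct=None, is_unique=False,
                              default=10, max_target=1000):
    """Suggest a per-column statistics target (hypothetical heuristic)."""
    if data_type == "boolean":
        # true, false and NULL: little need for more than 3 values.
        return 3
    if is_unique:
        # A NOT NULL UNIQUE column might want more than the default;
        # the factor of 10 is an arbitrary placeholder.
        return min(default * 10, max_target)
    if n_distinct is not None:
        # A known number of distinct values suggests a matching target.
        return max(1, min(n_distinct, max_target))
    # Roughly continuous types (dates, floats) and everything else
    # fall through to the default number of bins.
    return default


def alter_statement(table, column, target):
    """Build the SQL that applies a target; identifiers are not quoted here."""
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET STATISTICS {target};"


# Hypothetical table and column names.
sql = alter_statement("orders", "status",
                      suggest_statistics_target("text", n_distinct=150))
```

After changing a column's target, ANALYZE has to be run on the table before the planner sees the new statistics.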
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you *not* want to go today? "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>