Reminder: only 5 days left to submit SoC applications

Started by Josh Berkus almost 19 years ago, 4 messages
#1 Josh Berkus
josh@agliodbs.com

Students & Professors,

There are only 5 days left to submit your PostgreSQL Google Summer of
Code Project:
http://www.postgresql.org/developer/summerofcode.html

If you aren't a student, but know a CS student interested in databases,
testing, GUIs, or any other OSS coding, please point them to our SoC
page and encourage them to apply right away!

If you are a student, and you've been trying to perfect your
application, please go ahead and submit it ... we can't help you if you
miss the deadline, but we can help you fix an incomplete application.

--Josh Berkus

#2 Benjamin Arai
me@benjaminarai.com
In reply to: Josh Berkus (#1)
SoC Ideas for people looking for projects

Hi,

If you are looking for a SoC idea, I have listed a couple below. I
am not sure how good these ideas are, but I have run into the
following limitations, and other people probably have as well.

1. Can user-based priorities be implemented as a summer project? To
some extent this has already been implemented in research
(http://www.cs.cmu.edu/~bianca/icde04.pdf), so it is definitely
possible and scalable.

2. Distributed full-text indexing. I am really not sure how feasible
this one is, but TSearch2 is not very scalable (it cannot handle
multi-terabyte full-text indexes). Maybe some sort of system could be
devised to perform full-text searches over multiple systems and merge
the ranked results at some root node.
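
To make the "merge the ranked results at some root node" step in idea 2 concrete, here is a minimal Python sketch. It is not TSearch2 code: the function name merge_ranked, the (rank, doc_id) tuple layout, and the sample shards are all hypothetical. It also assumes every node returns its hits already sorted by descending rank and that ranks from different nodes are on a comparable scale, which a real system would have to guarantee somehow.

```python
import heapq
from itertools import islice


def merge_ranked(shard_results, limit):
    """Merge per-shard hit lists into one globally ranked top-N list.

    Each element of shard_results is a list of (rank, doc_id) tuples that
    the shard has already sorted by descending rank (an assumption).
    """
    # heapq.merge streams the inputs lazily, so the root node never has to
    # hold every shard's full result list in sorted form at once.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical results from two nodes, each sorted by descending rank.
shard_a = [(0.92, "doc-17"), (0.40, "doc-03")]
shard_b = [(0.88, "doc-51"), (0.75, "doc-08"), (0.10, "doc-22")]

top = merge_ranked([shard_a, shard_b], limit=3)
print(top)  # [(0.92, 'doc-17'), (0.88, 'doc-51'), (0.75, 'doc-08')]
```

The merge itself is the easy part; the open question for a project like this would be how each node computes ranks so that they can be compared across nodes.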

Benjamin

On Mar 20, 2007, at 10:07 AM, Josh Berkus wrote:

Students & Professors,

There are only 5 days left to submit your PostgreSQL Google Summer
of Code Project:
http://www.postgresql.org/developer/summerofcode.html

If you aren't a student, but know a CS student interested in
databases, testing, GUIs, or any other OSS coding, please point
them to our SoC page and encourage them to apply right away!

If you are a student, and you've been trying to perfect your
application, please go ahead and submit it ... we can't help you if
you miss the deadline, but we can help you fix an incomplete
application.

--Josh Berkus


#4 Chris Browne
cbbrowne@acm.org
In reply to: Josh Berkus (#1)
Re: SoC Ideas for people looking for projects

me@benjaminarai.com (Benjamin Arai) writes:

If you are looking for a SoC idea, I have listed a couple below. I
am not sure how good these ideas are, but I have run into the
following limitations, and other people probably have as well.

Actually, I have a thought on a SoC idea...

The general notion would be to try to come up with some more rational
guidance on setting the default column statistics target.

http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
http://www.postgresql.org/docs/8.2/interactive/planner-stats.html

Now, the default value has long been 10. There are cases where people
find they need to set it higher; that has always been pretty
trial-and-error.

My suspicion is that:

a) The default should probably be a bit higher than 10

b) Some analysis of stats and schema on an individual table could
perhaps provide more specific values for specific columns.

- Data type might provide guidance; there's little need for >3 values on
a binary column, for instance.

- If there is a NOT NULL UNIQUE constraint on a column, that might
suggest > 10 values

- If the column is known to have 150 unique values, that might
suggest SET STATISTICS 150

It might be worth looking at the *least* frequently occurring
values, and setting stats high enough to make it likely that at least
one such value would be pulled in...

- Some kinds of values (dates, floats) are sorta continuous in value;
having 10 bins may be pretty OK for such
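
The heuristics above could be prototyped roughly as follows. This is only a sketch of the idea, not anything PostgreSQL does: the function names, the set of "continuous" type names, the 10x bump for unique columns, and the ceiling of 1000 are all illustrative assumptions (check the statistics target limit for your server version). The only real PostgreSQL syntax used is ALTER TABLE ... ALTER COLUMN ... SET STATISTICS.

```python
# Type names treated as "sorta continuous"; this set is an assumption.
CONTINUOUS_TYPES = {"date", "timestamp", "real", "double precision", "numeric"}


def suggest_statistics_target(data_type, n_distinct=None, is_unique=False,
                              default=10, ceiling=1000):
    """Return a suggested per-column statistics target (hypothetical rules)."""
    if data_type == "boolean":
        # true, false and NULL: little need for more than 3 values.
        return 3
    if data_type in CONTINUOUS_TYPES:
        # Continuous-ish values: the default number of bins may be fine.
        return default
    if n_distinct is not None:
        # Known number of distinct values, e.g. 150 -> SET STATISTICS 150.
        return max(1, min(n_distinct, ceiling))
    if is_unique:
        # NOT NULL UNIQUE suggests more than the default; 10x is arbitrary.
        return min(default * 10, ceiling)
    return default


def set_statistics_sql(table, column, target):
    """Build the ALTER TABLE statement (identifiers are not quoted here)."""
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET STATISTICS {target};"


target = suggest_statistics_target("text", n_distinct=150)
print(set_statistics_sql("orders", "status", target))
# ALTER TABLE orders ALTER COLUMN status SET STATISTICS 150;
```

The order in which the rules are checked is itself a guess; working out which rule should win, and what the numbers should be, is exactly the analysis the project would need to do.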

There are probably some other heuristics to be had; these are just some
ideas off the top of my head.

Nobody has gone through any sort of real analysis of this; there
likely is merit to doing so...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you *not* want to go today? "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>