TSearch vs. Homebrew
http://www.symfony-project.com/askeet/21
How does this dead simple approach compare to TSearch performance /
scaling wise?
--
Regards,
Hannes Dorbath
On Tue, 27 Jun 2006, Hannes Dorbath wrote:
> http://www.symfony-project.com/askeet/21
> How does this dead simple approach compare to TSearch performance / scaling
> wise?
You miss the main point of tsearch2: full integration with the database, i.e.,
full access to metadata, ACID, and so on. Lucene has none of these features, so
it can use some well-known optimizations and therefore scales better. If you
don't need ACID or metadata access, why do you need a database at all?
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On 27.06.2006 13:31, Oleg Bartunov wrote:
> You miss the main point of tsearch2: full integration with the database, i.e.,
> full access to metadata, ACID, and so on. Lucene has none of these features, so
> it can use some well-known optimizations and therefore scales better. If you
> don't need ACID or metadata access, why do you need a database at all?
Yes, I know the benefits of using TSearch :) (I'm using it on many
projects) I just found that article and wondered how well this simple
approach might scale. Sorry for wasting your time ;)
--
Regards,
Hannes Dorbath
On Tue, 27 Jun 2006, Hannes Dorbath wrote:
> Yes, I know the benefits of using TSearch :) (I'm using it on many projects)
> I just found that article and wondered how well this simple approach might
> scale. Sorry for wasting your time ;)
Sorry, I was a bit off-topic. Lucene scales like any engine based on an
inverted index. In 8.2, tsearch2 also has inverted index support, but we
follow the relational approach and cannot provide the whole set of
optimizations that file-based engines can.
Regards,
Oleg
Oleg Bartunov wrote:
> Sorry, I was a bit off-topic. Lucene scales like any engine based on an
> inverted index. In 8.2, tsearch2 also has inverted index support, but we
> follow the relational approach and cannot provide the whole set of
> optimizations that file-based engines can.
If you read further down the article, you see that the fellow is not
actually using Lucene. Instead he is setting up his own text indexing,
i.e. identifying words, stemming them, and building a table which
records which words appear in which record. Basically he seems to
have re-implemented tsearch2 in a mixture of PHP and MySQL. I can't
imagine how well (or badly...) that must perform for a large amount of
data. The comments at the end are amusing, one fellow quite touching in
his naivety, wondering how much effort it would be to turn the framework
as described into an open source competitor for Google.
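For anyone who has not followed the link, the homebrew scheme described
above (split text into words, stem them, record which words occur in which
record) can be sketched roughly as below. This is only an illustration in
Python, not the article's PHP/MySQL code: the names stem, tokenize and
WordIndex are hypothetical, the stemmer is deliberately naive, and an
in-memory dict stands in for the word table.

```python
import re
from collections import defaultdict


def stem(word):
    # Hypothetical, deliberately naive suffix stripper; a real stemmer
    # (Porter, Snowball, ...) handles far more cases.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, keep runs of ASCII letters, stem each word.
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


class WordIndex:
    """In-memory stand-in for the 'which word appears in which record' table."""

    def __init__(self):
        self.postings = defaultdict(set)  # stemmed word -> set of record ids

    def add(self, record_id, text):
        for word in tokenize(text):
            self.postings[word].add(record_id)

    def search(self, query):
        # Return ids of records that contain every query word (AND semantics).
        words = tokenize(query)
        if not words:
            return set()
        result = None
        for word in words:
            ids = self.postings.get(word, set())
            result = set(ids) if result is None else result & ids
        return result


index = WordIndex()
index.add(1, "How do I index questions in PostgreSQL?")
index.add(2, "Indexing text with stemming")
index.add(3, "Scaling a search engine")
print(sorted(index.search("indexing")))  # prints [1, 2]
```

Every add() touches one posting set per distinct word, and every search()
intersects one set per query word. With a SQL table in place of the dict,
that becomes one row per (word, record) pair and a join or intersection
per query term, which is where the scaling worry comes from.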
My best guess as an answer to the original question is that this
approach would not scale very well at all, and certainly not as well as
tsearch2 (even though tsearch2 doesn't scale quite as well as one might
hope either). And for that matter, it's not all that simple - it seems
to be of a similar order of complexity to tsearch2. However, my
performance estimate is completely unfounded in any actual experience,
so I could be wrong.
Tim
--
-----------------------------------------------
Tim Allen tim@proximity.com.au
Proximity Pty Ltd http://www.proximity.com.au/