Thread: TSearch vs. Homebrew
http://www.symfony-project.com/askeet/21

How does this dead simple approach compare to TSearch performance / scaling wise?

--
Regards, Hannes Dorbath
On Tue, 27 Jun 2006, Hannes Dorbath wrote:

> http://www.symfony-project.com/askeet/21
>
> How does this dead simple approach compare to TSearch performance / scaling wise?

You miss the main point of tsearch2: full integration with the database, i.e., full access to metadata, ACID, and so on. Lucene has none of these features, so it can use some well-known optimizations and therefore scales better. If you don't need ACID or metadata access, why do you need a database at all?

Regards, Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On 27.06.2006 13:31, Oleg Bartunov wrote:

> You miss the main point of tsearch2: full integration with the database, i.e., full access to metadata, ACID, and so on. Lucene has none of these features, so it can use some well-known optimizations and therefore scales better. If you don't need ACID or metadata access, why do you need a database at all?

Yes, I know the benefits of using TSearch :) (I'm using it on many projects.) I just found that article and wondered how well this simple approach might scale. Sorry for wasting your time ;)

--
Regards, Hannes Dorbath
On Tue, 27 Jun 2006, Hannes Dorbath wrote:

> Yes, I know the benefits of using TSearch :) (I'm using it on many projects.) I just found that article and wondered how well this simple approach might scale. Sorry for wasting your time ;)

Sorry, I was a bit off-topic. Lucene scales like any engine based on an inverted index. In PostgreSQL 8.2, tsearch2 also has inverted index support, but we follow the relational approach and cannot provide the whole set of optimizations that file-based engines can.

Regards, Oleg
Oleg Bartunov wrote:

> Sorry, I was a bit off-topic. Lucene scales as any inverted index based engine. In 8.2 tsearch2 also has inverted index support, but we obey relational approach and couldn't provide a whole set of optimization, which file based engines could provide.

If you read further down the article, you will see that the author does not actually seem to be using Lucene. Instead he sets up his own text indexing, i.e., identifying words, stemming them, and building a table that records which words appear in which record. Basically he seems to have re-implemented tsearch2 in a mixture of PHP and MySQL. I can't imagine how well (or badly...) that must perform for a large amount of data.

The comments at the end are amusing. One fellow is quite touching in his naivety, wondering how much effort it would take to turn the framework as described into an open source competitor for Google.

My best guess as an answer to the original question is that this approach would not scale very well at all, and certainly not as well as tsearch2 (even though tsearch2 doesn't scale quite as well as one might hope either). For that matter, it's not all that simple either: it seems to be of a similar order of complexity to tsearch2. However, my performance estimate is not founded on any actual experience, so I could be wrong.

Tim

--
-----------------------------------------------
Tim Allen tim@proximity.com.au
Proximity Pty Ltd http://www.proximity.com.au/
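
For illustration, the kind of homebrew indexing discussed in this thread (split text into words, stem them, record which words appear in which record) might be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not the article's PHP/MySQL code: the names `stem`, `tokenize`, `build_index`, and `search`, the suffix list, and the sample records are all hypothetical, and an in-memory dict stands in for the word table.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive stemmer: strips a few common English
# suffixes. A real setup would use a proper stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text, split it into alphanumeric words, stem each one."""
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(records):
    """Build the word table: stemmed word -> set of record ids containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in tokenize(text):
            index[word].add(record_id)
    return index


def search(index, query):
    """Return ids of records that contain every (stemmed) query word."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample data, for demonstration only.
records = {
    1: "Scaling full text search in PostgreSQL",
    2: "Searching text with a homebrew word table",
    3: "Stemming words before indexing",
}
index = build_index(records)
print(sorted(search(index, "text search")))
```

In the setup described in the thread, the word table would live in the database and every lookup would be a query against it, so this sketch says nothing about how either approach performs on a large data set.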