Re: Please comment on the following OpenFTS/tsearch2 issues! - Mailing list pgsql-general
From | Teodor Sigaev |
---|---|
Subject | Re: Please comment on the following OpenFTS/tsearch2 issues! |
Date | |
Msg-id | 444F1E86.3070008@sigaev.ru |
In response to | Please comment on the following OpenFTS/tsearch2 issues! ("Don Walker" <don.walker@versaterm.com>) |
Responses | Re: Please comment on the following OpenFTS/tsearch2 issues! |
List | pgsql-general |
> 1. While tsearch2 provides fairly complete boolean search expression support
> with AND - &, OR - |, NOT - !, and grouping - (), OpenFTS appears to only
> have support for ANDing search terms. Is there some reason it hasn't been
> extended to support full tsearch2 search expressions? Has anyone modified
> OpenFTS to do this?

Historical reasons and simplicity, nothing more. We didn't modify OpenFTS...
People often ask us about text -> tsquery conversion, so 8.2 will have
plainto_tsquery(), which returns the same result as the OpenFTS query parser.

> 2. Neither OpenFTS or tsearch2 support exact phrase matching. I've seen the
> workaround to support matching a single exact phrase by modifying the WHERE
> clause with textcolumn ~* "exact phrase". Does this give reasonable
> performance? Has anyone implemented exact phrase matching in complex search
> expressions like ("exact phrase1" AND term1) OR (NOT "exact phrase2" AND
> "exact phrase3") ?

We do not plan to develop phrase search until we have a clean idea of how to
support complex queries and compound words; see the discussion at
http://www.pgsql.ru/db/mw/msg.html?mid=2111601

>
> 3. The following summarizes what I've read about performance and scalability
> of OpenFTS and/or tsearch2:
>
> a) don't expect OpenFTS/tsearch2 to perform/scale as well as dedicated
> search engines like Lucene, http://lucene.apache.org/,
> http://archives.postgresql.org/pgsql-general/2002-05/msg01156.php.

Yes, the GiST index is good for online updates, but it has problems with big
sets. For 8.2 we plan to add an inverted index, with which tsearch2 will work
at a speed comparable to Lucene... A first version has already been published,
look for the announcement :)

> b) OR queries are slower than AND queries,
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/oscon_tsearch2/optimization.html.

Yes

> Do you agree with this summary? If you are using either OpenFTS or tsearch2
> in production, has the performance been acceptable? For my application I
> could be looking at several million documents averaging about 3 pages each
> (I only have ballpark figures at present).

We know of a tsearch2 installation working with 4 million docs.

>
> 4. If you are using either OpenFTS or tsearch2 in production why did you
> choose OpenFTS over tsearch2 or vice versa? One of the advantages of
> tsearch2 that I can see is that, once you have setup your database and
> indexed your documents, you can talk to the database directly from your
> application using SQL without needing to go through Perl first. This assumes
> that you're ok with tsearch2 search expression syntax so you can use
> functions like to_tsquery. It also assumes that you don't need sophisticated
> exact phrase matching.

OpenFTS can run on a different box than pgsql, and OpenFTS can index files
directly from the file system.

>
> 5. Are there any scripts, tools, add-ons, etc. that you can recommend?

We can tweak OpenFTS/tsearch2 for you.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
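
As a rough illustration of the text -> tsquery conversion mentioned in the answer to point 1, here is a minimal Python sketch that turns a plain search string into an AND-only expression in tsquery syntax. The function name `plain_text_to_and_query` is hypothetical and is not part of OpenFTS or tsearch2. It is only an assumption about the general shape of the idea: it splits on non-word characters and lowercases, and it does not attempt the parser and dictionary processing (stemming, stop words) that a real implementation would do.

```python
import re


def plain_text_to_and_query(text: str) -> str:
    """Join the words of a plain-text search string with '&' (AND).

    Hypothetical helper for illustration only; not an OpenFTS or
    tsearch2 function, and no stemming or stop-word removal is done.
    """
    # Treat runs of letters, digits and underscores as words;
    # anything else (spaces, punctuation) separates them.
    words = re.findall(r"\w+", text.lower())
    # Drop repeated words while keeping first-seen order.
    unique_words = list(dict.fromkeys(words))
    # tsquery syntax uses '&' for AND.
    return " & ".join(unique_words)


if __name__ == "__main__":
    print(plain_text_to_and_query("Full text search"))
```

The resulting string could then be passed to to_tsquery from the application, which is the "talk to the database directly using SQL" route described in point 4.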