Thread: tsearch: how to get a list of stopwords?
Hi there, me again. How do I find the stopwords that tsearch uses in its standard configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get a feeling it's somewhere in there but I can't decipher it. Any suggestions? Joerg
On Thu, 28 Aug 2003, Joerg Erdmenger wrote:
> How do I find the stopwords that tsearch uses in its standard
> configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get
> a feeling it's somewhere in there but I can't decipher it. Any suggestions?
You're right. They're encoded in engstoptree :)
I suggest you not bother with the old tsearch and look at the tsearch2 version instead, which is much improved in both performance and flexibility.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
Regards, Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
hi
> You're right. They're encoded in engstoptree :)
> I suggest you not bother with old tsearch and look to tsearch2 version
> which is much improved both in performance and flexibility.
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
Well, I would like to, but I've got to get it to work on a production server. I will try to get the admins to install it, but I guess that will take some time. Meanwhile, is there any way to get at the list of stopwords so that I can build a filter for them as a temporary workaround?
thanks
Joerg
On Thu, 28 Aug 2003, Joerg Erdmenger wrote:
> well, I would like but I've got to get it to work on a production server; I
> will try to get the admins to install it but I guess it will take some time -
> meanwhile - is there anyway to get to the list of stopwords so that I can
> build a filter for those as a temporary workaround?
tsearch2 can live alongside tsearch, so you can play with it in the meantime. I attached the english.stop file from the OpenFTS distribution, but I'm not 100% sure it's the same as the list in portereng.c :)
Regards, Oleg
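For the temporary workaround discussed above, here is a minimal sketch of a stopword filter in Python. It assumes the attached english.stop holds one word per line, which is not confirmed in this thread, and the function names load_stopwords and strip_stopwords are made up for illustration. As noted above, the file may not match the list compiled into portereng.c, so results could differ slightly from what old tsearch drops.

```python
import re
from pathlib import Path


def load_stopwords(path):
    """Read a stop-word file into a set.

    Assumption: one word per line; blank lines are ignored.
    """
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def strip_stopwords(query, stopwords):
    """Return the query's words, lowercased, with stop words removed."""
    words = re.findall(r"[A-Za-z0-9']+", query.lower())
    return [w for w in words if w not in stopwords]


if __name__ == "__main__":
    # Tiny stand-in list; in practice: stop = load_stopwords("english.stop")
    stop = {"the", "a", "of", "is"}
    print(strip_stopwords("The list of stopwords", stop))
```

The filter would run on the search terms before they are handed to tsearch, so that a query made up only of stopwords can be detected (empty result list) and handled in the application.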