Thread: tsvector stemmer issue
ran into an interesting issue - and I’m not sure if anything can be done about it - the snowball stemmer treats “severance” and “several” as the same, which for me is a big, big issue. even quoting it doesn’t help.

indie=> select to_tsvector('severance several');
 to_tsvector
-------------
 'sever':1,2
(1 row)

indie=> select to_tsvector('"severance" several');
 to_tsvector
-------------
 'sever':1,2
(1 row)

using the perl library Lingua::Stem::Snowball it yields the same results (as expected since they both use snowball).

am I SOL here?

--
Jeff Trout <jeff@jefftrout.com>
Jeff Trout <threshar@real.jefftrout.com> wrote:

> ran into an interesting issue - and I’m not sure if anything can
> be done about it - the snowball stemmer treats “severance” and
> “several” as the same, which for me is a big, big issue.

You can create a custom dictionary chain. The only type I worked with was thesaurus, but it was pretty easy once I read the relevant docs. It is only custom *parsers* that are a pain, but it doesn't sound like you need that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
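
For illustration, here is a rough sketch of what such a dictionary chain might look like. It uses the synonym template rather than the thesaurus type mentioned above, on the assumption that mapping each protected word to itself is enough to keep it away from the stemmer. The names keep_words and english_keep and the keep_words.syn file are hypothetical, and it assumes the default configuration in use is english.

```sql
-- Assumption: a file keep_words.syn (hypothetical name) has been placed in
-- $SHAREDIR/tsearch_data/ and maps each protected word to itself:
--   severance severance
--   several several

-- A synonym dictionary that recognizes only the words listed in that file.
CREATE TEXT SEARCH DICTIONARY keep_words (
    TEMPLATE = synonym,
    SYNONYMS = keep_words
);

-- Work on a copy so the built-in english configuration stays untouched.
CREATE TEXT SEARCH CONFIGURATION english_keep (COPY = pg_catalog.english);

-- Consult keep_words first; tokens it does not recognize fall through
-- to the snowball stemmer (english_stem).
ALTER TEXT SEARCH CONFIGURATION english_keep
    ALTER MAPPING FOR asciiword
    WITH keep_words, english_stem;

-- The two words should now produce distinct lexemes instead of 'sever'.
SELECT to_tsvector('english_keep', 'severance several');
```

The same configuration name would have to be passed to to_tsquery on the search side, and any stored tsvector values or indexes built with the old configuration would need to be regenerated.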