On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> Last November snowball added support for Greek language [1]. Following
> the instructions [2], I wrote a patch that adds fulltext search for
> Greek in Postgres. The patch is attached.
I have committed a full sync from the upstream snowball repository, which pulled in the new Greek stemmer.
Could you please clarify where you got the stopword list from? The README says those need to be downloaded separately, but I wasn't able to find the download location. It would be good to document this, for example in the commit message. I haven't committed the stopword list yet.
The list is based on an earlier publication, with modifications by me. All the relevant info is on GitHub.
Disclaimer 1: The list has not been validated by an expert.
Disclaimer 2: There are more stop-word lists on the internet, but they are less complete and they also contain Ancient Greek words. Furthermore, my testing showed that snowball needs to handle accents (tonous) and ς (teliko sigma) in a special way if you want the stemmer to work with capitalized words too.
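
To illustrate the accent and final-sigma issue, here is a minimal Python sketch of the kind of normalization that makes capitalized, accented and lowercase forms of a word compare equal. It is not taken from the snowball stemmer or from the patch; the function name normalize_greek is hypothetical, and stripping every combining mark (which also removes dialytika) is a simplifying assumption.

```python
import unicodedata


def normalize_greek(word: str) -> str:
    """Fold case, drop accents, and map final sigma to regular sigma.

    Hypothetical helper for illustration only; it is not what snowball
    or Postgres does internally.
    """
    lowered = word.lower()
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Assumption: drop all combining marks (tonos, but also dialytika).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold final sigma so that word-final and medial sigma compare equal.
    return unicodedata.normalize("NFC", stripped).replace("ς", "σ")


# Capitalized, title-case and lowercase spellings of the same word.
forms = ["ΟΔΟΣ", "Οδός", "οδός"]
normalized = {normalize_greek(w) for w in forms}
print(normalized)  # all three forms collapse to a single key
```

Without a step like this, the all-caps form (which is normally written without accents) and the accented lowercase form would not match the same stop-word or stem entry.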