Re: PATCH: Update snowball stemmers - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PATCH: Update snowball stemmers
Msg-id 31126.1537824999@sss.pgh.pa.us
In response to Re: PATCH: Update snowball stemmers  (Arthur Zakirov <a.zakirov@postgrespro.ru>)
Responses Re: PATCH: Update snowball stemmers  (Arthur Zakirov <a.zakirov@postgrespro.ru>)
List pgsql-hackers
Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
> Ah, I see. I attached new version made with --no-renames. Will wait for
> what cfbot will say.

I reviewed and pushed this.

As a cross-check on the patch, I cloned the Snowball GitHub repo
and built the derived files in it.  I noticed that they'd incorporated
several new stemmers since 2007 --- not only your Nepali one, but
half a dozen more besides.  Since the point here is (IMO) mostly to
follow their lead on what's interesting, I went ahead and added those
as well.

In short, therefore, the commit includes the Nepali stuff from your
other thread as well as what was in this one.

Although I added nepali.stop from the other patch, I've not done
anything about updating our other stopword lists.  Presumably those
are a bit obsolete by now as well.  I wonder if we can prevail on
the Snowball people to make those available in some less painful way
than scraping them off assorted web pages.  Ideally they'd stick them
into their git repo ...

            regards, tom lane

