BUG #18149: Incorrect lexeme for english token "proxy" - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18149: Incorrect lexeme for english token "proxy"
Date
Msg-id 18149-936d14e6fc76ca61@postgresql.org
Responses Re: BUG #18149: Incorrect lexeme for english token "proxy"  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18149
Logged by:          Patrick Peralta
Email address:      pperalta@gmail.com
PostgreSQL version: 14.5
Operating system:   Linux
Description:

The English dictionary (english_stem) is using the lexeme "proxi" for the
token "proxy". As a result, the search term "proxy" is not yielding results
for records that contain this word.

# select * from ts_debug('english', 'proxy');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | proxy | {english_stem} | english_stem | {proxi}
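
To check whether the "proxi" lexeme by itself prevents a match, the document side and the query side can be compared under the same configuration. This is only a minimal sketch using built-in functions. It assumes that both the stored text and the search term go through the 'english' configuration, which may not be how the affected application builds its queries. The sample sentence is made up for illustration.

```sql
-- How the stemmer normalizes the singular and the plural.
select ts_lexize('english_stem', 'proxy')   as singular,
       ts_lexize('english_stem', 'proxies') as plural;

-- Does a document containing the word match a query for it when both
-- sides use the same configuration?
select to_tsvector('english', 'connect through a proxy')
       @@ to_tsquery('english', 'proxy') as matches;

-- Same document, but the query term is taken literally (simple
-- configuration, no stemming). A difference between this result and the
-- previous one would point at mixed configurations rather than at the
-- lexeme itself.
select to_tsvector('english', 'connect through a proxy')
       @@ to_tsquery('simple', 'proxy') as matches_unstemmed_query;
```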

I think this lexeme was chosen to support the plural of proxy, which is
proxies. However, there are other plurals where the root word is spelled
differently, and Postgres creates the correct lexeme for those, such as:

# select * from ts_debug('english', 'goose');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | goose | {english_stem} | english_stem | {goos}

# select * from ts_debug('english', 'mouse');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | mouse | {english_stem} | english_stem | {mous}
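
The comparison with irregular plurals can be tested the same way, by stemming the singular and plural forms side by side. The query below is a sketch using the built-in ts_lexize function. No output is shown because it has not been run against this installation.

```sql
-- Stem each singular/plural pair with the same dictionary that
-- ts_debug reports above, so the resulting lexemes can be compared.
select word,
       ts_lexize('english_stem', word) as lexemes
from unnest(array['goose', 'geese',
                  'mouse', 'mice',
                  'proxy', 'proxies']) as t(word);
```

If a singular and its plural come back with different lexemes, a search for one form will not find the other under this dictionary.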

I believe we can create our own dictionary as a workaround
(https://www.postgresql.org/docs/current/textsearch-dictionaries.html) but
I'm reporting this to see if using "proxi" for "proxy" is intentional.
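
For reference, the workaround could look roughly like the following sketch, which places a synonym dictionary in front of english_stem so that "proxy" and "proxies" are emitted as "proxy" before the stemmer sees them. The names proxy_syn, proxy_synonyms and public.english_proxy are hypothetical, and the synonym file is assumed to have been placed in the server's tsearch_data directory by someone with file system access.

```sql
-- Assumed file: $SHAREDIR/tsearch_data/proxy_synonyms.syn, containing:
--   proxy   proxy
--   proxies proxy

-- Synonym dictionary backed by the file above (hypothetical names).
create text search dictionary proxy_syn (
    template = synonym,
    synonyms = proxy_synonyms
);

-- Start from a copy of the built-in configuration.
create text search configuration public.english_proxy (
    copy = pg_catalog.english
);

-- Consult the synonym dictionary first; words it does not recognize
-- fall through to the regular stemmer.
alter text search configuration public.english_proxy
    alter mapping for asciiword
    with proxy_syn, english_stem;

-- Inspect the result for the token in question.
select * from ts_debug('public.english_proxy', 'proxy');
```

Any stored tsvector values or indexes built with 'english' would need to be regenerated with the new configuration, and queries would have to use the same configuration for the lexemes to line up.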

