Re: BUG #14278: Problem searching spanish words with accent mark outside the stem - Mailing list pgsql-bugs

From Jaime Casanova
Subject Re: BUG #14278: Problem searching spanish words with accent mark outside the stem
Date
Msg-id CAJGNTeMvhD=0Pb0qK_A9PX-bqzhisK5gUSgD4dF23rF2DC_vsg@mail.gmail.com
In response to Re: BUG #14278: Problem searching spanish words with accent mark outside the stem  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On 7 August 2016 at 23:58, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> paco@hernandezgomez.com wrote:
>
>> Search without accent mark is not working correctly when the accent mark is
>> outside the stem of the word.
>
> I think it'd be better to apply unaccent() to both the stored text
> before ts_vectorization and to the query terms.   That would reliably
> remove all diacritics (eñes too, though I suppose nobody would search
> for their ñandúes by writing nandú, so it's not as severe).
>
>

The problem is that unaccent() is only STABLE, so it can't be used in an
index expression. The OP would need to create a tsvector column to store a
preprocessed version of the string (one on which to_tsvector('spanish',
unaccent(...)) has already been executed) and query against that column.
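
A minimal sketch of that stored-column approach, assuming the unaccent
extension can be installed. The table, column, function, trigger and index
names are hypothetical, chosen only for illustration. The column is kept up
to date by a trigger, and trigger functions are not subject to the
immutability requirement that index expressions have.

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical table: body holds the original text,
-- body_tsv holds the preprocessed tsvector.
CREATE TABLE docs (
    id       serial PRIMARY KEY,
    body     text NOT NULL,
    body_tsv tsvector
);

-- Strip diacritics first, then build the tsvector with the spanish config.
CREATE FUNCTION docs_tsv_update() RETURNS trigger AS $$
BEGIN
    NEW.body_tsv := to_tsvector('spanish', unaccent(NEW.body));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER docs_tsv_trg
    BEFORE INSERT OR UPDATE ON docs
    FOR EACH ROW EXECUTE PROCEDURE docs_tsv_update();

-- The index is on a plain column, so no function volatility issue arises.
CREATE INDEX docs_tsv_idx ON docs USING gin (body_tsv);

-- Apply the same unaccent() step to the query terms.
SELECT id
  FROM docs
 WHERE body_tsv @@ plainto_tsquery('spanish', unaccent('explicacion'));
```

Because both the stored text and the query terms go through unaccent()
before stemming, accented and unaccented spellings should be reduced the same
way on both sides.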

<cough> or create an immutable version of unaccent() </cough>
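
A sketch of what such a wrapper could look like. The wrapper, table and
index names are hypothetical. It uses the two-argument form of unaccent()
with an explicit dictionary so the result does not depend on a
search_path lookup of the dictionary. Marking it IMMUTABLE is a promise
to the planner, not something PostgreSQL verifies: if the unaccent rules file
is ever changed, any index built on the wrapper would need a REINDEX.

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical wrapper; assumes unaccent() and its dictionary are
-- reachable through the current search_path.
CREATE FUNCTION unaccent_immutable(text) RETURNS text AS $$
    SELECT unaccent('unaccent'::regdictionary, $1)
$$ LANGUAGE sql IMMUTABLE;

-- Hypothetical table used only for this example.
CREATE TABLE notes (
    id   serial PRIMARY KEY,
    body text NOT NULL
);

-- The wrapper is declared IMMUTABLE, so it is accepted in an index expression.
CREATE INDEX notes_body_fts_idx ON notes
    USING gin (to_tsvector('spanish', unaccent_immutable(body)));

-- The query must repeat the indexed expression for the index to be usable.
SELECT id
  FROM notes
 WHERE to_tsvector('spanish', unaccent_immutable(body))
       @@ plainto_tsquery('spanish', unaccent_immutable('explicacion'));
```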


-- 
Jaime Casanova                      www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
