Re: Full text search bug ('russian' regconfig) - Mailing list pgsql-bugs

From Artur Zakirov
Subject Re: Full text search bug ('russian' regconfig)
Date
Msg-id 0f991eaf-2394-41a2-9d8e-c36aef35fbb1@gmail.com
In response to Full text search bug ('russian' regconfig)  (egocenter <egocenter@yandex.ru>)
Responses Re: Full text search bug ('russian' regconfig)  (egocenter <egocenter@yandex.ru>)
List pgsql-bugs
Hello

On 2/19/2020 5:35 PM, egocenter wrote:
> Text search doesn't work correctly when the text and the query are the SAME string (russian dictionary config).
> As you can see in the example, the tsvector ends up with different lexemes than the tsquery for identical text:
> 
> tsv = 'дан':1 'магазин':2 'нужн':3 'посеща':4 'точн':5
> tsq = 'нужн' & 'точн' & 'дан' & 'посещаем' & 'магазин'

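That is because you call to_tsvector() twice. The 'russian' configuration
uses a Snowball dictionary, which applies a stemming algorithm to strip
word endings, so stemming an already-stemmed lexeme can shorten it again
('посещаем' becomes 'посеща' in your output). Your query works if
to_tsvector() isn't called twice on the same text: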

=# SELECT
   web_query_and @@ ts_title,
   web_query_and @@ 'зачем нужны точные данные о посещаемости магазинов',
   *
FROM
   (SELECT
     to_tsvector('russian',
       'зачем нужны точные данные о посещаемости магазинов') AS ts_title,
     websearch_to_tsquery('russian',
       'зачем нужны точные данные о посещаемости магазинов?') AS web_query_and
   ) AS main;

It gives 'true' for the first column.
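
To see the double-stemming effect in isolation, a minimal sketch along
these lines may help. It assumes the built-in 'russian_stem' Snowball
dictionary (the one I would expect the 'russian' configuration to use by
default) and calls ts_lexize() on a single word, then again on its own
result. Going by the lexemes quoted in the report, the two columns should
differ, but I have not pasted output here, so please check it on your
server:

```sql
-- Sketch only: stem one word once, then stem the resulting lexeme again.
-- Assumption: 'russian_stem' is the dictionary behind the 'russian' config.
SELECT
    -- Single pass, as websearch_to_tsquery() would do for this word.
    ts_lexize('russian_stem', 'посещаемости') AS stemmed_once,
    -- Second pass over the first result, mimicking a repeated to_tsvector().
    ts_lexize('russian_stem',
              (ts_lexize('russian_stem', 'посещаемости'))[1]) AS stemmed_twice;
```

If the two columns differ, any tsvector built by stemming twice cannot
match a tsquery that was stemmed once from the same text.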

-- 
Artur


