
Re: Mailing list search engine: surprising missing results? - Mailing list pgsql-www

From	Tom Lane
Subject	Re: Mailing list search engine: surprising missing results?
Date	January 25, 2022 17:54:28
Msg-id	2274255.1643133268@sss.pgh.pa.us
In response to	Re: Mailing list search engine: surprising missing results? (Ivan Panchenko <i.panchenko@postgrespro.ru>)
Responses	Re: Mailing list search engine: surprising missing results?
List	pgsql-www


Ivan Panchenko <i.panchenko@postgrespro.ru> writes:
> The actual explanation can be seen from comparing a tsvector with a tsquery.
> To avoid stemming effects, we use the simple configuration below.

> # select plainto_tsquery('simple','boyers-moore');

>             plainto_tsquery
> -------------------------------------
>   'boyers-moore' & 'boyers' & 'moore'

> # select to_tsvector('simple','boyers-moore-horspool');

>                           to_tsvector
> -------------------------------------------------------------
>   'boyers':2 'boyers-moore-horspool':1 'horspool':4 'moore':3

> Obviously, such a tsvector does not match the above tsquery. I think a better tsquery for this query would be

>   'boyers-moore' | ('boyers' & 'moore')

> Maybe it is worth changing to_tsquery() behavior for such cases.

Changing the behavior of to_tsquery is certainly a lot less scary
than changing to_tsvector --- it wouldn't call the validity of
existing tsvector indexes into question.

I see that to_tsquery is even sillier than plainto_tsquery:

regression=# select to_tsquery('simple','boyers-moore');
               to_tsquery
-----------------------------------------
 'boyers-moore' <-> 'boyers' <-> 'moore'
(1 row)

which is absolutely not a sane translation.

It seems to me that in both cases we'd be better off generating
"'boyers' <-> 'moore'", without the compound token at all.
Maybe there's a case for the weaker 'boyers' & 'moore' translation,
but I think if people wanted that they'd just enter separate words.
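To make the comparison concrete, here is a minimal Python sketch of how the
candidate queries fare against the 'boyers-moore-horspool' tsvector quoted
above. It is a toy model, not PostgreSQL's matching code: the helper names
(has, phrase, evaluate) are hypothetical, a tsvector is assumed to be a plain
lexeme-to-positions map, and a chain of <-> is assumed to mean "these lexemes
at consecutive positions".

```python
# Toy model of tsvector/tsquery matching. This is NOT PostgreSQL's
# implementation; helper names and data layout are illustrative assumptions.

# Lexeme -> positions, copied from the quoted output of
# to_tsvector('simple', 'boyers-moore-horspool').
TSVECTOR = {
    "boyers-moore-horspool": [1],
    "boyers": [2],
    "moore": [3],
    "horspool": [4],
}


def has(vec, lexeme):
    """True if the lexeme occurs anywhere in the vector."""
    return lexeme in vec


def phrase(vec, *lexemes):
    """True if the lexemes occur at consecutive positions (a chain of <->)."""
    if not lexemes or any(lex not in vec for lex in lexemes):
        return False
    return any(
        all(start + i in vec[lex] for i, lex in enumerate(lexemes))
        for start in vec[lexemes[0]]
    )


def evaluate(vec):
    """Evaluate each query discussed in the thread against one vector."""
    return {
        # 'boyers-moore' & 'boyers' & 'moore'
        "plainto_current": has(vec, "boyers-moore")
        and has(vec, "boyers")
        and has(vec, "moore"),
        # 'boyers-moore' <-> 'boyers' <-> 'moore'
        "to_tsquery_current": phrase(vec, "boyers-moore", "boyers", "moore"),
        # 'boyers-moore' | ('boyers' & 'moore')
        "ivan_proposal": has(vec, "boyers-moore")
        or (has(vec, "boyers") and has(vec, "moore")),
        # 'boyers' <-> 'moore'
        "phrase_proposal": phrase(vec, "boyers", "moore"),
        # 'boyers' & 'moore'
        "and_proposal": has(vec, "boyers") and has(vec, "moore"),
    }


results = evaluate(TSVECTOR)
for name, matched in results.items():
    print(f"{name}: {matched}")
```

Under this model, both current translations fail for the same reason: the
lexeme 'boyers-moore' never appears in the vector, only the longer compound
'boyers-moore-horspool' does. Any translation that drops the compound token
matches, and the <-> form additionally requires the two parts to be adjacent.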

            regards, tom lane
