BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd - Mailing list pgsql-bugs

From alex@hill.net.au
Subject BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd
Msg-id E1V59Oo-0007mB-GA@wrigleys.postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      8354
Logged by:          Alex Hill
Email address:      alex@hill.net.au
PostgreSQL version: 9.2.4
Operating system:   OS X 10.8.4 Mountain Lion
Description:

Hi all,


The docs for ts_rank_cd state:


"This function requires positional information in its input. Therefore it
will not work on "stripped" tsvector values — it will always return zero."


However, if a tsvector contains some stripped lexemes and some non-stripped
ones, ts_rank_cd will still rank extents that include the non-stripped values.


For example, this evaluates to zero as expected:


    SELECT ts_rank_cd(strip(to_tsvector('text search')),
                      plainto_tsquery('text search'));




But this evaluates to a nonzero rank:


    SELECT ts_rank_cd(to_tsvector('text') || strip(to_tsvector('search')),
                      plainto_tsquery('text search'));
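
To see what the second query is actually ranking, it can help to inspect the
mixed tsvector directly. This is only a minimal sketch, assuming the default
text search configuration keeps both words as lexemes (as the english or
simple configurations would):

```sql
-- A positioned lexeme concatenated with a stripped one.
SELECT to_tsvector('text') || strip(to_tsvector('search')) AS mixed;

-- For comparison: the fully stripped vector from the first query.
SELECT strip(to_tsvector('text search')) AS stripped;
```

In the mixed value, 'text' should carry a position while 'search' carries
none, which is the case the quoted documentation does not describe.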




I think this is a bug, if not in the code then in the documentation, which
isn't clear on what happens when stripped and positioned lexemes are mixed
in one tsvector.


I would prefer that ts_rank_cd ignore stripped lexemes completely: my use
case is to treat stripped lexemes as a fifth pseudo-weight, which matches a
@@ query but doesn't contribute to the ts_rank_cd ranking.
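
As a rough sketch of that use case, reusing the mixed tsvector from the
example above (the column aliases are only illustrative):

```sql
-- The stripped lexeme still satisfies a match on its own,
-- so this should return true.
SELECT (to_tsvector('text') || strip(to_tsvector('search')))
       @@ plainto_tsquery('search') AS matches;

-- Desired behaviour: this rank would be zero, because the only
-- matching lexeme ('search') has no positional information.
SELECT ts_rank_cd(to_tsvector('text') || strip(to_tsvector('search')),
                  plainto_tsquery('search')) AS rank;
```

The first query shows the "matches but is unranked" half of the idea; the
second is the half that depends on how ts_rank_cd treats stripped lexemes.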


What do you think?


Cheers,
Alex
