Thread: Weird ranking results with ts_rank
Hi everybody.
I'm implementing a solution that uses PostgreSQL's full text search capabilities and I have come across a particular set of results for ts_rank that don't seem to make sense according to the documentation. I have tried the following queries in PostgreSQL 10, 11 and 12.
In both cases only the word "box" matches, but adding a non-matching word with OR to the query increases the rank. If I keep adding more non-matching words with OR, the rank starts to decrease again. I would have expected the second query to have the highest score, with the score decreasing from there as I add more non-matching words.
Is there something I'm not understanding?
Thanks.
postgres=# select ts_rank(to_tsvector('search for a text box'), to_tsquery('circle | lot <-> box'));
ts_rank
-------------
0.020264236
(1 row)
postgres=# select ts_rank(to_tsvector('search for a text box'), to_tsquery('lot <-> box'));
ts_rank
---------
1e-20
(1 row)
On Fri, Nov 15, 2019 at 1:31 AM Javier Ayres <jayres@sophilabs.com> wrote:
Hi everybody. I'm implementing a solution that uses PostgreSQL's full text search capabilities and I have come across a particular set of results for ts_rank that don't seem to make sense according to the documentation.
While the documentation doesn't come out and say so, my interpretation is that ts_rank assumes there is a match in the first place, and by implication its result is undefined/unspecified if there is no match.
select to_tsvector('search for a text box') @@ to_tsquery('circle | lot <-> box');
?column?
----------
f
(1 row)
Cheers,
Jeff
Oh I see. I was working as if no match was the same as ts_rank=0.
Great advice. Thank you very much.
On Sat, Nov 16, 2019 at 2:22 PM Jeff Janes <jeff.janes@gmail.com> wrote:
On Fri, Nov 15, 2019 at 1:31 AM Javier Ayres <jayres@sophilabs.com> wrote:
Hi everybody. I'm implementing a solution that uses PostgreSQL's full text search capabilities and I have come across a particular set of results for ts_rank that don't seem to make sense according to the documentation.
While the documentation doesn't come out and say so, my interpretation is that ts_rank assumes there is a match in the first place, and by implication its result is undefined/unspecified if there is no match.
select to_tsvector('search for a text box') @@ to_tsquery('circle | lot <-> box');
?column?
----------
f
(1 row)
Cheers,
Jeff
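For anyone finding this thread later, here is a minimal sketch of how the point above might be applied: check the match with @@ first and only rank rows that match. The table docs, its columns, and the second sample row are hypothetical and exist only for illustration. The sketch also assumes the same default text search configuration as the queries in this thread.

```sql
-- Hypothetical table and sample data, for illustration only.
CREATE TEMP TABLE docs (id int, body text);
INSERT INTO docs VALUES
    (1, 'search for a text box'),   -- the document from this thread
    (2, 'a lot box of circles');    -- invented row containing "lot" directly followed by "box"

-- Rank only the rows that actually match the query.
-- Row 1 is filtered out by @@, so its ts_rank value is never looked at.
SELECT id, ts_rank(to_tsvector(body), q) AS rank
FROM docs, to_tsquery('circle | lot <-> box') AS q
WHERE to_tsvector(body) @@ q
ORDER BY rank DESC;

-- If a row is needed for every document, treat "no match" as rank 0 explicitly.
SELECT id,
       CASE WHEN to_tsvector(body) @@ q
            THEN ts_rank(to_tsvector(body), q)
            ELSE 0
       END AS rank
FROM docs, to_tsquery('circle | lot <-> box') AS q
ORDER BY rank DESC;
```

The second form gives the "no match is the same as ts_rank=0" behaviour explicitly instead of relying on whatever ts_rank returns for a non-matching document.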