Thread: [BUGS] BUG #14552: tsquery converts AND operator into OR when nested inside OR operations

From
bjorn@eventmy.com
Date:
The following bug has been logged on the website:

Bug reference:      14552
Logged by:          Bjorn Linder
Email address:      bjorn@eventmy.com
PostgreSQL version: 9.4.5
Operating system:   OS 10.11.6
Description:

Working correctly, no results: 
SELECT ts_rank(to_tsvector('lets eat a cat'), ('fat & bat | rat'::tsquery &&
'cat'::tsquery));
 ts_rank
---------
   1e-20
(1 row)

Should also yield no results:
SELECT ts_rank(to_tsvector('lets eat a fat cat'),
               ('fat & bat | rat'::tsquery && 'cat'::tsquery));
  ts_rank
-----------
 0.0991032
(1 row)

Is this intended behavior? Is there a recommended way to nest AND operators
inside OR operations? The relevant documentation looks to be the same for
newer versions so I'm assuming this behavior hasn't been changed between
versions - let me know. Thanks!


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

bjorn@eventmy.com writes:
> Working correctly, no results: 
> SELECT ts_rank(to_tsvector('lets eat a cat'), ('fat & bat | rat'::tsquery &&
> 'cat'::tsquery));
>  ts_rank
> ---------
>    1e-20
> (1 row)

> Should also yield no results:
> SELECT ts_rank(to_tsvector('lets eat a fat cat'),
>                ('fat & bat | rat'::tsquery && 'cat'::tsquery));
>   ts_rank
> -----------
>  0.0991032
> (1 row)

> Is this intended behavior?

Don't see what you find surprising about it?  ts_rank() is documented as

    Ranks vectors based on the frequency of their matching lexemes.

The first example has one lexeme that matches the query's lexemes,
the second has two.  It should get a higher ranking.

If you want to know whether the tsvector formally matches the query,
you should be applying the @@ operator.  ts_rank() is not a binary
yes/no thing, it's trying to identify stuff that is more or less
relevant to the query's terms.  At least from the documentation,
I'd suspect it pays no attention to the operators in the query.

In short: the intended use of ts_rank() is for sorting values that
have already passed an @@ match.  It's not a substitute for @@.

            regards, tom lane
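
For readers of the archive, the distinction drawn above can be sketched roughly as follows. This is only an illustration, not part of the original reply: the docs CTE, its body column, the q CTE, and the third sample sentence are hypothetical names and data made up for the example, and to_tsvector() is assumed to run under whatever default_text_search_config the session uses.

```sql
-- Boolean match: does the document satisfy ((fat AND bat) OR rat) AND cat?
-- The document has neither 'bat' nor 'rat', so @@ reports false here,
-- even though ts_rank() returned a nonzero score for the same inputs.
SELECT to_tsvector('lets eat a fat cat')
       @@ ('fat & bat | rat'::tsquery && 'cat'::tsquery) AS matches;

-- Intended pattern: filter with @@ first, then use ts_rank() only to
-- order the rows that matched. docs/body/q are hypothetical names.
WITH docs(body) AS (
    VALUES ('lets eat a cat'::text),
           ('lets eat a fat cat'),
           ('a fat bat chased the cat')
),
q AS (
    SELECT 'fat & bat | rat'::tsquery && 'cat'::tsquery AS query
)
SELECT body, ts_rank(to_tsvector(body), query) AS score
FROM docs, q
WHERE to_tsvector(body) @@ query
ORDER BY score DESC;
```

With these sample rows, only the sentence containing fat, bat, and cat should pass the WHERE clause, so the AND nested inside the OR is respected by @@ and ts_rank() is left to do the ordering.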

