The following bug has been logged online:
Bug reference: 4306
Logged by: Yishai Lerner
Email address: yish@alum.mit.edu
PostgreSQL version: 8.3.1
Operating system: RHEL5 and MacOSX 10.4
Description: TSearch2 stemming, stop words and lexize behaviour inconsistent
Details:
I would expect to_tsquery to behave consistently for the three variations
"what", "what's" and "whats", and to ignore all of them, since each one
reduces to the stop word "what". That is not the case: to_tsquery('whats')
returns the stop word 'what' as its result. Even more confusing, the lexize
results below are inconsistent with the to_tsquery results below.
This seems like a bug to me.
goodrec_2=# select lexize('en_stem', 'what''s');
lexize
--------
{what}
goodrec_2=# select lexize('en_stem', 'whats');
lexize
--------
{what}
goodrec_2=# select lexize('en_stem', 'what');
lexize
--------
{}
goodrec_2=# select to_tsquery('what''s');
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
to_tsquery
goodrec_2=# select to_tsquery('whats');
to_tsquery
------------
'what'
goodrec_2=# select to_tsquery('what');
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
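
For anyone trying to reproduce this, the session above can probably be condensed into a single query that shows both results side by side for each input. This is only a sketch and was not part of the original session: it assumes the same 'en_stem' dictionary and default text search configuration used above, and the column aliases are made up for illustration.

```sql
-- Sketch: pass each variant through the dictionary (lexize) and through
-- the query parser (to_tsquery), so the two results can be compared per row.
-- Assumes the tsearch2 lexize() function and the 'en_stem' dictionary
-- from the session above are available.
SELECT w                    AS input,
       lexize('en_stem', w) AS lexize_result,
       to_tsquery(w)        AS tsquery_result
FROM (VALUES ('what'), ('what''s'), ('whats')) AS t(w);
```

The stop-word NOTICE should still be raised for any row where to_tsquery discards the whole input.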