Thread: BUG #4306: TSearch2 stemming, stop words and lexize behaviour inconsistent
From: "Yishai Lerner"
Date:
The following bug has been logged online:

Bug reference:      4306
Logged by:          Yishai Lerner
Email address:      yish@alum.mit.edu
PostgreSQL version: 8.3.1
Operating system:   RHEL5 and MacOSX 10.4
Description:        TSearch2 stemming, stop words and lexize behaviour inconsistent
Details:

I would expect to_tsquery to behave consistently for the three variations
"what", "what's" and "whats", and to ignore all of them, since each one
reduces to the stop word "what". However, this is not the case:
to_tsquery('whats') returns the lexeme 'what', which is itself a stop word.

Even more confusing, the lexize results below are inconsistent with the
to_tsquery results that follow them.

This seems like a bug to me.

goodrec_2=# select lexize('en_stem', 'what''s');
 lexize
--------
 {what}

goodrec_2=# select lexize('en_stem', 'whats');
 lexize
--------
 {what}

goodrec_2=# select lexize('en_stem', 'what');
 lexize
--------
 {}

goodrec_2=# select to_tsquery('what''s');
NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored
 to_tsquery

goodrec_2=# select to_tsquery('whats');
 to_tsquery
------------
 'what'

goodrec_2=# select to_tsquery('what');
NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored