Thread: BUG #6654: Full text search doesn't find europe
The following bug has been logged on the website: Bug reference: 6654 Logged by: wbrana Email address: wbrana@gmail.com PostgreSQL version: 9.1.3 Operating system: Linux Description:=20=20=20=20=20=20=20=20 CREATE INDEX idx_post_text ON posts USING gin (to_tsvector('english'::regconfig, post_text::text))=20 select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe' returns no rows, but select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan' returns row with "Japan and Europe"
On Monday, May 21, 2012 07:26:38 PM wbrana@gmail.com wrote: > The following bug has been logged on the website: > > Bug reference: 6654 > Logged by: wbrana > Email address: wbrana@gmail.com > PostgreSQL version: 9.1.3 > Operating system: Linux > Description: > > CREATE INDEX idx_post_text ON posts USING gin > (to_tsvector('english'::regconfig, post_text::text)) > select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe' > returns no rows, but > select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan' > returns row with "Japan and Europe" The problem is that youre using to_tsvector('english' for parsing the text but don't specify the text yearch configuration for the query. The default english configuration does stemming, the default_text_search_configuration obviously not. Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe') Andres
Andres Freund <andres@anarazel.de> writes: > On Monday, May 21, 2012 07:26:38 PM wbrana@gmail.com wrote: >> CREATE INDEX idx_post_text ON posts USING gin >> (to_tsvector('english'::regconfig, post_text::text)) >> select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe' >> returns no rows, but >> select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan' >> returns row with "Japan and Europe" > The problem is that youre using to_tsvector('english' for parsing the text but > don't specify the text yearch configuration for the query. The default english > configuration does stemming, the default_text_search_configuration obviously > not. > Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe') BTW, a good way to debug this sort of issue is to look at the actual tsvector and tsquery values. regression=# select to_tsvector('english', 'Japan and Europe'); to_tsvector --------------------- 'europ':3 'japan':1 (1 row) regression=# select to_tsquery('english', 'Japan'); to_tsquery ------------ 'japan' (1 row) regression=# select to_tsquery('english', 'Europe'); to_tsquery ------------ 'europ' (1 row) If you just cast 'europe' directly to tsquery, which is what's going to happen in the first example, you get the lexeme 'europe' which doesn't match 'europ'. regards, tom lane