Re: BUG #6654: Full text search doesn't find europe - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #6654: Full text search doesn't find europe
Date
Msg-id 9098.1337641317@sss.pgh.pa.us
In response to Re: BUG #6654: Full text search doesn't find europe  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
Andres Freund <andres@anarazel.de> writes:
> On Monday, May 21, 2012 07:26:38 PM wbrana@gmail.com wrote:
>> CREATE INDEX idx_post_text ON posts USING gin
>> (to_tsvector('english'::regconfig, post_text::text))
>> select *  from v_search WHERE to_tsvector('english', post_text) @@ 'europe'
>> returns no rows, but
>> select *  from v_search WHERE to_tsvector('english', post_text) @@ 'japan'
>> returns row with "Japan and Europe"

> The problem is that you're using to_tsvector('english', ...) to parse the text but
> don't specify the text search configuration for the query. The default english
> configuration does stemming; the default_text_search_configuration obviously
> does not.
> Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe')

BTW, a good way to debug this sort of issue is to look at the actual
tsvector and tsquery values.

regression=# select to_tsvector('english', 'Japan and Europe');
     to_tsvector
---------------------
 'europ':3 'japan':1
(1 row)

regression=# select to_tsquery('english', 'Japan');
 to_tsquery
------------
 'japan'
(1 row)

regression=# select to_tsquery('english', 'Europe');
 to_tsquery
------------
 'europ'
(1 row)

If you just cast 'europe' directly to tsquery, which is what's going to
happen in the first example, you get the lexeme 'europe' which doesn't
match 'europ'.
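
To make the mismatch concrete, here is a minimal sketch that compares the two forms side by side. It assumes the same 'english' configuration as above and uses the literal string 'Japan and Europe' in place of the post_text column.

```sql
-- Cast only: no text search configuration is applied, so the lexeme is not stemmed.
select 'europe'::tsquery;

-- The unstemmed lexeme 'europe' is compared against the stemmed 'europ' in the tsvector.
select to_tsvector('english', 'Japan and Europe') @@ 'europe'::tsquery;

-- Building the query with the same configuration stems it to 'europ' first.
select to_tsvector('english', 'Japan and Europe') @@ to_tsquery('english', 'europe');
```

Given the tsvector and tsquery values shown earlier, the second statement should return false and the third true. The practical rule is to build the tsquery with the same configuration that built the tsvector.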

            regards, tom lane
