Bug with Tsearch and tsvector - Mailing list pgsql-bugs

From Donald Fraser
Subject Bug with Tsearch and tsvector
Date
Msg-id E7CE594F0C6149D48DA8D0D9937A4915@DEVELOP1
Responses Re: Bug with Tsearch and tsvector
List pgsql-bugs
PostgreSQL 8.3.10 (on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46))
OS: Linux Redhat EL 5.4
Database encoding: LATIN9

Using the default tsearch configuration, for 'english', text is being wrongly parsed into the tsvector type.
The fail condition is shown with the following example, using the ts_headline function to highlight the issue.

SELECT ts_headline('english', 'The annual financial report will shortly be posted on the Company’s web-site at
      <span lang="EN-GB">http://www.harewoodsolutions.co.uk/press.aspx</span><span lang="EN-GB" style=""></span><span style="">
      and a further announcement will be made once the annual financial report is available to be downloaded. </span>',
      to_tsquery(''), 'MaxWords=101, MinWords=100');

Output:
"The annual financial report will shortly be posted on the Company’s  web-site at
       http://www.harewoodsolutions.co.uk/press.aspx</span><span lang="EN-GB" style="">
      and a further announcement will be made once the annual financial report is available to be downloaded.  "

Expected output:
"The annual financial report will shortly be posted on the Company’s  web-site at
       http://www.harewoodsolutions.co.uk/press.aspx
      and a further announcement will be made once the annual financial report is available to be downloaded.  "

Regards
Donald Fraser
