Another tsearch bug... - Mailing list pgsql-hackers

From Christopher Kings-Lynne
Subject Another tsearch bug...
Msg-id GNELIHDDFBOCMGBFGEFOIENKCDAA.chriskl@familyhealth.com.au
In response to Please, apply patch  (Teodor Sigaev <teodor@stack.net>)
Responses Re: Another tsearch bug...  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
Hi guys,

Hate to keep coming up with these bugs without patches - but I really don't
have time to look into the source code atm :(

OK, attached is an example of the problem.  Notice how trademark and
copyright symbols are being indexed along with the word.  This means that if
someone searches for 'balance' in the attached data set, they won't find
anything, because the symbol is stored as part of the indexed word.

I'm not sure how this should be handled.  For English, it'd probably be
safe to strip high-ASCII characters from the index?  But you'd want to
leave accents and stuff in, I guess.  Tricky.
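
One possible way to get "strip the symbols, keep the accents" is to filter
on Unicode character category instead of on a plain high-ASCII cutoff.  The
following is only a rough sketch in Python, not tsearch's actual parser
code; the function names strip_symbols and index_terms are hypothetical,
and the rule "keep letters and digits only" is an assumption about what
the index should contain.

```python
import unicodedata


def strip_symbols(token: str) -> str:
    """Keep only letters and digits (Unicode categories L* and N*).

    Symbols such as the trademark and copyright signs fall in category
    'So' and are dropped; accented letters are in L* and are kept.
    """
    return "".join(
        ch for ch in token if unicodedata.category(ch)[0] in ("L", "N")
    )


def index_terms(text: str) -> list[str]:
    """Split text on whitespace and return the cleaned, lowercased terms.

    Hypothetical stand-in for the indexing step; not the tsearch API.
    """
    # Compose accents first so that 'e' + combining acute becomes a single
    # letter instead of a letter followed by a mark that would be stripped.
    text = unicodedata.normalize("NFC", text)
    terms = []
    for raw in text.split():
        term = strip_symbols(raw).lower()
        if term:  # skip tokens that consisted only of symbols
            terms.append(term)
    return terms
```

With a filter like this, a word followed by a trademark sign would be
indexed as the bare word, so a search for 'balance' could match it, while
a word like 'café' would keep its accent.  Whether digits and other
non-letter characters should survive is a separate decision.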

Anyway, just bringing it to your attention...

Chris

Attachment
