Re: Bunch of tsearch fixes and cleanup - Mailing list pgsql-patches

From	Heikki Linnakangas
Subject	Re: Bunch of tsearch fixes and cleanup
Date	August 23, 2007 17:30:42
Msg-id	46CDEE4D.906@enterprisedb.com
In response to	Re: Bunch of tsearch fixes and cleanup (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Bunch of tsearch fixes and cleanup
List	pgsql-patches

Tom Lane wrote:
> Something that was annoying me yesterday was that it was not clear
> whether we had fixed every single place that uses a tsearch config file
> to assume that the file is in UTF8 and should be converted to database
> encoding.  So I was thinking of hardwiring the "recode" part into
> readstopwords, and using wordop just for the "lowercase" part, which
> seemed to me like a saner division of labor.  That is, UTF8 is a policy
> that we want to enforce globally, but lowercasing maybe not, and this
> still leaves the door open for more processing besides lowercasing.

I think we also want to always run input files through pg_verify_mbstr.
We do it for stopwords, and synonym files (though incorrectly), but not
for thesaurus files or ispell files. It's probably best to do that
within the recode-function as well.
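
For illustration only, here is a minimal standalone sketch of the division of labor discussed above: every config-file line is validated as UTF-8 unconditionally, and lowercasing stays an optional per-caller step. It is not the PostgreSQL code. The names is_valid_utf8, load_config_line, wordop_fn and lowercase_ascii are hypothetical, the validator is only a stand-in for a check like pg_verify_mbstr, and the conversion to the database encoding is left as a comment.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an encoding check such as pg_verify_mbstr:
 * returns true if buf[0..len) is well-formed UTF-8.
 */
bool
is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i];
        size_t        n;        /* number of continuation bytes expected */
        unsigned int  cp;       /* decoded code point */
        unsigned int  min;      /* smallest code point allowed for this length */

        if (c < 0x80)
        {
            i++;
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else
            return false;       /* stray continuation byte or invalid lead byte */

        if (len - i <= n)
            return false;       /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= n; k++)
        {
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        /* reject overlong forms, surrogates, and values beyond U+10FFFF */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += n + 1;
    }
    return true;
}

/* Optional per-word processing hook, e.g. lowercasing (hypothetical type). */
typedef void (*wordop_fn) (char *word);

/* Example wordop: lowercase ASCII letters only, leave multibyte bytes alone. */
void
lowercase_ascii(char *word)
{
    for (; *word; word++)
    {
        if ((unsigned char) *word < 0x80)
            *word = (char) tolower((unsigned char) *word);
    }
}

/*
 * Always verify the line; apply wordop only if the caller supplied one.
 * Returns false and leaves the line untouched if it is not valid UTF-8.
 */
bool
load_config_line(char *line, wordop_fn wordop)
{
    if (!is_valid_utf8((const unsigned char *) line, strlen(line)))
        return false;

    /* A real implementation would convert UTF-8 to the database encoding here. */

    if (wordop != NULL)
        wordop(line);
    return true;
}
```

In this sketch a stopword reader might call load_config_line(line, lowercase_ascii), while a reader that must preserve case would pass NULL; both still get the encoding check.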

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
