default_text_search_config and expression indexes - Mailing list pgsql-hackers

From Bruce Momjian
Subject default_text_search_config and expression indexes
Msg-id 200707262223.l6QMNpo23400@momjian.us
In response to Re: How does the tsearch configuration get selected?  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: default_text_search_config and expression indexes  ("Pavel Stehule" <pavel.stehule@gmail.com>)
Re: default_text_search_config and expression indexes  (Magnus Hagander <magnus@hagander.net>)
Re: default_text_search_config and expression indexes  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Oleg Bartunov wrote:
> >> Second, I can't figure out how to reference a non-default
> >> configuration.
> >
> > See the multi-argument versions of to_tsvector etc.
> >
> > I do see a problem with having to_tsvector(config, text) plus
> > to_tsvector(text) where the latter implicitly references a config
> > selected by a GUC variable: how can you tell whether a query using the
> > latter matches a particular index using the former?  There isn't
> > anything in the current planner mechanisms that would make that work.
> 
> Probably, having default text search configuration is not a good idea
> and we could just require it as a mandatory parameter, which could
> eliminate many confusion with selecting text search configuration.

We have to decide if we want a GUC default_text_search_config, and if so
when can it be changed.

Right now there are three ways to create a tsvector (or tsquery)
    ::tsvector
    to_tsvector(value)
    to_tsvector(config, value)

(ignoring plainto_tsvector)

Only the last one specifies the configuration. The others use the
configuration specified by default_text_search_config.  (We had a
previous discussion on what the default value of
default_text_search_config should be, and it was decided it should be
set via initdb based on a flag or the locale.)
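
To make the distinction concrete, here is a minimal sketch of the two
function forms. It assumes the to_tsvector signatures and the
default_text_search_config setting behave as described above, and that
configurations named english and simple are installed; the sample
sentence is made up for illustration.

```sql
-- Assumption: the 'english' and 'simple' configurations exist in pg_catalog.
SET default_text_search_config = 'pg_catalog.english';

-- One-argument form: the configuration is taken from the GUC, so the
-- result depends on whatever default_text_search_config is at call time.
SELECT to_tsvector('The cats are running');

-- Two-argument form: the configuration is named explicitly, so the
-- result does not depend on default_text_search_config.
SELECT to_tsvector('simple', 'The cats are running');
```

The two calls would typically return different lexemes for the same
input, since the english configuration stems words and drops stop
words while simple mostly just lowercases them.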

Now, because most people use a single configuration, they can just set
default_text_search_config and there is no need to specify the
configuration name.

However, expression indexes cause a problem here:
http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

We recommend that users create an expression index on the column they
want to do a full text search on, e.g.
    CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));

However, the big problem is that the expressions used in expression
indexes should not change their output based on the value of a GUC
variable (because it would corrupt the index), but in the case above,
default_text_search_config controls what configuration is used, and
hence the output of to_tsvector is changed if default_text_search_config
changes.

We have a few possible options:
    1) Document the problem and do nothing else.
    2) Make default_text_search_config a postgresql.conf-only
       setting, thereby making it impossible to change by non-super
       users, or make it a super-user-only setting.
    3) Remove default_text_search_config and require the
       configuration to be specified in each function call.
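
As a rough illustration of what option 3 would mean for users, the
configuration would be named in both the index expression and the
query, so the two always agree no matter what any setting says. This
is only a sketch: it assumes an english configuration exists, reuses
the pgweb table from the example above, and the index name and the
title column are hypothetical.

```sql
-- The index expression names its configuration explicitly, so its
-- output cannot change when a GUC changes.
CREATE INDEX pgweb_en_idx ON pgweb
    USING gin (to_tsvector('english', body));

-- The query repeats the same expression, including the configuration,
-- so it can be matched against the index above.
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
```

The cost is verbosity: every call site has to repeat the configuration
name, which is exactly what a default was meant to avoid.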
 

If we remove default_text_search_config, it would also make ::tsvector
casting useless.

-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +
