Re: default_text_search_config and expression indexes - Mailing list pgsql-hackers
From | Oleg Bartunov |
---|---|
Subject | Re: default_text_search_config and expression indexes |
Date | |
Msg-id | Pine.LNX.4.64.0707312219070.18739@sn.sai.msu.ru Whole thread Raw |
In response to | Re: default_text_search_config and expression indexes (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: default_text_search_config and expression
indexes
|
List | pgsql-hackers |
On Mon, 30 Jul 2007, Bruce Momjian wrote: > Bruce Momjian wrote: >> We have to decide if we want a GUC default_text_search_config, and if so >> when can it be changed. >> >> Right now there are three ways to create a tsvector (or tsquery) >> >> ::tsvector >> to_tsvector(value) >> to_tsvector(config, value) >> >> (ignoring plainto_tsvector) >> >> Only the last one specifies the configuration. The others use the >> configuration specified by default_text_search_config. (We had an >> previous discussion on what the default value of >> default_text_search_config should be, and it was decided it should be >> set via initdb based on a flag or the locale.) >> >> Now, because most people use a single configuration, they can just set >> default_text_search_config and there is no need to specify the >> configuration name. >> >> However, expression indexes cause a problem here: >> >> http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX >> >> We recommend that users create an expression index on the column they >> want to do a full text search on, e.g. >> >> CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body)); >> >> However, the big problem is that the expressions used in expression >> indexes should not change their output based on the value of a GUC >> variable (because it would corrupt the index), but in the case above, >> default_text_search_config controls what configuration is used, and >> hence the output of to_tsvector is changed if default_text_search_config >> changes. >> >> We have a few possible options: >> >> 1) Document the problem and do nothing else. >> 2) Make default_text_search_config a postgresql.conf-only >> setting, thereby making it impossible to change by non-super >> users, or make it a super-user-only setting. >> 3) Remove default_text_search_config and require the >> configuration to be specified in each function call. >> >> If we remove default_text_search_config, it would also make ::tsvector >> casting useless as well. > > OK, I just found a case that I think is going to make #3 a requirement > (remove default_text_search_config). > > How is a CREATE INDEX ... to_tsvector(col) going to restore from a > pg_dump? I see no way of guaranteeing that the > default_text_search_config is correct on the restore, and in fact I > don't think we have any way of knowing the default_text_search_config > used for the index. > > And if we have to require the configuration name in CREATE INDEX, it has > to be used in WHERE, so we might as well just remove the default > capability and always require the configuration name. this is very rare use case for text searching 1. expression index without configuration name 2. default_text_search_config can be changed by somebody If somebody really need it, then he should be adviced to use configuration name, else we don't guarantee that somebody could change default_text_search_config variable and this could lead to incorrect dump/restore. I don't think we should remove default_text_search_config because of this rare case. Regards, Oleg PS. Bruce, I'm in the mountains the Northern Caucasia and internet is a bit unreliable :( _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
pgsql-hackers by date: