Thread: tsearch2: How to use different configurations for two columns?
Hi all, is it possible to index 2 columns in a table with tsearch2 using different configurations for each column index? I have a table publications that is defined as follows (simplified): CREATE TABLE publications ( title text, author_list text, fti_title tsvector, fti_author_list tsvector, ) WITHOUT OIDS; CREATE INDEX idx_fti_author_list ON publications USING gist (fti_author_list); CREATE INDEX idx_fti_title ON publications USING gist (fti_title); CREATE TRIGGER tsvectorupdate_title BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE tsearch2('fti_title', 'title'); CREATE TRIGGER tsvectorupdate_author_list BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE tsearch2('fti_author_list', 'author_list'); The column 'author_list' contains names of authors with many abbreviated first names, e. g. "S. Vollmer Michael F. Smith". These abbreviated first names "S." and "F." shouldn't be indexed. To do this, I created a tsearch2 configuration 'authors' that uses a stopword list with "a...z" as stopwords. The configuration seems to work fine, but I can't get tsearch2 to use the 'author' config for column 'author_list' and 'default' config for column 'title'. Is there any way to accomplish this? If the solution is not possible or too complicated, as a workaround I could use a function that deletes the abbreviated first names before the column is indexed - similar to the function "dropatsymbol()" mentioned in the tsearch2 documentation. I tried to use replace(), but I'd need a function with regexps. Thanks in advance, - Stephan
Attachment
> > The configuration seems to work fine, but I can't get tsearch2 to > use the 'author' config for column 'author_list' and 'default' > config for column 'title'. Is there any way to accomplish this? > It's totally possible. I don't think you can accomplish this using the simple tsearch triggers as you declared. The triggers fire using the configuration set by curcfg() or will search for the one matching your locale. You could write a one trigger for the table to handle both. So you would have one custom function that updates both tsvector columns with different configurations. Andy
> You could write a one trigger for the table to handle both. > Something like this : ----- CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS ' DECLARE BEGIN NEW.fti_title = to_tsvector(''default'', NEW.title); NEW.fti_author_list = to_tsvector(''simple'', NEW.author_list); RETURN NEW; END; ' LANGUAGE 'PLPGSQL'; CREATE TRIGGER tsvectorupdate_all BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE multi_tsearch2(); ---- You can modify the function to be slightly more configurable with parameters taking the column names, and config names and make it a little more reusable. You can accomplish what you want though. Andy
[Andrew, sorry for the PM - I pressed the wrong button.] Andrew J. Kopciuch wrote: >>You could write a one trigger for the table to handle both. > > Something like this : [...] > You can modify the function to be slightly more configurable with parameters > taking the column names, and config names and make it a little more reusable. > You can accomplish what you want though. I'm a total beginner in regards to pgsql functions, so thanks a lot for your example. When I have some time tomorrow, I'll try it out and also take a look at the PL/pgSQL docs. Thanks for your help, - Stephan
Attachment
Andrew J. Kopciuch wrote: >>You could write a one trigger for the table to handle both. > > Something like this : [...] > CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS ' [...] I tested the function today and it's exactly what I had in mind. Works perfectly! Thanks, Stephan