Thread: tsearch2: How to use different configurations for two columns?

tsearch2: How to use different configurations for two columns?

From
Stephan Vollmer
Date:
Hi all,

is it possible to index 2 columns in a table with tsearch2 using
different configurations for each column index?

I have a table publications that is defined as follows (simplified):

CREATE TABLE publications
(
  title           text,
  author_list     text,
  fti_title       tsvector,
  fti_author_list tsvector,
)
WITHOUT OIDS;

CREATE INDEX idx_fti_author_list ON publications
  USING gist (fti_author_list);

CREATE INDEX idx_fti_title ON publications
  USING gist (fti_title);

CREATE TRIGGER tsvectorupdate_title
  BEFORE INSERT OR UPDATE ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fti_title', 'title');

CREATE TRIGGER tsvectorupdate_author_list
  BEFORE INSERT OR UPDATE ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fti_author_list', 'author_list');

The column 'author_list' contains names of authors with many
abbreviated first names, e. g. "S. Vollmer  Michael F. Smith". These
abbreviated first names "S." and "F." shouldn't be indexed. To do
this, I created a tsearch2 configuration 'authors' that uses a
stopword list with "a...z" as stopwords.

The configuration seems to work fine, but I can't get tsearch2 to
use the 'author' config for column 'author_list' and 'default'
config for column 'title'. Is there any way to accomplish this?


If the solution is not possible or too complicated, as a workaround
I could use a function that deletes the abbreviated first names
before the column is indexed - similar to the function
"dropatsymbol()" mentioned in the tsearch2 documentation. I tried to
use replace(), but I'd need a function with regexps.

Thanks in advance,

- Stephan

Attachment

Re: tsearch2: How to use different configurations for two columns?

From
"Andrew J. Kopciuch"
Date:
>
> The configuration seems to work fine, but I can't get tsearch2 to
> use the 'author' config for column 'author_list' and 'default'
> config for column 'title'. Is there any way to accomplish this?
>

It's totally possible.  I don't think you can accomplish this using the simple
tsearch triggers as you declared.  The triggers fire using the configuration
set by curcfg() or will search for the one matching your locale.

You could write a one trigger for the table to handle both.

So you would have one custom function that updates both tsvector columns with
different configurations.


Andy

Re: tsearch2: How to use different configurations for two columns?

From
"Andrew J. Kopciuch"
Date:
> You could write a one trigger for the table to handle both.
>

Something like this :


-----

CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS '
DECLARE
BEGIN
    NEW.fti_title = to_tsvector(''default'', NEW.title);
    NEW.fti_author_list = to_tsvector(''simple'', NEW.author_list);
    RETURN NEW;
END;
' LANGUAGE 'PLPGSQL';

CREATE TRIGGER tsvectorupdate_all
  BEFORE INSERT OR UPDATE ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE multi_tsearch2();

----


You can modify the function to be slightly more configurable with parameters
taking the column names, and config names and make it a little more reusable.
You can accomplish what you want though.



Andy

Re: tsearch2: How to use different configurations for two

From
Stephan Vollmer
Date:
[Andrew, sorry for the PM - I pressed the wrong button.]

Andrew J. Kopciuch wrote:
>>You could write a one trigger for the table to handle both.
>
> Something like this :

[...]

> You can modify the function to be slightly more configurable with parameters
> taking the column names, and config names and make it a little more reusable.
> You can accomplish what you want though.

I'm a total beginner in regards to pgsql functions, so thanks a lot
for your example. When I have some time tomorrow, I'll try it out
and also take a look at the PL/pgSQL docs.

Thanks for your help,

- Stephan

Attachment

Re: tsearch2: How to use different configurations for two

From
Stephan Vollmer
Date:
Andrew J. Kopciuch wrote:
>>You could write a one trigger for the table to handle both.
>
> Something like this :
[...]
> CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS '
[...]
I tested the function today and it's exactly what I had in mind.
Works perfectly!

Thanks, Stephan

Attachment