Thread: TSEARCH2: disable stemming in indexes and triggers
Hi all, I installed TSEARCH2 on Postgres8.1 (Debian). It runs all fine. I have 2 tables indexed, and created triggers to keep the vectorcolumns up to date. However, the text I indexed is a mix of Dutch and English and German. The default stemmingprocess is an annoyance for me. I would like to disable it. I found out that using 'simple' instead of 'default' when using to_tsvector() does excactly that, but I don't know how to change my triggers and indexes to keep doing the same (using 'simple'). My TSEARCH2 and trigger-skills are developing, but I lack the confidence to change thing on the site (which is live), so I thought I check in here. I have a table named tblvacature, that contains a bunch of columns that need to be indexed by TSEARCH. This is what I did (and it works, except for the fact I want to disable stemming) 1) ALTER TABLE tblvacature ADD COLUMN idxFTI tsvector; 2) UPDATE tblvacature SET idxFTI=to_tsvector('simple', coalesce(title,'') ||' '|| coalesce(shortintro,'') ||' '|| coalesce(werkgever,'') ||' '|| coalesce(vacaturesteller,'') ||' '|| coalesce(standplaats,'') ||' '|| coalesce(divafdelingwerkgever,'') ); 3) VACUUM FULL ANALYZE; 4) CREATE INDEX idxFTIvacture_idx ON tblvacature USING gist(idxFTI); 5) VACUUM FULL ANALYZE; 6) CREATE TRIGGER tsvectorupdate_vacature BEFORE UPDATE OR INSERT ON tblvacature FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxFTI, title, shortintro, werkgever, vacaturesteller, standplaats, divafdelingwerkgever); Step 2 worked fine: no stemming. But how do I cange my step6 (trigger) so it keeps using 'simple'? Or do I have to change the index itself (step4) too? Or both? Please advise. Thanks for your time. Regards, Erwin Moller ----------------------- PS: This is my first posting to pgsql-general: if I screw something up, please let me know.
On Thu, 31 May 2007, Erwin Moller wrote: > Hi all, > > I installed TSEARCH2 on Postgres8.1 (Debian). > It runs all fine. > I have 2 tables indexed, and created triggers to keep the vectorcolumns > up > to date. > > > However, the text I indexed is a mix of Dutch and English and German. > The default stemmingprocess is an annoyance for me. > I would like to disable it. > I found out that using 'simple' instead of 'default' when using > to_tsvector() does excactly that, but I don't know how to change my > triggers and indexes to keep doing the same (using 'simple'). > > > My TSEARCH2 and trigger-skills are developing, but I lack the confidence > to > change thing on the site (which is live), so I thought I check in here. > > > I have a table named tblvacature, that contains a bunch of columns that > need > to be indexed by TSEARCH. > > > This is what I did (and it works, except for the fact I want to disable > stemming) > > > 1) ALTER TABLE tblvacature ADD COLUMN idxFTI tsvector; > 2) UPDATE tblvacature SET idxFTI=to_tsvector('simple', > coalesce(title,'') ||' '|| > coalesce(shortintro,'') ||' '|| > coalesce(werkgever,'') ||' '|| > coalesce(vacaturesteller,'') ||' '|| > coalesce(standplaats,'') ||' '|| > coalesce(divafdelingwerkgever,'') ); > > > 3) VACUUM FULL ANALYZE; > > > 4) CREATE INDEX idxFTIvacture_idx ON tblvacature USING gist(idxFTI); > > > 5) VACUUM FULL ANALYZE; > > > 6) CREATE TRIGGER tsvectorupdate_vacature BEFORE UPDATE OR INSERT ON > tblvacature FOR EACH ROW EXECUTE > PROCEDURE tsearch2(idxFTI, title, shortintro, werkgever, > vacaturesteller, > standplaats, divafdelingwerkgever); > > > Step 2 worked fine: no stemming. > But how do I cange my step6 (trigger) so it keeps using 'simple'? > Or do I have to change the index itself (step4) too? Or both? > > > Please advise. tsearch trigger is just an example, you could write your own function ! for example, create function my_update() returns trigger as $$ BEGIN NEW.idxFTI=to_tsvector('simple', coalesce(title,'') ||' '|| coalesce(shortintro,'') ||' '|| coalesce(werkgever,'') ||' '|| coalesce(vacaturesteller,'') ||' '|| coalesce(standplaats,'') ||' '|| coalesce(divafdelingwerkgever,'') ); RETURN NEW; END; $$ language plpgsql; btw, why do you don't use setweight() function to assign different weights to the different attributes ? > > > Thanks for your time. > > > Regards, > Erwin Moller > > ----------------------- > PS: This is my first posting to pgsql-general: if I screw something up, > please let me know. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
> I found out that using 'simple' instead of 'default' when using > to_tsvector() does excactly that, but I don't know how to change my > triggers and indexes to keep doing the same (using 'simple'). Suppose, your database is initialized with C locale. So, just mark simple configuration as default: # update pg_ts_cfg set locale=null where ts_name='default'; # update pg_ts_cfg set locale='C' where ts_name='simple'; If your locale setting is not C then mark needed configuration with your locale.
On Thu, 2007-05-31 at 20:27, Teodor Sigaev wrote: > > I found out that using 'simple' instead of 'default' when using > > to_tsvector() does excactly that, but I don't know how to change my > > triggers and indexes to keep doing the same (using 'simple'). > > Suppose, your database is initialized with C locale. So, just mark > simple configuration as default: > > # update pg_ts_cfg set locale=null where ts_name='default'; > # update pg_ts_cfg set locale='C' where ts_name='simple'; Thanks Teodor. That did the trick. :-) At first I didn't see a change, but after reconnecting to the database it worked for some reason beyound my meager knowledge. Thanks to Oleg Bartunov too for his suggestion about writing my own procedure. This solution seemed simpler and worked the first time right away. Thanks for your help! Regards, Erwin Moller > > If your locale setting is not C then mark needed configuration with your > locale. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ >