Thread: tsearch2 punctuation question
For example:

select to_tsvector('cat,dog apple/orange');

           to_tsvector
----------------------------------
 'cat':1 'dog':2 'apple/orange':3
(1 row)

Is there a setting that allows me to specify that strings containing the '/' should be parsed into separate words? As is, I can't find 'apple' or 'orange'.

Thanks,

John

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
On Thu, 26 Apr 2007, John DeSoi wrote:

> For example:
>
> select to_tsvector('cat,dog apple/orange');
>
>            to_tsvector
> ----------------------------------
>  'cat':1 'dog':2 'apple/orange':3
> (1 row)
>
> Is there a setting that allows me to specify that strings containing the '/'
> should be parsed into separate words? As is, I can't find 'apple' or
> 'orange'.

There is no such setting. You can write your own parser, or a dictionary for the 'file' token type. We have a howto, see http://mira.sai.msu.su/~megera/pgsql/ftsdoc/appendixes.html

If you want a simple parser, it is probably better to write one. Probably the simplest way is to write a dictionary that returns {apple/orange, apple, orange}.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
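A rough way to see what indexing {apple/orange, apple, orange} would give, without writing a dictionary: the sketch below is not the dictionary approach itself. It only approximates the result by concatenating two tsvectors, one from the raw text and one from a copy with punctuation turned into spaces. It assumes that regexp_replace (PostgreSQL 8.1 or later) and the tsvector || operator are available. Positions in the second vector are shifted by the concatenation, so phrase-style ranking may differ from what a real dictionary would produce.

```sql
-- Sketch only: emulate "index both the compound and its parts"
-- by concatenating two tsvectors built from the same input.
SELECT to_tsvector('cat,dog apple/orange')
    || to_tsvector(
         -- replace every run of non-word characters with one space
         regexp_replace('cat,dog apple/orange', '[^[:alnum:]_]+', ' ', 'g')
       );
```

A query for 'apple', for 'orange', or for the original 'apple/orange' token should then all match the combined vector.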
> Is there a setting that allows me to specify that strings containing
> the '/' should be parsed into separate words? As is, I can't find
> 'apple' or 'orange'.

No setting; I think you would have to mess with tsearch2 dictionaries. A far easier approach is to have your application simply split the words apart, or even write a wrapper function to do it for you within Postgres, e.g.

CREATE OR REPLACE FUNCTION wordsplit(text)
RETURNS text
LANGUAGE plperl
AS $_$
  my $string = shift;
  # Replace every non-word character with a space
  $string =~ s/\W/ /g;
  return $string;
$_$;

SELECT to_tsvector(wordsplit('cat,dog apple/orange'));

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200704261140
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
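If plperl is not installed, the same wrapper idea can probably be written in plain SQL. This is a minimal sketch, assuming PostgreSQL 8.1 or later for regexp_replace; the name wordsplit_sql is hypothetical and not from the thread. It uses a POSIX bracket expression instead of \W so that it does not depend on backslash-escaping settings for string literals, and it collapses each run of punctuation into a single space, which makes no difference to the resulting tsvector.

```sql
-- Hypothetical pure-SQL variant of the wordsplit() wrapper
CREATE OR REPLACE FUNCTION wordsplit_sql(text)
RETURNS text
LANGUAGE sql
IMMUTABLE
AS $_$
  -- 'g' replaces all matches, not only the first
  SELECT regexp_replace($1, '[^[:alnum:]_]+', ' ', 'g');
$_$;

SELECT to_tsvector(wordsplit_sql('cat,dog apple/orange'));
```

Note that what counts as alphanumeric in [:alnum:] depends on the database locale, so non-ASCII text may be handled differently from the Perl version.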