Thread: tsearch2 questions
1. What is the advantage of the tsearch2() trigger? Why can't I write my
own trigger which does approximately:

UPDATE manuscript set manuscript_vector =
  setweight(to_tsvector(manuscript_genre), 'A') ||
  setweight(to_tsvector(manuscript_title), 'B') ||
  to_tsvector(manuscript_abstract);

2. Is there a way to know in advance the maximum return value of the
rank function? I have lots of other information to include in the
goodness-of-match score besides the fulltext match rank so I would
prefer a tsearch2 rank score between 0 and 1. Do I need to write my own
rank function?

--
Make April 15 just another day, visit http://fairtax.org
On Wed, 4 Jul 2007, Joshua N Pritikin wrote:

> 1. What is the advantage of the tsearch2() trigger? Why can't I write my
> own trigger which does approximately:

no advantage, it's just an example.

>
> UPDATE manuscript set manuscript_vector =
>   setweight(to_tsvector(manuscript_genre), 'A') ||
>   setweight(to_tsvector(manuscript_title), 'B') ||
>   to_tsvector(manuscript_abstract);
>
> 2. Is there a way to know in advance the maximum return value of the
> rank function? I have lots of other information to include in the
> goodness-of-match score besides the fulltext match rank so I would
> prefer a tsearch2 rank score between 0 and 1. Do I need to write my own
> rank function?

what's about simple normalization formulae, like rank/(rank+1) ?

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On Wed, Jul 04, 2007 at 10:59:46AM +0400, Oleg Bartunov wrote:
> On Wed, 4 Jul 2007, Joshua N Pritikin wrote:
> > 1. What is the advantage of the tsearch2() trigger? Why can't I write my
> > own trigger which does approximately:
>
> no advantage, it's just an example.

Please mention that in the documentation:

  tsearch2() trigger used to automatically update vector_column_name,
  my_filter_name is the function name to preprocess text_column_name.
  There are can be many functions and text columns specified in
  tsearch2() trigger. The following rule used: function applied to all
  subsequent text columns until next function occurs. Example, function
  dropatsymbol replaces all entries of @ sign by space.

  tsearch2() is an example. You are welcome to write your own trigger.

> > 2. Is there a way to know in advance the maximum return value of the
> > rank function? I have lots of other information to include in the
> > goodness-of-match score besides the fulltext match rank so I would
> > prefer a tsearch2 rank score between 0 and 1. Do I need to write my own
> > rank function?
>
> what's about simple normalization formulae, like rank/(rank+1) ?

I think you are suggesting that I use the best rank as the denominator
for the rank column. Yes, I suppose that will work. Thanks.
On 7/4/07, Joshua N Pritikin <jpritikin@pobox.com> wrote:
> Please mention that in the documentation:

Don't you think this is perfectly clear?

"If you want to do something specific with columns, you may write your very
own trigger function using plpgsql or other procedural languages (but not
SQL, unfortunately) and use it instead of tsearch2 trigger."

> > what's about simple normalization formulae, like rank/(rank+1) ?
>
> I think you are suggesting that I use the best rank as the denominator
> for the rank column. Yes, I suppose that will work.

Actually, Oleg did not suggest using the best rank. He suggested using the
formula as given, rank/(rank+1), which maps the rank into the range 0 to 1.

depesz
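
For readers following along, here is a minimal sketch of what such a hand-written trigger could look like for the manuscript table from the first message. The function name manuscript_vector_update and the trigger name manuscript_vector_trg are made up for illustration, and the coalesce() calls are an added assumption: they keep a NULL column from turning the whole concatenated vector into NULL. It assumes tsearch2 is installed and a default text search configuration is set.

```sql
-- Hypothetical trigger function: rebuilds manuscript_vector from the three
-- text columns, weighting genre as 'A' and title as 'B'.
CREATE OR REPLACE FUNCTION manuscript_vector_update() RETURNS trigger AS $$
BEGIN
    NEW.manuscript_vector :=
        setweight(to_tsvector(coalesce(NEW.manuscript_genre, '')), 'A') ||
        setweight(to_tsvector(coalesce(NEW.manuscript_title, '')), 'B') ||
        to_tsvector(coalesce(NEW.manuscript_abstract, ''));
    RETURN NEW;  -- a BEFORE row trigger must return the (modified) row
END;
$$ LANGUAGE plpgsql;

-- Hypothetical trigger name; fires before each inserted or updated row.
CREATE TRIGGER manuscript_vector_trg
    BEFORE INSERT OR UPDATE ON manuscript
    FOR EACH ROW EXECUTE PROCEDURE manuscript_vector_update();
```

Unlike the UPDATE statement in the original question, which rewrites every row of the table, a BEFORE row trigger only touches the row being written by assigning to NEW.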
On Wed, Jul 04, 2007 at 10:40:11AM +0200, hubert depesz lubaczewski wrote:
> On 7/4/07, Joshua N Pritikin <jpritikin@pobox.com> wrote:
> > Please mention that in the documentation:
>
> dont you think this is perfeclty clear?
>
> "If you want to do something specific with columns, you may write your very
> own trigger function using plpgsql or other procedural languages (but not
> SQL, unfortunately) and use it instead of tsearch2 trigger."

From where are you quoting? I was quoting from:

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-ref.html

> > > what's about simple normalization formulae, like rank/(rank+1) ?
> >
> > I think you are suggesting that I use the best rank as the denominator
> > for the rank column. Yes, I suppose that will work.
>
> actually oleg supposed not to use best rank, but just use the formula as
> given - rank/(rank+1) to get rank in range of 0 to 1.

OK, then what does the +1 mean in your formulae? Consider these results
from [1]. rank/(rank+1): 0.19/.1 = 1.9, .1/.1 = 1, etc. That doesn't
make sense. The reciprocal also doesn't make sense. So what does Oleg
mean? I was guessing that Oleg meant to divide the rank column by the
first rank, that is, by 0.19 so you would get 1, .52, .52, etc.

 id |                       headline                        | rank
----+-------------------------------------------------------+------
  3 | <b>crawling</b> over cobbles in a low <b>passage</b>. | 0.19
  1 | <b>crawl</b> over cobbles leads inward to the west.   | 0.1
  4 | <b>passages</b> lead east, north, and south.          | 0.1
  5 | <b>crawl</b> slants up.                               | 0.1
  7 | <b>passage</b> here is blocked by a recent cave-in.   | 0.1

Am I being stupid?

[1] http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-guide.html
On 7/4/07, Joshua N Pritikin <jpritikin@pobox.com> wrote:
> From where are you quoting? I was quoting from:
>
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-ref.html

I was quoting the file
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
or, actually, its copy provided with the PostgreSQL sources in the
contrib/tsearch2/docs directory.

> > actually oleg supposed not to use best rank, but just use the formula as
> > given - rank/(rank+1) to get rank in range of 0 to 1.
>
> OK, then what does the +1 mean in your formulae? Consider these results
> from [1]. rank/(rank+1): 0.19/.1 = 1.9, .1/.1 = 1, etc. That doesn't
> make sense. The reciprocal also doesn't make sense. So what does Oleg
> mean? I was guessing that Oleg meant to divide the rank column by the
> first rank, that is, by 0.19 so you would get 1, .52, .52, etc.

+1 means: add one to the rank of the same row.

For example:
  for rank = 0.1 you get: 0.1/(0.1+1) = 0.1/1.1 = 0.0909
  for rank = 0.5 you get: 0.5/(0.5+1) = 0.5/1.5 = 0.3333

I think the notation rank+1 is pretty readable.

Additionally, sorry, but I don't understand your calculations. What is
0.19/.1? How did you get the .1?

depesz
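
The arithmetic above can be reproduced directly in SQL. This is only a sketch: the literal values 0.1, 0.19 and 0.5 are the ranks mentioned in this thread, and the aliases t, r, raw_rank and normalized_rank are made up for illustration. The VALUES list in FROM assumes PostgreSQL 8.2 or later.

```sql
-- Each rank is divided by itself plus one, row by row.
-- For any rank >= 0 the result stays below 1, and a larger rank
-- always gives a larger normalized value, so the ordering is preserved.
SELECT r                     AS raw_rank,
       round(r / (r + 1), 4) AS normalized_rank
FROM (VALUES (0.1), (0.19), (0.5)) AS t(r)
ORDER BY normalized_rank DESC;
```

On a real table the literal r would presumably be replaced by the rank expression itself, for example tsearch2's rank(manuscript_vector, query). PostgreSQL releases with built-in text search offer the same rank/(rank+1) mapping as normalization option 32 of ts_rank.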
On Wed, Jul 04, 2007 at 11:08:21AM +0200, hubert depesz lubaczewski wrote:
> On 7/4/07, Joshua N Pritikin <jpritikin@pobox.com> wrote:
> > From where are you quoting? I was quoting from:
> >
> > http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch2-ref.html
>
> i was quoting file
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html

So that one is fine. Only the reference could use some clarification.

> > > actually oleg supposed not to use best rank, but just use the formula as
> > > given - rank/(rank+1) to get rank in range of 0 to 1.
> >
> > OK, then what does the +1 mean in your formulae? Consider these results
> > from [1]. rank/(rank+1): 0.19/.1 = 1.9, .1/.1 = 1, etc. That doesn't
> > make sense. The reciprocal also doesn't make sense. So what does Oleg
> > mean? I was guessing that Oleg meant to divide the rank column by the
> > first rank, that is, by 0.19 so you would get 1, .52, .52, etc.
>
> +1 means: add one to.
> for example: for rank = 0.1 you get: 0.1/(0.1+1) = 0.1/1.1 = 0.0909
> for rank = 0.5 you get: 0.5/(0.5+1) = 0.5/1.5 = 0.3333

D'oh! I see.

> i think that notation: rank+1 is pretty readable.
>
> additionally - sorry but i dont understand your calculations. what is
> 0.19/.1 ? how did you get the .1?

I was imagining that "rank+1" was the second row of the rank column.
Sorry for the confusion.