Re: TS: Limited cover density ranking - Mailing list pgsql-hackers
From | Oleg Bartunov |
---|---|
Subject | Re: TS: Limited cover density ranking |
Date | |
Msg-id | Pine.LNX.4.64.1201282302520.12612@sn.sai.msu.ru |
In response to | TS: Limited cover density ranking (karavelov@mail.bg) |
List | pgsql-hackers |
I suggest you work on a more general approach; see
http://www.sai.msu.su/~megera/wiki/2009-08-12 for an example.

btw, I don't like that you changed the ts_rank_cd arguments.

Oleg

On Fri, 27 Jan 2012, karavelov@mail.bg wrote:

> Hello,
>
> I have developed a variation of the cover density ranking functions that
> counts only covers that are shorter than a specified limit. It is useful
> for finding combinations of terms that appear near one another. Here is
> an example of usage:
>
> -- normal cover density ranking : not changed
> luben=> select ts_rank_cd(to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
>  ts_rank_cd
> ------------
>   0.0333333
> (1 row)
>
> -- limited to 2
> luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
>  ts_rank_cd
> ------------
>           0
> (1 row)
>
> luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
>  ts_rank_cd
> ------------
>         0.1
> (1 row)
>
> -- limited to 3
> luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
>  ts_rank_cd
> ------------
>   0.0333333
> (1 row)
>
> luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
>  ts_rank_cd
> ------------
>    0.133333
> (1 row)
>
> Find attached a patch against the 9.1.2 sources. I preferred to make a
> patch rather than a separate extension because it is only a 1-statement
> change in the calc_rank_cd function. If I had to make an extension, a lot
> of code would be duplicated between backend/utils/adt/tsrank.c and the
> extension.
>
> I have some questions:
>
> 1. Is it interesting to develop it further (documentation, cleanup, etc.)
> for inclusion in one of the next versions? If this is the case, there are
> some further questions:
>
> - should I overload ts_rank_cd (as in the examples above and the patch) or
>   should I define a new set of functions, for example ts_rank_lcd?
> - should I define these new SQL-level functions in core, or should I go
>   only with this 2-line change in calc_rank_cd() and define the new
>   functions as an extension? If we prefer the latter, could I overload
>   core functions with functions defined in extensions?
> - and finally there is always the possibility to duplicate the code and
>   make an independent extension.
>
> 2. If I run the patched version on a cluster that was initialized with an
> unpatched server, is there a way to register the new functions in the
> system catalog without reinitializing the cluster?
>
> Best regards
> luben
>
> --
> Luben Karavelov

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
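
The behaviour shown in the quoted examples can be modelled outside the server. What follows is a minimal Python sketch, not the code from tsrank.c or from the attached patch. It rests on several assumptions: every lexeme carries the same default weight of 0.1, no normalization is applied, each minimal cover contributes weight / (1 + number of non-query words inside it), and a cover is kept when the distance between its first and last position is at most the limit (this last rule is inferred from the example outputs only). The names minimal_covers and rank_cd are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

BASE_WEIGHT = 0.1  # assumption: every lexeme has the same default weight


def minimal_covers(positions: Dict[str, List[int]]) -> List[Tuple[int, int]]:
    """Return (begin, end) of every minimal span that contains all terms."""
    events = sorted((p, term) for term, plist in positions.items() for p in plist)
    last_seen: Dict[str, int] = {}
    covers: List[Tuple[int, int]] = []
    for pos, term in events:
        last_seen[term] = pos
        if len(last_seen) < len(positions):
            continue  # some query term has not occurred yet
        begin = min(last_seen.values())
        # A span ending here is minimal only if it starts after the previous cover.
        if not covers or begin > covers[-1][0]:
            covers.append((begin, pos))
    return covers


def rank_cd(positions: Dict[str, List[int]], limit: Optional[int] = None) -> float:
    """Simplified cover density score; limit=None means no length limit."""
    n_terms = len(positions)
    score = 0.0
    for begin, end in minimal_covers(positions):
        span = end - begin
        if limit is not None and span > limit:
            continue  # assumed rule: skip covers longer than the limit
        noise = max(span - (n_terms - 1), 0)  # words in the cover that are not query terms
        score += BASE_WEIGHT / (1 + noise)
    return score


# Positions of the query terms 'a' and 'd' in the two example documents.
short_doc = {"a": [1], "d": [4]}           # 'a b c d e g h i j k'
long_doc = {"a": [1, 11], "d": [4, 12]}    # 'a b c d e g h i j k a d'

print(rank_cd(short_doc))      # unlimited
print(rank_cd(short_doc, 2))   # limited to 2
print(rank_cd(long_doc, 2))
print(rank_cd(long_doc, 3))
```

Under these assumptions the sketch gives 0.1/3, 0, 0.1 and 0.1/3 + 0.1 for the four calls, which matches the values in the quoted psql sessions. In the long document the cover from d at position 4 to a at position 11 is excluded by both limits, which is the point of the proposed variation.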