TS: Limited cover density ranking - Mailing list pgsql-hackers

From karavelov@mail.bg
Subject TS: Limited cover density ranking
Date
Msg-id c4bd0b01f3398372af9572f4913bdb6c.mailbg@mail.bg
Responses Re: TS: Limited cover density ranking  (Sushant Sinha <sushant354@gmail.com>)
Re: TS: Limited cover density ranking  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
Hello,

I have developed a variation of the cover density ranking functions that counts only covers that are smaller than a specified limit. It is useful for finding combinations of terms that appear near one another. Here is an example of usage:

-- normal cover density ranking: not changed
luben=> select ts_rank_cd(to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
 ts_rank_cd
------------
  0.0333333
(1 row)

-- limited to 2
luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
 ts_rank_cd
------------
          0
(1 row)

luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
 ts_rank_cd
------------
        0.1
(1 row)

-- limited to 3
luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
 ts_rank_cd
------------
  0.0333333
(1 row)

luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
 ts_rank_cd
------------
   0.133333
(1 row)

Find attached a patch against the 9.1.2 sources. I preferred to make a patch rather than a separate extension because it is only a one-statement change in the calc_rank_cd function. If I had to make an extension, a lot of code would be duplicated between backend/utils/adt/tsrank.c and the extension.

I have some questions:

1. Is it interesting to develop it further (documentation, cleanup, etc.) for inclusion in one of the next versions? If this is the case, there are some further questions:

- Should I overload ts_rank_cd (as in the examples above and the patch), or should I define a new set of functions, for example ts_rank_lcd?
- Should I define these new SQL-level functions in core, or should I go with only this two-line change in calc_rank_cd() and define the new functions as an extension? If we prefer the latter, could I overload core functions with functions defined in extensions?
- And finally, there is always the possibility of duplicating the code and making an independent extension.

2. If I run the patched version on a cluster that was initialized with an unpatched server, is there a way to register the new functions in the system catalog without reinitializing the cluster?

Best regards
luben

--
Luben Karavelov
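
For readers who want to follow the arithmetic in the examples, here is a rough Python model of limited cover density ranking. It is only a sketch and not the code from the patch or from tsrank.c. The names minimal_covers and rank_cd are made up for illustration, the document is tokenized by plain whitespace splitting, every term is assumed to carry the default weight 0.1, and the rule "count a cover only when last position minus first position is at most the limit" is inferred from the sample output above, not read from the patch.

```python
def minimal_covers(doc_tokens, query_terms):
    """Find minimal spans that contain every query term (AND query only).

    Returns a list of (first_pos, last_pos, n_hits) with 1-based positions,
    where n_hits is the number of query-term occurrences inside the span.
    """
    terms = set(query_terms)
    # Occurrences of query terms in document order: (position, token).
    hits = [(i + 1, tok) for i, tok in enumerate(doc_tokens) if tok in terms]
    covers = []
    start = 0
    while start < len(hits):
        # Scan right until all terms have been seen.
        seen, end = set(), None
        for j in range(start, len(hits)):
            seen.add(hits[j][1])
            if seen == terms:
                end = j
                break
        if end is None:
            break  # no further cover exists
        # Scan back from the right edge to find the tightest left edge.
        seen, begin = set(), end
        for j in range(end, start - 1, -1):
            seen.add(hits[j][1])
            if seen == terms:
                begin = j
                break
        covers.append((hits[begin][0], hits[end][0], end - begin + 1))
        start = begin + 1  # the next cover must start after this left edge
    return covers


def rank_cd(doc_tokens, query_terms, limit=None, weight=0.1):
    """Sum weight / (1 + noise) over covers, skipping covers wider than limit.

    ASSUMPTION: a cover is counted when (last_pos - first_pos) <= limit.
    """
    total = 0.0
    for first, last, n_hits in minimal_covers(doc_tokens, query_terms):
        span = last - first
        if limit is not None and span > limit:
            continue  # cover is too wide, ignore it
        noise = span - (n_hits - 1)  # non-query words inside the cover
        total += weight / (1 + noise)
    return total


if __name__ == "__main__":
    short_doc = "a b c d e g h i j k".split()
    long_doc = "a b c d e g h i j k a d".split()
    for doc in (short_doc, long_doc):
        for limit in (None, 2, 3):
            print(limit, round(rank_cd(doc, ["a", "d"], limit), 6))
```

Under these assumptions the model finds three covers in the longer document, at positions 1-4, 4-11 and 11-12. A limit of 2 keeps only the last one (0.1), and a limit of 3 also keeps the first (0.1/3 + 0.1), which agrees with the figures in the examples above to the displayed precision.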
