Re: Fragments in tsearch2 headline - Mailing list pgsql-general
From | Oleg Bartunov |
---|---|
Subject | Re: Fragments in tsearch2 headline |
Date | |
Msg-id | Pine.LNX.4.64.0710301834340.14368@sn.sai.msu.ru |
In response to | Re: Fragments in tsearch2 headline ("Catalin Marinas" <catalin.marinas@gmail.com>) |
Responses | Re: Fragments in tsearch2 headline ("Sushant Sinha" <sushant354@gmail.com>); Re: Fragments in tsearch2 headline ("Catalin Marinas" <catalin.marinas@gmail.com>) |
List | pgsql-general |
On Tue, 30 Oct 2007, Catalin Marinas wrote:

> On 30/10/2007, Richard Huxton <dev@archonet.com> wrote:
>> Oleg Bartunov wrote:
>>> Catalin,
>>>
>>> what is your need ? What's wrong with this ?
>>>
>>> postgres=# select ts_headline('1 2 3 4 5 3 4 abc abc 2 3 xyz','2'::tsquery, 'StartSel=...,StopSel=...');
>>>                 ts_headline
>>> -------------------------------------------
>>>  1 ...2... 3 4 5 3 4 abc abc ...2... 3 xyz
>>
>> I think he wants something like: "1 2 3 ... abc 2 3 ..."
>>
>> A few characters of context around each match and then ... between. Kind
>> of like grep -C.
>
> That's pretty much correct (with the difference that I'd like context
> of words rather than lines as in "grep" and StartSel=<b>,
> StopSel=</b>).
>
> Since the text I want a headline for might be pretty long (tens of
> lines), I'd like to only show the excerpts around the matching words.
> Similar to the above example:
>
> select ts_headline('1 2 3 4 5 3 4 abc x y z 2 3', '2 & abc'::tsquery);
>
> should give:
>
> '1 <b>2</b> 3 4 ... 3 4 <b>abc</b> x y'
>
> Currently, if you limit the maximum words so that 'abc' is too far, it
> only highlights the first match.

OK, then you have to formalize many things: how long the excerpts should be,
how many excerpts to show, etc. In tsearch2 we have the get_covers() function,
which produces all excerpts, like:

=# select get_covers(to_tsvector('1 2 3 4 5 3 4 abc x y z 2 3'), '2&3'::tsquery);
                   get_covers
------------------------------------------------
 1 {1 2 3 }1 4 5 {2 3 4 abc x y z {3 2 }2 3 }3
(1 row)

Once you formalize your requirements, you can look at it and adapt it to your
needs (and share it with people). I think it could be a nice contrib module.

> Many of the search engines (including google) show the headline this
> way. I think Lucene can do this as well but I've never used it to be
> sure.

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
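
To make the "formalize your requirements" step concrete, here is a minimal sketch in Python of one possible set of rules. It is not tsearch2 code and does not use get_covers(). The function name fragment_headline and its parameters (context, max_fragments, start_sel, stop_sel, delimiter) are hypothetical. It assumes plain whitespace-separated words with exact matching (no stemming or lexeme normalization), takes a window of `context` words on each side of the first occurrence of each query term, merges windows that touch or overlap, and joins the resulting fragments with " ... ".

```python
def fragment_headline(text, terms, context=2, max_fragments=3,
                      start_sel="<b>", stop_sel="</b>", delimiter=" ... "):
    """Return excerpts of `text` around the first occurrence of each term.

    Assumption: words are split on whitespace and compared literally;
    a real implementation would compare normalized lexemes instead.
    """
    words = text.split()
    wanted = set(terms)

    # Position of the first occurrence of each query term.
    first_hit = {}
    for pos, word in enumerate(words):
        if word in wanted and word not in first_hit:
            first_hit[word] = pos

    # One window of `context` words on each side of every hit;
    # windows that overlap or touch are merged into a single fragment.
    windows = []
    for pos in sorted(first_hit.values()):
        lo = max(0, pos - context)
        hi = min(len(words) - 1, pos + context)
        if windows and lo <= windows[-1][1] + 1:
            windows[-1][1] = max(windows[-1][1], hi)
        else:
            windows.append([lo, hi])

    # Render at most `max_fragments` fragments, marking every query
    # term that falls inside a fragment.
    fragments = []
    for lo, hi in windows[:max_fragments]:
        marked = [
            f"{start_sel}{w}{stop_sel}" if w in wanted else w
            for w in words[lo:hi + 1]
        ]
        fragments.append(" ".join(marked))
    return delimiter.join(fragments)


# The example from the quoted message: query '2 & abc', two words of context.
print(fragment_headline("1 2 3 4 5 3 4 abc x y z 2 3", ["2", "abc"]))
# prints: 1 <b>2</b> 3 4 ... 3 4 <b>abc</b> x y
```

Using only the first occurrence of each term is a deliberate simplification so that the output matches the headline requested in the quoted message. Showing every match, or ranking fragments by how many query terms they cover, are equally valid choices that the requirements would have to settle.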