Re: [GENERAL] Fragments in tsearch2 headline - Mailing list pgsql-hackers

From Sushant Sinha
Subject Re: [GENERAL] Fragments in tsearch2 headline
Date
Msg-id 1217692218.6000.8.camel@dragflick
In response to Re: [GENERAL] Fragments in tsearch2 headline  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: [GENERAL] Fragments in tsearch2 headline  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Sorry for the delay. Here is the patch with the FragmentDelimiter option.
It requires an extra option in HeadlineParsedText and uses that option
in generateHeadline.

Implementing a notion of fragments in HeadlineParsedText, with a separate
function to join them, seems more complicated. So for the time being I
just emit the FragmentDelimiter whenever a new fragment (other than the
first one) starts.
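
For illustration only, here is a minimal standalone sketch of that
approach. It is not the code in the patch: HeadlineOptionsSketch, its
fragdelim field, append_text and emit_fragments are hypothetical names,
and the fragments are assumed to be already-rendered strings rather than
the word array that HeadlineParsedText actually holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the headline options. The real
 * HeadlineParsedText carries many more fields; only the delimiter
 * matters for this sketch.
 */
typedef struct
{
    const char *fragdelim;      /* assumed field: text placed between fragments */
} HeadlineOptionsSketch;

/*
 * Append src to out at offset *len, keeping out NUL-terminated.
 * Returns 0 on success, -1 if src does not fit in the buffer.
 */
int
append_text(char *out, size_t outsz, size_t *len, const char *src)
{
    size_t      n = strlen(src);

    if (n >= outsz - *len)      /* leave room for the terminating NUL */
        return -1;
    memcpy(out + *len, src, n);
    *len += n;
    out[*len] = '\0';
    return 0;
}

/*
 * Write all fragments into out, emitting the delimiter only when a
 * fragment other than the first one starts. Returns the length of the
 * result, or -1 if the buffer is too small.
 */
int
emit_fragments(const HeadlineOptionsSketch *opt,
               const char *const *frags, size_t nfrags,
               char *out, size_t outsz)
{
    size_t      len = 0;
    size_t      i;

    assert(opt != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    for (i = 0; i < nfrags; i++)
    {
        /* a new fragment (not the first) starts: emit the delimiter */
        if (i > 0 && append_text(out, outsz, &len, opt->fragdelim) != 0)
            return -1;
        if (append_text(out, outsz, &len, frags[i]) != 0)
            return -1;
    }
    return (int) len;
}
```

With fragdelim set to " ... " and two fragments, the delimiter appears
once, between them; with a single fragment it does not appear at all.
The result is the same as joining the fragments with the delimiter, just
without building a separate array of fragments first.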

The patch also contains the updated regression tests and results, a new
test for the FragmentDelimiter option, and the documentation for the new
options.

I have also attached a separate file that tests different aspects of the
new headline generation function.

Let me know if anything else is needed.

-Sushant.

On Thu, 2008-07-24 at 00:28 +0400, Oleg Bartunov wrote:
> On Wed, 23 Jul 2008, Sushant Sinha wrote:
>
> > I guess it is more readable to add cover separator at the end of a fragment
> > than in the front. Let me know what you think and I can update it.
>
> FragmentsDelimiter should *separate* fragments and that says it all.
> Not a very difficult algorithmic problem, it's like perl's
> join(FragmentsDelimiter, @array)
>
> >
> > I think the right place for cover separator is in the structure
> > HeadlineParsedText just like startsel and stopsel. This will enable users to
> > specify their own cover separators. But this will require changes to the
> > structure as well as to the generateHeadline function. This option will
> > also not play well with the default headline generation function.
>
> As soon as we introduce FragmentsDelimiter we should make it
> configurable.
>
> >
> > The default MaxWords = 35 seems a bit high for this headline generation
> > function and 20 seems to be more reasonable. Any thoughts?
>
> I think we should not change default value because it could change
> behaviour of existing applications. I'm not sure if it'd be useful and
> possible to define default values in CREATE TEXT SEARCH PARSER
>
> >
> > -Sushant.
> >
> > On Wed, Jul 23, 2008 at 7:44 AM, Oleg Bartunov <oleg@sai.msu.su> wrote:
> >
> >> btw, is it intentional to have '....' in headline ?
> >>
> >> =# select ts_headline('1 2 3 4 5 1 2 3 1','1&4'::tsquery,'MaxFragments=1');
> >>       ts_headline
> >> -------------------------
> >>  ... <b>4</b> 5 <b>1</b>
> >>
> >>
> >>
> >> Oleg
> >>
> >> On Wed, 23 Jul 2008, Teodor Sigaev wrote:
> >>
> >>>> Let me know of any other changes that are needed.
> >>>
> >>> Looks like ready to commit, but documentation is needed.
> >>>
> >>>
> >>>
> >>        Regards,
> >>                Oleg
> >> _____________________________________________________________
> >> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> >> Sternberg Astronomical Institute, Moscow University, Russia
> >> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> >> phone: +007(495)939-16-83, +007(495)939-23-83
> >>
> >
>
>      Regards,
>          Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
