Re: text search: restricting the number of parsed words in headline generation - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: text search: restricting the number of parsed words in headline generation
Date
Msg-id 1314126653-sup-3641@alvh.no-ip.org
In response to Re: text search: restricting the number of parsed words in headline generation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Excerpts from Tom Lane's message of Tue Aug 23 15:59:18 -0300 2011:
> Sushant Sinha <sushant354@gmail.com> writes:
> > Given a document and a query, the goal of headline generation is to
> > produce text excerpts in which the query appears.
> 
> ... right ...
> 
> > Here is a simple patch that limits the number of words during the
> > tokenization phase and puts an upper-bound on the headline generation.
> 
> Doesn't this force the headline to be taken from the first N words of
> the document, independent of where the match was?  That seems rather
> unworkable, or at least unhelpful.

Yeah ...

Doesn't a search result include the positions at which the tokens were
found within the document?  Wouldn't it make more sense to improve the
system so that it restricts the search for headlines to the general
area where the tokens were found?
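
To make the idea concrete, here is a rough sketch of what "restricting to the general area" could look like. It is not how the PostgreSQL headline code works; the function names, the 0-based word indexes, and the `radius` parameter are all hypothetical, and it assumes the match positions are already known so that only the words near a match are ever examined.

```python
def headline_windows(match_positions, doc_len, radius=10):
    """Return merged (start, end) word ranges around each match position.

    Positions are hypothetical 0-based word indexes; end is inclusive.
    """
    windows = []
    for pos in sorted(match_positions):
        start = max(0, pos - radius)
        end = min(doc_len - 1, pos + radius)
        if windows and start <= windows[-1][1] + 1:
            # Overlaps or touches the previous window: extend it.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def excerpts(words, match_positions, radius=10):
    """Build candidate headline excerpts from the windows only."""
    spans = headline_windows(match_positions, len(words), radius)
    return [" ".join(words[s:e + 1]) for s, e in spans]
```

With this shape, the work done per document is bounded by the number of matches times the window size rather than by the document length, and the excerpt still comes from wherever the match actually is.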

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

