Re: Ellipses around result fragment of ts_headline - Mailing list pgsql-hackers
From | Asher Snyder |
---|---|
Subject | Re: Ellipses around result fragment of ts_headline |
Date | |
Msg-id | 00a001c98eed$bfef8ab0$3fcea010$@com |
In response to | Re: Ellipses around result fragment of ts_headline (Sushant Sinha <sushant354@gmail.com>) |
List | pgsql-hackers |
Yes, you are correct in your assumption that I'm looking for a single fragment to also have the option to add a fragment delimiter based on its position in the document.

>-----Original Message-----
>From: Sushant Sinha [mailto:sushant354@gmail.com]
>Sent: Saturday, February 14, 2009 4:41 PM
>To: Asher Snyder
>Cc: pgsql-hackers@postgresql.org
>Subject: RE: [HACKERS] Ellipses around result fragment of ts_headline
>
>The documentation in 8.4dev has information on FragmentDelimiter
>http://developer.postgresql.org/pgdocs/postgres/textsearch-controls.html
>
>If you do not specify MaxFragments > 0, then the default headline
>generator kicks in. The default headline generator does not have any
>fragment delimiter. So it is correct that you will not see any
>delimiter.
>
>I think you are looking for the default headline generator to add
>ellipses as well depending on where the fragment is. I do not know what
>other people's opinion on this is.
>
>-Sushant.
>
>On Sat, 2009-02-14 at 16:21 -0500, Asher Snyder wrote:
>> Interesting, it could be that you already do it, but the documentation makes
>> no reference to a fragment delimiter, so there's no way that I can see to
>> add one. The documentation for ts_headline only lists StartSel, StopSel,
>> MaxWords, MinWords, ShortWord, and HighlightAll; there appears to be no
>> option for a fragment delimiter.
>>
>> In my case I do:
>>
>> SELECT v1.id, v1.type_id, v1.title,
>>        ts_headline(v1.copy, query, 'MinWords = 17') as copy,
>>        ts_rank(v1.text_search, query) AS rank
>> FROM (SELECT b1.*, (setweight(to_tsvector(coalesce(b1.title,'')), 'A') ||
>>              setweight(to_tsvector(coalesce(b1.copy,'')), 'B')) as text_search
>>       FROM search.v_searchable_content b1) v1,
>>      plainto_tsquery($1) query
>> WHERE ($2 IS NULL OR (type_id = ANY($2))) AND query @@ v1.text_search
>> ORDER BY rank DESC, title
>>
>> Now, this use of ts_headline correctly returns highlighted, fragmented
>> search results, but there will be no fragment delimiter for the headline.
>> Some suggestions were to change ts_headline(v1.copy, query, 'MinWords = 17')
>> to '...' || ts_headline(v1.copy, query, 'MinWords = 17') || '...', but as you
>> can clearly see this would always add the ellipses, and would not be
>> intelligent regarding the fragments. I hope that you're correct and that it
>> is implemented, and just not documented.
>>
>> >-----Original Message-----
>> >From: Sushant Sinha [mailto:sushant354@gmail.com]
>> >Sent: Saturday, February 14, 2009 4:07 PM
>> >To: Asher Snyder
>> >Cc: pgsql-hackers@postgresql.org
>> >Subject: Re: [HACKERS] Ellipses around result fragment of ts_headline
>> >
>> >I think we currently do that. We add ellipses only when we encounter a
>> >new fragment. So there should not be ellipses if we are at the end of
>> >the document or if that is the first fragment (includes the beginning of
>> >the document). Here is the code in generateHeadline, ts_parse.c that
>> >adds the ellipses:
>> >
>> >    if (!infrag)
>> >    {
>> >        /* start of a new fragment */
>> >        infrag = 1;
>> >        numfragments ++;
>> >        /* add a fragment delimitor if this is after the first one */
>> >        if (numfragments > 1)
>> >        {
>> >            memcpy(ptr, prs->fragdelim, prs->fragdelimlen);
>> >            ptr += prs->fragdelimlen;
>> >        }
>> >    }
>> >
>> >It is possible that there is a bug that needs to be fixed. Can you show
>> >me an example where you found that?
>> >
>> >-Sushant.
>> >
>> >On Sat, 2009-02-14 at 15:13 -0500, Asher Snyder wrote:
>> >> It would be very useful if there were an option to have ts_headline append
>> >> ellipses before or after a result fragment based on the position of the
>> >> fragment in the source document. For instance, when running ts_headline(doc,
>> >> query) it will correctly return a fragment with words highlighted; however,
>> >> there's no easy way to determine whether this returned fragment is at the
>> >> beginning or end of the original doc, and add the necessary ellipses.
>> >>
>> >> Searches such as postgresql.org ALWAYS add ellipses before or after the
>> >> fragment regardless of whether or not ellipses are warranted. In my opinion
>> >> always adding ellipses to the fragment is deceptive to the user. In many of
>> >> my search result cases, the fragment is at the beginning of the doc, and
>> >> it would confuse the user to always see ellipses. So you can see how
>> >> beneficial the feature described above would be to the accuracy of the
>> >> search result fragment.
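
To make the request at the bottom of the thread concrete, here is a minimal C sketch of position-aware ellipses for a single fragment. It is not PostgreSQL code and does not reflect how generateHeadline works; the function name headline_with_ellipses, its byte-offset arguments, and its return convention are all assumptions made for illustration. It adds the delimiter before the fragment only when the fragment does not start at the beginning of the document, and after it only when the fragment stops short of the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL.
 *
 * Copies the fragment doc[start, end) into out. The delimiter is written
 * before the fragment only if the fragment does not begin at the start of
 * doc, and after it only if the fragment does not reach the end of doc.
 *
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the range is invalid or out is too small.
 */
int
headline_with_ellipses(const char *doc, size_t start, size_t end,
                       const char *delim, char *out, size_t outsize)
{
    size_t doclen;
    size_t delimlen;
    size_t fraglen;
    size_t needed;
    char  *ptr = out;

    assert(doc != NULL && delim != NULL && out != NULL);

    doclen = strlen(doc);
    delimlen = strlen(delim);

    /* the fragment must be a valid byte range inside the document */
    if (start > end || end > doclen)
        return -1;

    fraglen = end - start;

    /* work out the space required, including the terminating NUL */
    needed = fraglen + 1;
    if (start > 0)
        needed += delimlen;
    if (end < doclen)
        needed += delimlen;
    if (needed > outsize)
        return -1;

    /* leading delimiter: only if text was cut off before the fragment */
    if (start > 0)
    {
        memcpy(ptr, delim, delimlen);
        ptr += delimlen;
    }

    memcpy(ptr, doc + start, fraglen);
    ptr += fraglen;

    /* trailing delimiter: only if text was cut off after the fragment */
    if (end < doclen)
    {
        memcpy(ptr, delim, delimlen);
        ptr += delimlen;
    }

    *ptr = '\0';
    return (int) (ptr - out);
}
```

For example, with the document "the quick brown fox jumps" and the delimiter "...", the byte range [0, 9) yields "the quick..." and the range [10, 19) yields "...brown fox...". A real implementation would presumably work on parsed words rather than raw byte offsets, and would still need to apply the StartSel/StopSel highlighting.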