ts_headline and query with hyphen - Mailing list pgsql-general

From: daniel
Subject: ts_headline and query with hyphen
Date:
Msg-id: 50BEC017.7070507@gmail.com
List pgsql-general
Hi

I have a question about ts_headline. When the query includes a word like
'on-line', only the 'line' part is highlighted, even though the whole
hyphenated word is indexed too. Some details below.

PostgreSQL 9.1.6

select
token, dictionary, lexemes
from
ts_debug('play on-line') where alias <> 'blank';

  token  |  dictionary  | lexemes
---------+--------------+----------
 play    | english_stem | {play}
 on-line | english_stem | {on-lin}
 on      | english_stem | {}
 line    | english_stem | {line}
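
In case the token types matter here, they can be listed with the alias
column of ts_debug. A minimal sketch, assuming the 'english' configuration
is the one in use (output omitted):

```sql
-- alias is the token type assigned by the text search parser.
-- 'english' is an assumption; substitute the configuration actually in use.
select alias, token, lexemes
from ts_debug('english', 'play on-line')
where alias <> 'blank';
```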


select to_tsquery('play & on-line');
          to_tsquery
----------------------------
  'play' & 'on-lin' & 'line'


select ts_headline('play on-line', to_tsquery('play & on-line'));

         ts_headline
----------------------------
  <b>play</b> on-<b>line</b>

The result is the same as for:

select ts_headline('play on-line', to_tsquery('play & line'));
         ts_headline
----------------------------
  <b>play</b> on-<b>line</b>

Is that the intended behaviour? I guess the problem here is that 'on'
does not produce a lexeme, but then what about 'on-lin'?

In another example, I expected a hyphenated match to get some kind of
preference:

select token, dictionary, lexemes from ts_debug('custom-built query')
where alias <> 'blank';
    token     |  dictionary  |    lexemes
--------------+--------------+----------------
 custom-built | english_stem | {custom-built}
 custom       | english_stem | {custom}
 built        | english_stem | {built}
 query        | english_stem | {queri}


select to_tsquery('query & custom-built');
                   to_tsquery
-----------------------------------------------
  'queri' & 'custom-built' & 'custom' & 'built'


select ts_headline('custom-built query', to_tsquery('query & custom-built'));
                ts_headline
-----------------------------------------
  <b>custom</b>-<b>built</b> <b>query</b>


This works better, but the two parts of 'custom-built' are still
highlighted separately. Or does ts_headline operate only on single,
non-hyphenated words?

thanks
daniel


