Re: ts_headline - Mailing list pgsql-general

From Stephen Davies
Subject Re: ts_headline
Msg-id 200802221037.33055.scldad@sdc.com.au
In response to Re: ts_headline  (Richard Huxton <dev@archonet.com>)
Responses Re: ts_headline
List pgsql-general
OK. The first-level explanation is that my default config is "simple".
This explains the different query results, as "english" reduces "database" to
"databas" while "simple" does not reduce it at all.

The "document" is parsed/indexed using "english" explicitly, so my queries need
to be explicit also (not an issue, as all "real" queries are generated rather
than typed).
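
A minimal sketch of the mismatch described above, assuming a stock install
where the built-in "english" and "simple" configurations are unmodified (the
literal strings are illustrative, not taken from the real table):

```sql
-- Document side stemmed with "english", query side built with "simple":
-- the stored lexeme is 'databas' but the query asks for 'database'.
SELECT to_tsvector('english', 'database') @@ to_tsquery('simple', 'database');
-- expected: false

-- Both sides built with "english": both reduce to 'databas'.
SELECT to_tsvector('english', 'database') @@ to_tsquery('english', 'database');
-- expected: true

-- Inspect the two query forms side by side.
SELECT to_tsquery('english', 'database') AS english_query,
       to_tsquery('simple',  'database') AS simple_query;
```

With default_text_search_config set to "simple", a bare to_tsquery('database')
behaves like the "simple" form, which would account for the zero-row result
against an "english" index.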

However, I still cannot see a reason for the ts_headline results. If anything,
they should be the other way around.

I suspect that ts_headline may only work properly when no configuration is
specified - regardless of the default setting.
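
One way to test that suspicion, sketched under the assumption that the cause
is ts_headline parsing the document with the default ("simple") configuration
while the query was built with "english". This is a hypothesis, not something
confirmed in this thread. ts_headline accepts an optional configuration as its
first argument, so both sides can be made to agree:

```sql
-- Assumption: default_text_search_config is 'simple', as described above.
-- Passing 'english' as the first argument makes ts_headline parse the
-- document with the same configuration that produced the query lexemes.
SELECT ts_headline('english',
                   abstract,
                   to_tsquery('english', 'database'),
                   'MinWords=99, MaxWords=999')
FROM document
WHERE id = 21;
```

If this form highlights "database" while the two-argument form does not, the
difference comes from the configuration mismatch rather than from ts_headline
itself.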

Cheers,
Stephen

On Thursday 21 February 2008 22:30, Richard Huxton wrote:
> Stephen Davies wrote:
> > I just spotted the difference between your test and mine.
> >
> > My query says:
> >
> > select ts_headline(abstract,to_tsquery('english','database'),
> >   'minWords = 99, maxWords = 999') from document where id=21;
> >
> > where your equivalent does not include the 'english' arg.
> >
> > If I take out the 'english' from this query, I get the same result as
> > you.
>
> What does this give you:
>    show default_text_search_config;
> I get pg_catalog.english and the same result for the query whether I use:
>     to_tsquery('english','database')
> or to_tsquery('pg_catalog.english','database')
>
> Could you be picking up a bad "english" configuration (see \dF)?
>
> > However, the following returns zero rows:
> >
> > select title,author,ts_headline(abstract,to_tsquery('database')) from
> > document where clob @@ to_tsquery('database')
>
> I take it "clob" matches "abstract"?
>
> > It gets more interesting:
> >
> > select title,author,ts_headline(abstract,to_tsquery('database')) from
> > document where clob @@ to_tsquery('english','database')
> >
> > returns the "correct" result - one row with the expected headline.
>
> Now that *is* strange. ts_headline() works without specifying 'english'
> but the actual search works the other way.
>
> > select
> > title,author,ts_headline(abstract,to_tsquery('english','thesaurus')) from
> > document where clob @@ to_tsquery('english','thesaurus')
> >
> > also returns the "correct" result.
> >
> > I suggest that the above indicates a bug somewhere.
>
> Could be - it'd be good to rule out a bad config. You might have an
> unexpected list of stopwords or similar.
>
> Let's try:
>   SELECT ts_debug('the database and thesaurus');
>   SELECT ts_debug('english', 'the database and thesaurus');
>   SELECT ts_debug('pg_catalog.english', 'the database and thesaurus');
> I'd expect "the", "and" to be stripped out as stopwords and the other
> two to get through (database stemmed to "databas").

--
========================================================================
This email is for the person(s) identified above, and is confidential to
the sender and the person(s).  No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies Consulting                            Voice: 08-8177 1595
Adelaide, South Australia.                             Fax: 08-8177 0133
Computing & Network solutions.                       Mobile:0403 0405 83
