Re: tsearch2, large data and indexes - Mailing list pgsql-performance

From Heikki Linnakangas
Subject Re: tsearch2, large data and indexes
Msg-id 5358F6BE.2020104@vmware.com
In response to Re: tsearch2, large data and indexes  (Sergey Konoplev <gray.ru@gmail.com>)
Responses Re: tsearch2, large data and indexes  (Ivan Voras <ivoras@freebsd.org>)
Re: tsearch2, large data and indexes  (Sergey Konoplev <gray.ru@gmail.com>)
List pgsql-performance
On 04/24/2014 01:56 AM, Sergey Konoplev wrote:
> On Wed, Apr 23, 2014 at 4:08 AM, Ivan Voras <ivoras@freebsd.org> wrote:
>> Ok, I found out what is happening, quoting from the documentation:
>>
>> "GIN indexes are not lossy for standard queries, but their performance
>> depends logarithmically on the number of unique words. (However, GIN
>> indexes store only the words (lexemes) of tsvector values, and not
>> their weight labels. Thus a table row recheck is needed when using a
>> query that involves weights.)"
>>
>> My query doesn't have weights but the tsvector in the table has them -
>> I take it this is what is meant by "involves weights."
>>
>> So... there's really no way for tsearch2 to produce results based on
>> the index alone, without recheck? This is... limiting.
>
> My guess is that you could use strip() function [1] to get rid of
> weights in your table or, that would probably be better, in your index
> only by using expressions in it and in the query, eg.

As the docs say, the GIN index does not store the weights. As such,
there is no need to strip them. A recheck would be necessary if your
query needs the weights, precisely because the weights are not included
in the index.
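
To illustrate the point, here is a toy model in Python, not PostgreSQL's actual GIN code. It assumes a made-up three-row table (ROWS) and hypothetical helpers build_index() and search(). The index keeps only lexemes, so a query without weights is answered from the index alone, while a query with a weight has to fetch every candidate row and recheck it.

```python
from collections import defaultdict

# Toy "heap": row id -> {lexeme: weight label}. Hypothetical data,
# standing in for tsvector values that carry weights.
ROWS = {
    1: {"index": "A", "search": "D"},
    2: {"index": "D", "postgres": "A"},
    3: {"search": "A"},
}


def build_index(rows):
    """Build an inverted index of lexeme -> row ids, dropping weight labels."""
    index = defaultdict(set)
    for row_id, lexemes in rows.items():
        for lexeme in lexemes:
            index[lexeme].add(row_id)
    return index


def search(index, rows, lexeme, weight=None):
    """Return (matching row ids, number of rows that had to be rechecked)."""
    candidates = index.get(lexeme, set())
    if weight is None:
        # No weight in the query: the index answer is already exact.
        return sorted(candidates), 0
    # Weight in the query: the index cannot decide, so every candidate
    # row is fetched and compared against its stored weight label.
    matches = [r for r in sorted(candidates) if rows[r].get(lexeme) == weight]
    return matches, len(candidates)


INDEX = build_index(ROWS)
```

In this sketch, search(INDEX, ROWS, "index") rechecks nothing even though the stored rows have weights, whereas search(INDEX, ROWS, "index", "A") has to look at both candidate rows to find the one that matches.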

(In the OP's query, it's the ranking that was causing the detoasting.)

- Heikki

