Re: First implementation of GIN for pg_trgm - Mailing list pgsql-patches

From Oleg Bartunov
Subject Re: First implementation of GIN for pg_trgm
Date
Msg-id Pine.LNX.4.64.0702221946050.400@sn.sai.msu.ru
In response to Re: First implementation of GIN for pg_trgm  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Responses Re: First implementation of GIN for pg_trgm  ("Guillaume Smet" <guillaume.smet@gmail.com>)
List pgsql-patches
On Thu, 22 Feb 2007, Guillaume Smet wrote:

> On 2/22/07, Teodor Sigaev <teodor@sigaev.ru> wrote:
> >> What is the average length of the strings in the table?
>
> test=# SELECT MIN(length(word)), MAX(length(word)), AVG(length(word))
> FROM lieu_mots_gin;
> min | max |        avg
> -----+-----+--------------------
>  1 |  38 | 7.4615463141373282
> (1 row)
>
> I don't see how to have a more precise similarity without having the
> number of entries registered by the indexed value somewhere.
>
> I think it can be interesting for other flavours of GIN usage. Is
> there a way to add the number of entries of the considered indexed
> item to the consistent prototype without adding too much overhead and
> complexity?

You're right, it would be nice.
This is what we need for faster ranking in tsearch2, since currently we have to
consult the heap to get positional information, which slows down the search.
We haven't investigated the possibility of keeping additional information in the
index, but keep in mind that everything should work without an index.
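
To make the limitation discussed above concrete, here is a minimal sketch in
Python (not the patch's C code). It assumes a simplified version of pg_trgm's
trigram extraction and the shared / union similarity formula; the function
names and the sample words are hypothetical and chosen for illustration only.
It shows that a consistent function that sees only the query keys and which of
them matched can compute an upper bound, while the exact value needs the
number of trigrams of the indexed value.

```python
import re


def trigrams(text):
    """Simplified pg_trgm-style trigram set (an approximation, not the real code):
    lowercase, split on non-alphanumerics, pad each word with two leading
    spaces and one trailing space, then take every 3-character window."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def exact_similarity(query, value):
    """Shared trigrams divided by the size of the union of both trigram sets."""
    q, v = trigrams(query), trigrams(value)
    shared = len(q & v)
    union = len(q) + len(v) - shared
    return shared / union if union else 0.0


def upper_bound_without_count(query_count, matched_count):
    """All that can be derived from the query keys and the matched keys alone.
    The union is at least as large as the query set, so this never
    underestimates the exact similarity."""
    return matched_count / query_count if query_count else 0.0


def similarity_with_count(query_count, matched_count, value_count):
    """Exact similarity once the indexed value's trigram count is available
    (hypothetical extra argument, as suggested in the quoted message)."""
    union = query_count + value_count - matched_count
    return matched_count / union if union else 0.0


query, value = "paris", "parisienne"   # hypothetical sample data
q, v = trigrams(query), trigrams(value)
matched = q & v

exact = exact_similarity(query, value)
bound = upper_bound_without_count(len(q), len(matched))
with_count = similarity_with_count(len(q), len(matched), len(v))
```

With these sample words the bound is noticeably higher than the exact value,
so a threshold test based on the bound alone still needs a recheck against the
stored string.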

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
