Re: tsearch2 dictionary that indexes substrings? - Mailing list pgsql-general

From Tilmann Singer
Subject Re: tsearch2 dictionary that indexes substrings?
Date
Msg-id 20070420102151.GA15825@tils.net
Whole thread Raw
In response to Re: tsearch2 dictionary that indexes substrings?  (Listmail <lists@peufeu.com>)
Responses Re: tsearch2 dictionary that indexes substrings?  (Listmail <lists@peufeu.com>)
List pgsql-general
* Listmail <lists@peufeu.com> [20070420 11:25]:
>     You want trigram based search.
>     ie.
>
>     postgresql -> 'pos', 'ost', 'stg', 'tgr', 'gre', 'res',
>     'esq', 'sql'
>
>     searching for 'gresq' is searching for  'gre' and 'res' and
>     'esq' which  is good friends with bitmap scan. Then a little LIKE
> '%gresq%' to filter  the results.

I'm not sure how that would fit in with tsearch2 to do full text
search so that I can do queries like

select * from content where plainto_tsquery(:q) @@ to_tsvector(body)

If the possible substrings were already indexed like Oleg suggested in
his reply through writing a custom C dictionary, a query like above
with q='foo' would find rows from the table content where body
contains 'foobar' for instance.

However I've seen the example to create a trigram index on a unique
word list to provide alternative spelling suggestions to the user
which looked very useful.

>     PS : indexing all substring means for long words you get huge
>     number of  lexems...

I'm aware of that and in my case I don't think it will be a
problem. It is for a type-ahead search web interface so actually it
only requires indexing all possible substrings starting from char 1,
ie. p, po, pos, post, postg, postgr, postgre, postgres, postgresq,
postgresql.


Til

pgsql-general by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: tsearch2 dictionary that indexes substrings?
Next
From: "Mike Frysinger"
Date:
Subject: Re: cant get pg_dump/pg_restore to behave