Re: n-gram search function - Mailing list pgsql-hackers

From	Guillaume Smet
Subject	Re: n-gram search function
Date	February 19, 2007 06:29:22
Msg-id	1d4e0c10702190229k36a2e1bbi6398899f810113bb@mail.gmail.com
In response to	Re: n-gram search function (Oleg Bartunov <oleg@sai.msu.su>)
Responses	Re: n-gram search function
List	pgsql-hackers

On 2/19/07, Oleg Bartunov <oleg@sai.msu.su> wrote:
> pg_trgm was developed for spelling correction and there is a threshold of
> similarity, which is 0.3 by default. The README explains what it means.

Yes, I read it.

> Similarity could be very low, since you didn't make a separate column and the
> length of the full string is used to normalize similarity.

Yep, that's probably my problem. Ignored records are a bit longer than
the others.
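
To illustrate the length effect, here is a rough Python sketch of a
trigram similarity. It is an approximation written for this example, not
pg_trgm's actual code: it assumes lowercasing, splitting on
non-alphanumeric characters, padding each word with two leading spaces
and one trailing space, and dividing the shared trigrams by the total
number of distinct trigrams. The names trigrams() and similarity() and
the sample strings are made up for illustration.

```python
import re


def trigrams(text):
    """Return the set of distinct trigrams of text (approximation of pg_trgm)."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Assumed padding: two spaces before the word, one after.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A misspelled query against a short value and against a long one
# that contains the same word.
short_score = similarity("pariss", "paris")
long_score = similarity("pariss", "paris charles de gaulle international airport")
print(short_score, long_score)
```

In this sketch the short value scores well above 0.3, while the long one
falls below it even though it contains the same word, because every extra
trigram of the long string enlarges the denominator.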

I tried the tip in README.pg_trgm to generate a table with all the words.

It can do the work in conjunction with tsearch2 and a bit of AJAX to
suggest the full words to the users. The reason why I was not using
tsearch2 is that it's sometimes hard to spell location names
correctly.
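
The word-table idea can be sketched outside the database as follows.
This is only an illustration in Python under the same assumptions about
trigram extraction as pg_trgm's documented behaviour; it is not the SQL
from README.pg_trgm, and build_word_table(), suggest() and the sample
location names are hypothetical.

```python
import re


def trigrams(word):
    """Distinct trigrams of a single word, padded the way pg_trgm is assumed to."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def build_word_table(documents):
    """Collect every distinct word, so each word is compared on its own."""
    words = set()
    for doc in documents:
        words.update(re.findall(r"[a-z0-9]+", doc.lower()))
    return sorted(words)


def suggest(term, words, threshold=0.3, limit=5):
    """Return (word, score) pairs at or above the threshold, best first."""
    scored = [(w, similarity(term, w)) for w in words]
    matches = [pair for pair in scored if pair[1] >= threshold]
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]


locations = [
    "Marseille Saint-Charles",
    "Versailles Chantiers",
    "Paris Gare de Lyon",
]
word_table = build_word_table(locations)
print(suggest("marseile", word_table))
```

Because each candidate is a single word, the length of the full location
name no longer dilutes the score; the suggested word can then be passed
to the full-text search.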

The only problem is that it is still quite slow on a 50k-row words
table, but I'll run further tests on a decent server this afternoon.

--
Guillaume
