Similarity search for sentences - Mailing list pgsql-general

From Janek Sendrowski
Subject Similarity search for sentences
Msg-id trinity-b6932efc-dca4-4f7e-9ed4-dc5ae43701bf-1386244315450@3capp-webde-bs37
Responses Re: Similarity search for sentences  (Rémi Cura <remi.cura@gmail.com>)
Re: Similarity search for sentences  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-general
Hi,
 
I have tables with millions of sentences, one sentence per row. The text is natural language and can be in any language,
but all sentences in a given table are in the same language.
I need to run a similarity search on them. It has to be very fast, because I have to search for a few hundred sentences
many times.
The search shouldn't be context-based. It should just return sentences with similar words (possibly stemmed).
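
As an illustration of what "sentences with similar words" could mean, here is a minimal Python sketch. It assumes Jaccard similarity over lowercased word sets is an acceptable measure; the function names are hypothetical, and no stemming is applied.

```python
import re


def word_set(sentence: str) -> set[str]:
    """Lowercase the sentence and return its set of distinct words."""
    return set(re.findall(r"\w+", sentence.lower()))


def word_similarity(a: str, b: str) -> float:
    """Jaccard similarity: shared words divided by all distinct words."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 0.0  # two empty sentences: treated as not similar (an assumption)
    return len(wa & wb) / len(wa | wb)


score = word_similarity("The cat sat", "the cat ran")  # 2 shared words out of 4 distinct
```

A stemmer could be applied inside word_set before building the set, so that inflected forms of a word compare as equal.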
 
I have already tried GiST/GIN-index-based trigram search (pg_trgm extension), full-text search (tsearch2 extension),
and pivot-based indexing (Fixed Query Array), but they are all too slow or not suitable.
Soundex and Metaphone aren't suitable either.
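
For reference, trigram similarity can be sketched roughly as follows in Python. This is a simplified model, not pg_trgm's actual implementation: it assumes each word is lowercased, padded with two leading spaces and one trailing space, and that the score is the number of shared trigrams divided by the number of distinct trigrams. The function names are hypothetical.

```python
import re


def trigrams(text: str) -> set[str]:
    """Collect the 3-character substrings of every padded, lowercased word."""
    grams: set[str] = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both inputs."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Computing this pairwise against millions of rows is a linear scan per query sentence, which is why an index is needed to avoid scoring every row.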
 
I have been working on this project for a long time, but without any success.
Do any of you have an idea?
 
I would be very thankful for help.
 
Janek Sendrowski

