Similarity search for sentences - Mailing list pgsql-general

From	Janek Sendrowski
Subject	Similarity search for sentences
Date	December 5, 2013 11:51:59
Msg-id	trinity-b6932efc-dca4-4f7e-9ed4-dc5ae43701bf-1386244315450@3capp-webde-bs37
Responses	Re: Similarity search for sentences Re: Similarity search for sentences
List	pgsql-general

Hi,
 
I have tables with millions of sentences, one sentence per row. The text is natural language and can be in any language,
but all the sentences in a given table are in the same language.
I need to run a similarity search on them. It has to be very fast, because I have to search for a few hundred sentences
many times.
The search shouldn't be context-based. It should just find sentences with similar words (maybe stemmed).
 
I have already tried GiST/GIN-index-based trigram search (pg_trgm extension), full-text search (tsearch2 extension),
and pivot-based indexing (Fixed Query Array), but they are all too slow or not suitable.
Soundex and Metaphone aren't suitable either.
 
I have been working on this project for a long time, but without any success.
Do any of you have an idea?
 
I would be very thankful for help.
 
Janek Sendrowski

