Re: Similarity search for sentences - Mailing list pgsql-general

From Janek Sendrowski
Subject Re: Similarity search for sentences
Date
Msg-id trinity-58602a33-a07d-4780-b4ea-83b8285e4906-1386339761095@3capp-webde-bs33
In response to Re: Similarity search for sentences  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: Similarity search for sentences  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-general
Hi,
Thanks for your answers.
 
@Rémi Cura
You suggest a kind of full-text search. I have already tried the tsearch2 extension.
The problem is implementing the similarity search: I have to combine many OR conditions, each with a small set of arguments.
That slows the FTS down significantly.
 
@Kevin Grittner
I used my own trigger to store the tsvector of the sentences, and I created an ordinary GiST index on them.
What kind of functional index would you suggest? As I already told Rémi, I have to use many OR conditions, each with a
small set of arguments, which heavily degrades performance.
Do you have a better idea?
I usually use a query like this:
 
The tiger is the largest cat species [http://en.wikipedia.org/wiki/Felidae], reaching a total body length of up to 3.3 m
and weighing up to 306 kg.

--------------------------------------------------------------------------------------------------------------------------------------------------
to_tsvector:
'3.3':16 '306':22 'bodi':11 'cat':6 'kg':23 'largest':5 'length':12 'm':17 'reach':8 'speci':7 'tiger':2 'total':10
'weigh':19
(1 row)
 
SELECT * FROM tablename
WHERE vector @@ to_tsquery('speci & tiger & total & weigh')
  AND vector @@ to_tsquery('largest & length & m & reach')
  AND vector @@ to_tsquery('3.3 & 306 & bodi & cat & kg');

And that's very slow.
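
For comparison, a minimal sketch of how the lexeme groups from the query above could be expressed as alternatives inside a single tsquery, using the | (OR) operator. It assumes the groups are meant to be OR-ed, as described earlier in this message, and it reuses the tablename and vector names from the query. Whether this is actually faster has not been measured here.

```sql
-- Sketch only: the three lexeme groups combined with | (OR) in one tsquery,
-- so the WHERE clause contains a single @@ match against the vector column.
-- Assumption: the groups are alternatives; any one matching group is enough.
SELECT *
FROM tablename
WHERE vector @@ to_tsquery(
    '(speci & tiger & total & weigh) | (largest & length & m & reach) | (3.3 & 306 & bodi & cat & kg)'
);
```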
 
I didn't know that the pg_trgm module provides KNN search.
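
As a rough illustration of what such a KNN search might look like: pg_trgm provides a distance operator, <->, that a GiST trigram index can use for ORDER BY ... LIMIT queries. The sketch below assumes the raw sentence text is stored in a column called sentence, which is a hypothetical name; only the tsvector column appears in this message.

```sql
-- Assumption: "sentence" is a hypothetical text column holding the raw sentence.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GiST trigram index; the gist_trgm_ops operator class supports
-- ordering by trigram distance.
CREATE INDEX tablename_sentence_trgm_idx
    ON tablename USING gist (sentence gist_trgm_ops);

-- <-> returns the trigram distance (1 - similarity); smaller means more similar.
-- ORDER BY ... LIMIT returns the ten nearest sentences.
SELECT sentence,
       sentence <-> 'The tiger is the largest cat species' AS distance
FROM tablename
ORDER BY sentence <-> 'The tiger is the largest cat species'
LIMIT 10;
```

Trigram distance compares character sequences rather than stemmed words, so the ranking can differ from what a tsvector match would give.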
 
Janek Sendrowski
 
 
 

