Thread: Full text search with more than one word
Hi Team,
We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using to_tsvector. We have implemented full text search on this token column.
1. When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )
2. However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.
Can you please help us solve this issue. Let us know if you need any details from our side.
Thanks,
Ritanjali
Hi Team,
We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using to_tsvector. We have implemented full text search on this token column.
1. When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )
2. However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.
Can you please help us solve this issue. Let us know if you need any details from our side.
Hi Magnus,
exact Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Where condition like below syntax we are using in Postgresql function
SELECT * from tablename where html_tokens @@ to_tsquery( ' || quote_literal(ikeyword) || ')
From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving result like "Marketing and Coordinator".
Can you please help us solve this issue.
Thanks,
Ritanjali
From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 4:11 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh
Subject: Re: Full text search with more than one word
On Fri, Jan 18, 2019 at 11:37 AM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:
Hi Team,
We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using to_tsvector. We have implemented full text search on this token column.
1. When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )
2. However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.
Can you please help us solve this issue. Let us know if you need any details from our side.
It sounds like what you're looking for is phrase search, but it's hard to figure out if you're actually using that and it's not working for you, or if you're not using it. Please provide examples of the exact SQL that you are using to populate your tsvectors and to do the actual searches.
//Magnus
Hi Magnus,
exact Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Where condition like below syntax we are using in Postgresql function
SELECT * from tablename where html_tokens @@ to_tsquery( ' || quote_literal(ikeyword) || ')
From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving result like "Marketing and Coordinator".
Can you please help us solve this issue.
Hi Magnus ,
Example- Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Thank you for your information , phraseto_tsquery() is supporting two word search or phase search where as it is not supporting logical operator search and to_tsquery() is supporting logical operator search but not supporting two word search or phase search. If we need both functionally in full text search is there any other way we can proceed .
Can you please help us solve this issue.
Thanks,
Ritanjali
From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 8:45 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh; mahantesh.ambiger@yahoo.com
Subject: Re: Full text search with more than one word
(please don't top-post on the postgresql mailinglists)
On Fri, Jan 18, 2019 at 3:57 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:
Hi Magnus,
exact Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Where condition like below syntax we are using in Postgresql function
SELECT * from tablename where html_tokens @@ to_tsquery( ' || quote_literal(ikeyword) || ')
From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving result like "Marketing and Coordinator".
Can you please help us solve this issue.
You should look into phraseto_tsquery() for searching for phrases, and websearchto_tsquery() for the kind of complete search with or and and in it that you are looking for. to_tsquery doesn't do any parsing like that.
//Magnus
Hi Magnus ,
Example- Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Thank you for your information , phraseto_tsquery() is supporting two word search or phase search where as it is not supporting logical operator search and to_tsquery() is supporting logical operator search but not supporting two word search or phase search. If we need both functionally in full text search is there any other way we can proceed .
Can you please help us solve this issue.
From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 8:45 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh; mahantesh.ambiger@yahoo.com
Subject: Re: Full text search with more than one word
(please don't top-post on the postgresql mailinglists)
On Fri, Jan 18, 2019 at 3:57 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:
Hi Magnus,
exact Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".
Where condition like below syntax we are using in Postgresql function
SELECT * from tablename where html_tokens @@ to_tsquery( ' || quote_literal(ikeyword) || ')
From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving result like "Marketing and Coordinator".
Can you please help us solve this issue.
You should look into phraseto_tsquery() for searching for phrases, and websearchto_tsquery() for the kind of complete search with or and and in it that you are looking for. to_tsquery doesn't do any parsing like that.
//Magnus