Thread: Full text search with more than one word

Full text search with more than one word

From
"Ritanjali Majihee"
Date:

Hi Team,

 

We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using  to_tsvector. We have implemented full text search on this token column.

 

1.       When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )

2.       However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like  “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.

 

Can you please help us solve this issue. Let us know  if you need any details from our side.

 

Thanks,

Ritanjali

Re: Full text search with more than one word

From
Magnus Hagander
Date:

On Fri, Jan 18, 2019 at 11:37 AM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Team,

 

We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using  to_tsvector. We have implemented full text search on this token column.

 

1.       When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )

2.       However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like  “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.

 

Can you please help us solve this issue. Let us know  if you need any details from our side.

 


It sounds like what you're looking for is phrase search, but it's hard to figure out if you're actually using that and it's not working for you, or if you're not using it. Please provide examples of the exact SQL that you are using to populate your tsvectors and to do the actual searches.

//Magnus

RE: Full text search with more than one word

From
"Ritanjali Majihee"
Date:

Hi Magnus,

 

exact  Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Where condition like below syntax we are using in Postgresql function

 

SELECT * from tablename where html_tokens  @@ to_tsquery( ' || quote_literal(ikeyword) || ')

 

From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving  result like "Marketing and Coordinator".

Can you please help us solve this issue.

 

Thanks,

Ritanjali

 

From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 4:11 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh
Subject: Re: Full text search with more than one word

 

 

On Fri, Jan 18, 2019 at 11:37 AM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Team,

 

We have around 20 millions of records in Postgresql database. One of the column is html document stored as Text datatype. We want Use this column in search functionality. We have converted html content to plaintext and created tokens using  to_tsvector. We have implemented full text search on this token column.

 

1.       When we search with one word and logical operators, it returns results as expected. Example: (“Jack” AND (“Jill” OR “Tom”) )

2.       However, when we use two or more worded search strings with logical operators, then it is not giving expected output. Example: for search string ("Jack Jill" OR "Jack Tom"), it is not searching for “Jack Jill” together but if it finds those two words in one text it returns. Here, if the column has “Jack Ryan Mark Jill”, it will return the row but, we are expecting it to return only when it is like  “Jack Jill Ryan Mark”. We found that proximity operator(<->) can be used in between the words. It returns the results as expected but, it takes way more than expected time to get the results; Performance is very poor with proximity operator.

 

Can you please help us solve this issue. Let us know  if you need any details from our side.

 

 

It sounds like what you're looking for is phrase search, but it's hard to figure out if you're actually using that and it's not working for you, or if you're not using it. Please provide examples of the exact SQL that you are using to populate your tsvectors and to do the actual searches.

 

//Magnus

 

Re: Full text search with more than one word

From
Magnus Hagander
Date:
(please don't top-post on the postgresql mailinglists)

On Fri, Jan 18, 2019 at 3:57 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Magnus,

 

exact  Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Where condition like below syntax we are using in Postgresql function

 

SELECT * from tablename where html_tokens  @@ to_tsquery( ' || quote_literal(ikeyword) || ')

 

From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving  result like "Marketing and Coordinator".

Can you please help us solve this issue.

 


You should look into phraseto_tsquery() for searching for phrases, and websearchto_tsquery() for the kind of complete search with or and and in it that you are looking for. to_tsquery doesn't do any parsing like that.

//Magnus
 

RE: Full text search with more than one word

From
"Ritanjali Majihee"
Date:

Hi Magnus ,

 

Example- Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Thank you for your information , phraseto_tsquery() is supporting two word search or phase search where as it is not supporting logical operator search and to_tsquery() is supporting logical operator search but not supporting two word search or phase search. If we need both functionally  in full text search is there any other way we can proceed .

 

Can you please help us solve this issue.

 

Thanks,

Ritanjali

 

 

From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 8:45 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh; mahantesh.ambiger@yahoo.com
Subject: Re: Full text search with more than one word

 

(please don't top-post on the postgresql mailinglists)

 

On Fri, Jan 18, 2019 at 3:57 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Magnus,

 

exact  Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Where condition like below syntax we are using in Postgresql function

 

SELECT * from tablename where html_tokens  @@ to_tsquery( ' || quote_literal(ikeyword) || ')

 

From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving  result like "Marketing and Coordinator".

Can you please help us solve this issue.

 

 

You should look into phraseto_tsquery() for searching for phrases, and websearchto_tsquery() for the kind of complete search with or and and in it that you are looking for. to_tsquery doesn't do any parsing like that.

 

//Magnus

 

Re: Full text search with more than one word

From
Magnus Hagander
Date:
Again, please do not top-post on the postgresql mailinglists, it's considered rude.


On Fri, Jan 25, 2019 at 12:56 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Magnus ,

 

Example- Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Thank you for your information , phraseto_tsquery() is supporting two word search or phase search where as it is not supporting logical operator search and to_tsquery() is supporting logical operator search but not supporting two word search or phase search. If we need both functionally  in full text search is there any other way we can proceed .

 

Can you please help us solve this issue.



Yes, as I said in my original email, check out the websearchto_tsquery function.

//Magnus


 

From: magnus@hagander.net [mailto:magnus@hagander.net]
Sent: Friday, January 18, 2019 8:45 PM
To: Ritanjali Majihee
Cc: pgsql-general General; Ambiger, Mahantesh; mahantesh.ambiger@yahoo.com
Subject: Re: Full text search with more than one word

 

(please don't top-post on the postgresql mailinglists)

 

On Fri, Jan 18, 2019 at 3:57 PM Ritanjali Majihee <rmajhee@247headhunting.com> wrote:

Hi Magnus,

 

exact  Search keyword is ("Marketing Coordinator" OR "Marketing Specialist" OR "Marketing Associate" ) AND "Creative Suite".

 

Where condition like below syntax we are using in Postgresql function

 

SELECT * from tablename where html_tokens  @@ to_tsquery( ' || quote_literal(ikeyword) || ')

 

From search keyword we are getting result but not as expected .It is not searching together "Marketing Coordinator" string, It is giving  result like "Marketing and Coordinator".

Can you please help us solve this issue.

 

 

You should look into phraseto_tsquery() for searching for phrases, and websearchto_tsquery() for the kind of complete search with or and and in it that you are looking for. to_tsquery doesn't do any parsing like that.

 

//Magnus

 



--