Thread: tsearch bug in 7.2.1?
Hi,

I noticed this behaviour:

usa=# SELECT rr.id, rr.name, rr.description FROM recipe_recipes rr WHERE rr.ftiidx ## 's';
 id  |               name               |                                    description
-----+----------------------------------+-----------------------------------------------------------------------------------
 202 | Bird's Nest                      | An egg nestled in a crispy, hot bread roll.
 293 | Reuben Triple S                  | Corn beef, swiss cheese and sauerkraut on pumpernickel.
  30 | Hedgehogs                        | This is comfort food at it's yummiest.
 130 | Hearty Apple & Cinnamon Porridge | A great way to warm you up on a winter's morning.
  83 | Banana & Apple Compote           | Great way to finish a meal on a cool winter's day.
 139 | Minestrone                       | Served with a crusty roll, this soup is a meal on it's own.
  75 | Mango Sorbet                     | A mango-lover's delight.
  19 | Chunky Vegetable Chowder         | Serve this soup with a crusty roll and it's a hearty meal on a cold winter's eve.
  36 | Lemon Fish Rolls                 | A pleasant way to include fish in your family's diet.
(9 rows)

usa=# SELECT rr.id, rr.name, rr.description FROM recipe_recipes rr WHERE rr.ftiidx ## 's|a';
ERROR:  Your query contained only stopword(s), ignored

usa=# SELECT rr.id, rr.name, rr.description FROM recipe_recipes rr WHERE rr.ftiidx ## 's|x';
 id  |               name               |                                    description
-----+----------------------------------+-----------------------------------------------------------------------------------
 202 | Bird's Nest                      | An egg nestled in a crispy, hot bread roll.
 293 | Reuben Triple S                  | Corn beef, swiss cheese and sauerkraut on pumpernickel.
  30 | Hedgehogs                        | This is comfort food at it's yummiest.
 130 | Hearty Apple & Cinnamon Porridge | A great way to warm you up on a winter's morning.
  83 | Banana & Apple Compote           | Great way to finish a meal on a cool winter's day.
 139 | Minestrone                       | Served with a crusty roll, this soup is a meal on it's own.
  75 | Mango Sorbet                     | A mango-lover's delight.
  19 | Chunky Vegetable Chowder         | Serve this soup with a crusty roll and it's a hearty meal on a cold winter's eve.
  36 | Lemon Fish Rolls                 | A pleasant way to include fish in your family's diet.
(9 rows)

usa=# SELECT rr.id, rr.name, rr.description FROM recipe_recipes rr WHERE rr.ftiidx ## 'st|a';
ERROR:  Your query contained only stopword(s), ignored

usa=# SELECT rr.id, rr.name, rr.description FROM recipe_recipes rr WHERE rr.ftiidx ## 'st|ar';
 id | name | description
----+------+-------------
(0 rows)

I don't see how that's correct? Those ERRORs seem to be valid syntax to me...

Chris
Actually, looking at this again, it's possible that tsearch sees 'a' as a skip word and so doesn't allow a search on it. This makes it _really_ hard for me to parse and check user keywords - maybe an 'isvalidsyntax' sort of function should be included?

Hmmm...maybe I could use the cast to ::mquery_txt to check it...but now I have to detect an ERROR condition and deal with it appropriately...

Chris

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Christopher
> Kings-Lynne
> Sent: Thursday, 15 August 2002 1:43 PM
> To: Hackers
> Subject: [HACKERS] tsearch bug in 7.2.1?
>
> [snip]
tsearch has a compiled-in stop-list; it's currently just not as flexible as OpenFTS. We plan to move most of the OpenFTS functionality to tsearch but currently have no time. Feel free to join us to speed up tsearch development.

Oleg

On Thu, 15 Aug 2002, Christopher Kings-Lynne wrote:

> Actually, looking at this again it's possible that tsearch sees 'a' as a
> skip word and so doesn't allow a search on it. This makes it _really_ hard
> for me to parse and check user keywords - maybe a 'isvalidsyntax' sort of
> function should be included? Hmmm...maybe I could use the cast to
> ::mquery_txt to check it...but now I have to detect an ERROR condition and
> deal with it appropriately...
>
> Chris

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
> tsearch has compiled-in stop-list, it's currently just not flexible
> as OpenFTS does. We plan to move most functionality to tsearch but
> currently have no time. Feel free to join us to speedup tsearch
> development.

Unfortunately I'm just as time-deprived :(

Chris
On Thu, Aug 15, 2002 at 11:59:20AM +0300, Oleg Bartunov wrote:
> tsearch has compiled-in stop-list, it's currently just not flexible
> as OpenFTS does. We plan to move most functionality to tsearch but
> currently have no time. Feel free to join us to speedup tsearch
> development.

Oleg -
I think Chris's issue might be the same one I ran into just last night. (BTW, thanks for tsearch and the OpenFTS work, it's really great.) My problem is that queries with only stopwords throw an ERROR, rather than a WARNING or NOTICE. This means we've got to deal with catching an exception so our middleware doesn't spew ugly errors and tracebacks at our end users, and I've got to deal with cleaning up the transaction.

Having the behavior be "issue a notice and return no match" would give us a reasonably functional interface: if I don't implement reading the NOTICE, I get confused users ('huh? "the" doesn't match anything?') rather than irate users ('Your search interface sucks! It keeps crashing!')

Oh, well, off to implement some try: catch: logic.

Ross
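The try/catch handling Ross describes could look roughly like the sketch below. It is only an illustration, assuming Python middleware and a DB-API style connection; the helper name safe_search is hypothetical and not part of tsearch or any driver. It catches the stopword ERROR, rolls back so the connection is usable again, and hands the caller an empty result plus the message.

```python
def safe_search(conn, sql, params):
    """Run a search query; on a database error, roll back and return no rows.

    Assumption: `conn` follows the Python DB-API (cursor(), rollback()).
    Returns (rows, error_message); error_message is None on success.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return cur.fetchall(), None
    except Exception as exc:  # a real driver raises its own Error subclass
        # The failed statement leaves the transaction aborted; roll back
        # so later statements on this connection can still run.
        conn.rollback()
        return [], str(exc)
    finally:
        cur.close()
```

The middleware can then show "no matches" (optionally with the returned message) instead of a traceback when a user searches for something like 'the'.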
Ross - maybe we could work on a little function for tsearch - parse_query() or something like that. It could return true or false depending on whether it would cause tsearch to error or not...

Chris

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Ross J.
> Reedstrom
> Sent: Friday, 16 August 2002 4:59 AM
> To: Oleg Bartunov
> Cc: Christopher Kings-Lynne; Hackers
> Subject: Re: [HACKERS] tsearch bug in 7.2.1?
>
> [snip]
Ross and Chris,

I was reading too fast :-) The problem is actually more complex. We have to distinguish four cases in which a query could return zero results:

1. normal - there are no such words
2. the query consists of stop-words only
3. the query consists of lexemes of non-indexed classes (currently specified in the parser)
4. a combination of 2 and 3

Ideally, we want to inform the middleware about all of these cases.

    Oleg

On Thu, 15 Aug 2002, Ross J. Reedstrom wrote:
> [snip]
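The four cases in Oleg's list can be sketched as a small classifier. This is a toy illustration, not tsearch code: the stop-word set and the "indexed class" rule (alphabetic tokens only) are assumptions made up for the example, and every name here is hypothetical.

```python
from enum import Enum


class EmptyReason(Enum):
    NO_MATCH = 1              # case 1: searchable words, just absent from the data
    STOPWORDS_ONLY = 2        # case 2: nothing but stop-words
    NON_INDEXED_ONLY = 3      # case 3: nothing but non-indexed lexeme classes
    STOP_AND_NON_INDEXED = 4  # case 4: a mix of 2 and 3


# Hypothetical stop list; the real one is compiled into tsearch.
STOP_WORDS = {"a", "the", "of"}


def is_indexed_class(token):
    # Assumption for this sketch: only alphabetic tokens are indexed.
    return token.isalpha()


def classify_empty(tokens):
    """Say why a query made of `tokens` returned zero rows."""
    if not tokens:
        raise ValueError("empty query")
    stop = [t for t in tokens if t.lower() in STOP_WORDS]
    rest = [t for t in tokens if t.lower() not in STOP_WORDS]
    searchable = [t for t in rest if is_indexed_class(t)]
    non_indexed = [t for t in rest if not is_indexed_class(t)]
    if searchable:
        return EmptyReason.NO_MATCH
    if stop and non_indexed:
        return EmptyReason.STOP_AND_NON_INDEXED
    if stop:
        return EmptyReason.STOPWORDS_ONLY
    return EmptyReason.NON_INDEXED_ONLY
```

Middleware that received such a reason code could choose a different message for each case rather than treating every empty result the same way.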
Now you can use:

regression=# select 'the'::mquery_txt;
ERROR:  Your query contained only stopword(s), ignored
regression=# select 'good'::mquery_txt;
 mquery_txt
------------
 'good'
(1 row)

I suggest:

1.
regression=# select 'the'::mquery_txt;
NOTICE:  Your query contained only stopword(s), ignored
 mquery_txt
------------

(1 row)

2. any operation with a void query returns false:
select t from tbl where t ## 'the';
NOTICE:  Your query contained only stopword(s), ignored
 tbl
-----
(0 row)

Christopher Kings-Lynne wrote:
> Ross - maybe we could work on a little function for tsearch - parse_query()
> or something like that. It could return true or false depending on whether
> it would cause tsearch to error or not...
>
> Chris

-- 
Teodor Sigaev
teodor@stack.net
On Fri, 16 Aug 2002, Christopher Kings-Lynne wrote:

> Ross - maybe we could work on a little function for tsearch - parse_query()
> or something like that. It could return true or false depending on whether
> it would cause tsearch to error or not...

In principle, the right way is to parse the query with the same parser and the same dictionaries that were used in indexing! That's the way OpenFTS does its work, so OpenFTS knows whether the resulting query would be void and returns a warning message *before* sending the query to the db. That's why we weren't concerned about the error message returned by tsearch.

But the current implementation of tsearch doesn't have an API to its parser and dictionaries, so you couldn't write parse_query(). I'd suggest writing check_query(), which could use Teodor's suggestion (see his message) - a very cheap select like

select 'good query'::mquery_txt;

This will work after applying the patch, which we'll submit if you agree with Teodor's suggestion.

The "right way" will be possible in a new incarnation of tsearch with all the functionality of OpenFTS.

    Regards,
        Oleg
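A check_query() along the lines Oleg suggests might be sketched as follows. Assumptions, none confirmed by the thread: the patched behaviour is in place (a void query yields an empty mquery_txt value plus a NOTICE instead of an ERROR), the client is Python with a DB-API cursor that uses the %s parameter style, and the driver returns the mquery_txt value as text.

```python
def check_query(cur, text):
    """Return True if `text` casts to a non-void mquery_txt.

    Assumes the patched tsearch: a stopword-only query gives an empty
    value (with a NOTICE) rather than raising an ERROR.
    """
    # The cheap select from the thread: just cast the text, touch no table.
    cur.execute("SELECT %s::mquery_txt", (text,))
    row = cur.fetchone()
    # Void query: no row, NULL, or an empty/whitespace-only value.
    return bool(row and row[0] and str(row[0]).strip())
```

The search form would call this first and only run the real ## query when it returns True, telling the user otherwise that their keywords were all ignored.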
Please apply the attached patch for tsearch to current CVS.

The patch resolves the ERROR problem for non-good query_txt.

-- 
Teodor Sigaev
teodor@stack.net
Attachment
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Teodor Sigaev wrote:
> Please, apply patch for tsearch to current CVS.
> [snip]
> [ application/gzip is not supported, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Patch applied. Thanks.

---------------------------------------------------------------------------

Teodor Sigaev wrote:
> Please, apply patch for tsearch to current CVS.
> [snip]

-- 
  Bruce Momjian
Hi guys,

Hate to keep coming up with these bugs without patches - but I really don't have time to look into the source code atm :(

OK, attached is an example of the problem. Notice how trademarks and copyright symbols are being indexed along with the word. This means that if someone searches for 'balance' in the above data set, they won't find anything.

I'm not sure how this would be handled. In the English language, it'd probably be safe to say that high ascii characters would be stripped from the index? But you'd want to leave accents and stuff in I guess. Tricky.

Anyway, just bringing it to your attention...

Chris
Attachment
On Fri, 23 Aug 2002, Christopher Kings-Lynne wrote:

> OK, attached is an example of the problem. Notice how trademarks and
> copyright symbols are being indexed along with the word. This means that if
> someone searches for 'balance' in the above data set, they won't find
> anything.
>
> I'm not sure how this would be handled. In the English language, it'd
> probably be safe to say that high ascii characters would be stripped from
> the index? But you'd want to leave accents and stuff in I guess. Tricky.

Rather tricky. The problem is that we don't know how to get flex to work with locales. The parser recognizes Latin words ([a-zA-Z]), non-Latin words ([\0200-\0377]) and mixed words ([a-zA-Z\0200-\0377]). Your case (balanceR) is a mixed word. The right way is to have a locale-aware parser that recognizes words properly. We are inclined to give up on flex.

    Regards,
        Oleg
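The three word classes Oleg lists can be illustrated with the same character ranges applied to single-byte text. This is a sketch in Python rather than flex, it is not the tsearch parser, and the sample word (balance followed by a registered-trademark sign, byte 0xAE in Latin-1) is an assumption based on Chris's description of the attachment.

```python
import re

# Same ranges as in the thread, written in hex: \0200-\0377 == \x80-\xff.
LATIN = re.compile(rb"[a-zA-Z]+")
NONLATIN = re.compile(rb"[\x80-\xff]+")
MIXED = re.compile(rb"[a-zA-Z\x80-\xff]+")


def word_class(word, encoding="latin-1"):
    """Classify a word by the byte ranges it uses (single-byte encodings only)."""
    raw = word.encode(encoding)
    if LATIN.fullmatch(raw):
        return "latin"
    if NONLATIN.fullmatch(raw):
        return "nonlatin"
    if MIXED.fullmatch(raw):
        return "mixed"
    return "other"
```

Under these rules word_class("balance\u00ae") is "mixed", so the symbol stays glued to the word in the index and a search for plain 'balance' misses it. The ranges alone cannot tell a trademark sign from an accented letter, which is why a locale-aware parser is needed.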