Re: tsearch2 column update produces "word too long"error - Mailing list pgsql-general
From | Markus Wollny |
---|---|
Subject | Re: tsearch2 column update produces "word too long"error |
Date | |
Msg-id | 2266D0630E43BB4290742247C891057502B9D366@dozer.computec.de |
Responses |
Re: tsearch2 column update produces "word too
Re: tsearch2 column update produces "word too long"error |
List | pgsql-general |
Hi!

Now I really couldn't code C to save my life, but I managed to elicit
some more debugging info. It's still dumb-user-interaction as suspected,
but this is an issue I have to take into account as a basis; here's the
"patch" for ts_cfg.c:

  if (lenlemm >= MAXSTRLEN)
      ereport(ERROR,
              (errcode(ERRCODE_SYNTAX_ERROR),
!              errmsg("word is too long(%d): %s",lenlemm,lemm)));

Now when I try

  UPDATE ct_com_board_message
  SET ftindex=to_tsvector('default',coalesce(user_login,'') ||' '||
  coalesce(title,'') ||' '|| coalesce(text,''));

I eventually get:

ERROR:  word is too long(2724):
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajjajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajjajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajjajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajjajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajjajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajjajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajjajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajjajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajjajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajjajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaja
jajajajajajajajajajajajajajajajajajajajajajajajajjajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj
ajajajajajajajajajajajajajajajajajajajajajajajajajajajajajaj

This is a brightly shining example of utterly wanton user-stupidity, I
think: A 2k+ string of |:ja:|. Input like that cannot be helped, though -
if he'd been a bit more imaginative, he could have used a few dozen
"Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch" in a row or
anything else; unfortunately there's no app that could automatically
whack a user if he's doing something stupid. But on the other hand I
cannot think of any reason why crap like that should be indexed in the
first place.
Therefore I would like to see some sort of option allowing me to still
use tsearch2 but actually automatically excluding anything exceeding
MAXSTRLEN - so the UPDATE might throw a NOTICE (if anything at all) but
still get on with the rest. An alteration like that does however exceed
my limited abilities with C by far and I don't want to mess with
something I do not fully understand and then use that mess in a
production environment.

Is there a way to get around this problem with oversized words?

Kind regards

   Markus

> -----Original Message-----
> From: Oleg Bartunov [mailto:oleg@sai.msu.su]
> Sent: Friday, 21 November 2003 15:13
> To: Markus Wollny
> Cc: pgsql-general@postgresql.org
> Subject: Re: AW: [GENERAL] tsearch2 column update produces "word too
> long"error
>
>
> On Fri, 21 Nov 2003, Markus Wollny wrote:
>
> > Hello!
> >
> > > From: Oleg Bartunov [mailto:oleg@sai.msu.su]
> > > Sent: Friday, 21 November 2003 13:06
> > > To: Markus Wollny
> > > Cc: pgsql-general@postgresql.org
> > >
> > > Word length is limited by 2K. What's exactly the word
> > > tsearch2 complained on ?
> > > 'Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch'
> > > is fine :)
> >
> > This was a silly example, I know - it is a long word, but not too long
> > to worry a machine. The offending word will surely be much longer, but
> > as a matter of fact, I cannot think of any user actually typing a 2k+
> > string without any spaces in between. I'm not sure on which word
> > tsearch2 complained, it doesn't tell and even logging did not provide me
> > with any more detail:
> >
> > 2003-11-21 14:06:44 [26497] ERROR:  42601: word is too long
> > LOCATION:  parsetext_v2, ts_cfg.c:294
> > STATEMENT:  UPDATE ct_com_board_message
> >         SET
> > ftindex=to_tsvector('default',coalesce(user_login,'') ||' '||
> > coalesce(title,'') ||' '|| coalesce(text,''));
> >
> > Is there some way to find the exact position?
>
> I'm afraid you need to hack ts_cfg.c:294 yourself to print the word
> which's bugging you :)
>
> > > btw, don't forget to configure properly dictionaries, so you
> > > don't have a lot of unique words.
> >
> > I won't forget that; I just wanted to run a quick first test
> > before diving deeper into Ispell and other issues which are as yet a bit
> > of a mystery to me.
> >
> > Kind Regards
> >
> > Markus
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
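
The behaviour asked for in this message (skip any word at or above the
length limit, report it, and carry on with the rest) can be sketched as a
small standalone C filter applied to the text before it is indexed. This
is not tsearch2 code and not a patch for ts_cfg.c: the function name
strip_long_words, the constant MAX_WORD_LEN and the NOTICE text are all
hypothetical, the 2048 figure is only assumed from the "limited by 2K"
remark in the thread, and splitting on whitespace is a simplification of
what the tsearch2 parser really does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limit for this sketch, assumed from the "2K" in the thread;
 * the real tsearch2 limit is whatever MAXSTRLEN is defined as. */
#define MAX_WORD_LEN 2048

/*
 * Copy src into dst, dropping every whitespace-delimited word whose length
 * is >= max_len. Kept words are joined by single spaces.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Returns the number of words that were dropped.
 */
static size_t
strip_long_words(const char *src, char *dst, size_t max_len)
{
    size_t  dropped = 0;
    char   *out = dst;

    assert(src != NULL && dst != NULL && max_len > 0);

    while (*src != '\0')
    {
        /* Skip the whitespace in front of the next word. */
        while (*src != '\0' && isspace((unsigned char) *src))
            src++;

        /* Find the end of the word. */
        const char *start = src;
        while (*src != '\0' && !isspace((unsigned char) *src))
            src++;
        size_t len = (size_t) (src - start);

        if (len == 0)
            break;              /* only trailing whitespace was left */

        if (len >= max_len)
        {
            /* Report the length and a short prefix only (at most 20
             * bytes), so the log is not flooded with the whole word. */
            fprintf(stderr, "NOTICE: skipping word of length %zu: %.20s...\n",
                    len, start);
            dropped++;
            continue;
        }

        if (out != dst)
            *out++ = ' ';
        memcpy(out, start, len);
        out += len;
    }
    *out = '\0';
    return dropped;
}
```

Called as strip_long_words(text, buf, MAX_WORD_LEN) on the concatenated
user_login, title and text before the result is handed to the indexer, a
run like the 2724-character one above would be dropped with a single
NOTICE while the remaining words stay available for indexing. Whether
such a filter is better placed in the application or inside the parser
itself is left open here.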