Thread: Tsearch2 - spanish
Hi,

I have installed postgresql-8.2.4 and tsearch2 with a Spanish dictionary. My problem is:

    prueba=# select to_tsvector('espanol','melón');
    ERROR:  Affix parse error at 506 line

And if I execute:

    prueba=# select lexize('sp','melón');
     lexize
    ---------
     {melon}
    (1 row)

I tried many dictionaries with the same results. I also changed the encoding of the .aff and .dict files (from latin1 to utf8, and from utf8 to iso88591) and got the same error.

Where can I look to resolve this problem?

Line 506 of my dictionary contains:

    flag *J: # isimo
        E > -E, ÍSIMO     # grande grandísimo
        E > -E, ÍSIMOS    # grande grandísimos
        E > -E, ÍSIMA     # grande grandísima
        E > -E, ÍSIMAS    # grande grandísimas
        O > -O, ÍSIMO     # tonto tontísimo
        O > -O, ÍSIMA     # tonto tontísima
        O > -O, ÍSIMOS    # tonto tontísimos
        O > -O, ÍSIMAS    # tonto tontísimas
        L > ÍSIMO         # formal formalísimo
        L > ÍSIMA         # formal formalísima
        L > ÍSIMOS        # formal formalísimos
        L > ÍSIMAS        # formal formalísimas

If I remove the "Í" the error goes away, but the lexeme is incorrect.

I saw the post http://archives.postgresql.org/pgsql-general/2007-07/msg00888.php

Maybe Marcelo has solved the problem. Can you tell me your tsearch2 configuration?

Best regards

PS: I need to resolve this for my work.
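For reference, the recoding step described above (latin1 to utf8) could be done along the following lines. This is only a sketch in Python: the function name `recode_file` and the file names are made up for illustration and are not part of tsearch2. As the message itself reports, recoding the files alone did not make the error go away.

```python
from pathlib import Path


def recode_file(src, dst, src_encoding="latin-1", dst_encoding="utf-8"):
    """Rewrite a text file from src_encoding to dst_encoding."""
    # Work on raw bytes so that line endings are left untouched.
    # Note: latin-1 decoding accepts any byte sequence, so a wrong
    # source encoding will not raise an error here.
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode(dst_encoding))


if __name__ == "__main__":
    # Hypothetical file names; substitute the real .aff and .dict paths.
    for name in ("spanish.aff", "spanish.dict"):
        if Path(name).exists():
            recode_file(name, name + ".utf8")
```

Writing to a new file instead of overwriting keeps the original around in case the guess about the source encoding was wrong.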
> prueba=# select to_tsvector('espanol','melón');
> ERROR: Affix parse error at 506 line

and

> prueba=# select lexize('sp','melón');
> lexize
> ---------
> {melon}
> (1 row)

Looks very strange, can you provide the list of dictionaries and the configuration map?

> I tried many dictionaries with the same results. Also I change the
> codeset of files :aff and dict (from "latin1 to utf8" and "utf8 to
> iso88591") and got the same error
>
> where can I investigate for resolve about this problem?
>
> My dictionary at 506 line had:

Where did you get this file? And what is the encoding/locale setting of your db?

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Hi,

You are right, the output of "show lc_ctype;" was C.

This is what I have now:

    prueba1=# show lc_ctype;
        lc_ctype
    -----------------
     es_MX.ISO8859-1
    (1 row)

To get there I ran

    % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1

(as you said), then "createdb -E iso8859-1 prueba1", and finally installed tsearch2.

The original problem is resolved:

    prueba1=# select to_tsvector('espanol','melón');
     to_tsvector
    -------------
     'melón':1
    (1 row)

But if I change the sentence to this:

    prueba1=# select to_tsvector('espanol','melón perro mordelón');
    server closed the connection unexpectedly
            This probably means the server terminated abnormally
            before or while processing the request.
    The connection to the server was lost. Attempting reset: Failed.
    !>

The connection is lost, but the server is up. Any idea?

The synonym placement is intentional.

Thanks in advance.

On Tue, 18-09-2007 at 21:40 +0400, Teodor Sigaev wrote:
> > LC_CTYPE="POSIX"
>
> Please post the output of the "show lc_ctype;" command. If it's the C locale
> then I can identify the problem: characters with diacritical marks (such as ó)
> are not alpha characters, and the ispell dictionary will fail. To fix that you
> should run initdb with options:
> % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
> or
> % initdb -D /YOUR/PATH -E UTF8 --locale es_ES.UTF8
>
> In the latter case you should also recode all of the dictionary's data files
> to utf8 encoding.
>
> >>> prueba=# select to_tsvector('espanol','melón');
> >>> ERROR: Affix parse error at 506 line
> >> and
> >>> prueba=# select lexize('sp','melón');
> >>> lexize
> >>> ---------
> >>> {melon}
> >>> (1 row)
> sp is a Snowball stemmer, it doesn't require an affix file, so it works.
>
> By the way, why is the synonym dictionary placed after ispell? Is it intentional?
> Usually, the synonym dictionary goes first, then ispell, and after all of them snowball.
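On the dictionary order discussed in the quoted text: as far as I understand tsearch2, a token is handed to each dictionary in the configured order until one of them recognizes it. The toy model below illustrates why the order matters. It is not tsearch2 code; the dictionaries, the word lists and the name `lexize_chain` are all invented for the example.

```python
# Toy stand-ins for dictionaries: each returns a list of lexemes when it
# recognizes the token, or None to pass the token on to the next dictionary.
SYNONYMS = {"perro": ["can"]}                              # hypothetical synonym entry
KNOWN_WORDS = {"perro": ["perro"], "melón": ["melón"]}     # hypothetical ispell-style list


def synonym_dict(token):
    return SYNONYMS.get(token)


def ispell_dict(token):
    return KNOWN_WORDS.get(token)


def stemmer_dict(token):
    # Accepts every token, so nothing placed after it is ever reached.
    # Crude stand-in: it only lowercases, it is not a real Snowball stemmer.
    return [token.lower()]


def lexize_chain(token, chain):
    """Return the output of the first dictionary that recognizes token."""
    for dictionary in chain:
        lexemes = dictionary(token)
        if lexemes is not None:
            return lexemes
    return None


usual_order = [synonym_dict, ispell_dict, stemmer_dict]
ispell_first = [ispell_dict, synonym_dict, stemmer_dict]

print(lexize_chain("perro", usual_order))   # the synonym entry wins
print(lexize_chain("perro", ispell_first))  # ispell answers first, synonym is skipped
```

In this model, putting the synonym dictionary after ispell means synonyms only apply to words ispell does not know, which may or may not be what is wanted.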
> prueba1=# select to_tsvector('espanol','melón perro mordelón');
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>

Hmm, can you provide a backtrace?

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Felipe,

--- Felipe de Jesús Molina Bravo <felipe.molina@inegi.gob.mx> wrote:

> but if I change the sentece for it:
>
> prueba1=# select to_tsvector('espanol','melón perro mordelón');
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>

The same thing happened to me the first time I used Tsearch2 with Spanish. I think you need to patch Snowball with the tsearch_snowball_82 file; if you search on Google you will find instructions on how to do it.

Best regards,
mdc
Hi,

Thanks, Teodor and Marcelo. The problem is solved.

Regards

-----Original Message-----
From: marcelo Cortez [mailto:jmdc_marcelo@yahoo.com.ar]
Sent: Thu 20/09/2007 7:13
To: MOLINA BRAVO FELIPE DE JESUS; Teodor Sigaev
CC: PostgreSQL General
Subject: Re: [GENERAL] Tsearch2 - spanish

The same thing happened to me the first time I used Tsearch2 with Spanish. I think you need to patch Snowball with the tsearch_snowball_82 file; if you search on Google you will find instructions on how to do it.

Best regards,
mdc
Hello group :)

How do I clear bits in a number in PostgreSQL?

In C++ it's:

    0xffffff00 &~ 0x0000ffff

What is it in PostgreSQL, from the psql command line app?

    select ...

Thanks :)
Never mind, I figured it out ...

fails:

    0xffffff00 &~ 0x0000ffff

succeeds:

    0xffffff00 & ~ 0x0000ffff

I had to add a space.

----- Original Message -----
From: "madhtr" <madhtr@schif.org>
To: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: Thursday, September 20, 2007 13:01
Subject: [GENERAL] How to clear bits?

> Hello group :)
>
> How do a clear bits in a number in PostGreSQL?
>
> in c++ its:
>
> 0xffffff00 &~ 0x0000ffff
>
> what is it in PostGreSQL from the psql command line app?
>
> select ...
>
> Thanx:)
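The space most likely matters because PostgreSQL reads a run of operator characters such as `&~` as one operator name, while with the space it sees a binary `&` followed by a unary `~`. The arithmetic itself can be checked outside the database; here is a small Python sketch (the helper name `clear_bits` and the 32-bit mask are my own choices, made to mirror a C++ unsigned int):

```python
MASK_32 = 0xFFFFFFFF  # keep results within 32 bits, like a C++ unsigned int


def clear_bits(value, bits):
    """Return value with every bit that is set in `bits` cleared."""
    # Unary NOT is applied to `bits` first, then binary AND combines the two.
    return value & ~bits & MASK_32


result = clear_bits(0xFFFFFF00, 0x0000FFFF)
print(hex(result))  # 0xffff0000
```

So the expected answer for the example in the question is 0xffff0000: the low 16 bits are cleared and the rest is left alone.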