Tsearch2 crashes my backend, ouch ! - Mailing list pgsql-general
From | Listmail |
---|---|
Subject | Tsearch2 crashes my backend, ouch ! |
Date | |
Msg-id | op.tpz1sgd2zcizji@apollo13 |
List | pgsql-general |
Hello,

I have just ditched Gentoo and installed a brand new kubuntu system (was tired of the endless compiles).

I have a problem with crashing tsearch2. This appeared both on Gentoo and the brand new kubuntu.

I will describe all my install procedure, maybe I'm doing something wrong.

Cluster is newly created and empty.

initdb was done with UNICODE encoding & locales.

    # from postgresql.conf
    # These settings are initialized by initdb -- they might be changed
    lc_messages = 'fr_FR.UTF-8'    # locale for system error message strings
    lc_monetary = 'fr_FR.UTF-8'    # locale for monetary formatting
    lc_numeric = 'fr_FR.UTF-8'     # locale for number formatting
    lc_time = 'fr_FR.UTF-8'        # locale for time formatting

    peufeu@apollo13:~$ locale
    LANG=fr_FR.UTF-8
    LC_CTYPE="fr_FR.UTF-8"
    LC_NUMERIC="fr_FR.UTF-8"
    etc...

First import needed .sql files from contrib and check that the default tsearch2 config works for English

    $ createdb -U postgres test
    $ psql -U postgres test <tsearch2.sql and other contribs I use
    $ psql -U postgres test

    test=# select lexize( 'en_stem', 'flying' );
     lexize
    --------
     {fli}

    test=# select to_tsvector('default', 'flying ducks');
       to_tsvector
    ------------------
     'fli':1 'duck':2

OK, seems to work very nicely, now install French.
Since this is Kubuntu there is no source, so download source, then :

- apply patch_tsearch_snowball_82 from tsearch2 website

    ./configure --prefix=/usr/lib/postgresql/8.2/ --datadir=/usr/share/postgresql/8.2 --enable-nls=fr --with-python
    cd contrib/tsearch2
    make
    cd gendict
    (copy french stem.c and stem.h from the snowball website)
    ./config.sh -n fr -s -p french_UTF_8 -i -v -c stem.c -h stem.h -C'Snowball stemmer for French'
    cd ../../dict_fr
    make clean && make
    sudo make install

Now we have :

    /bin/sh ../../config/install-sh -c -m 644 dict_fr.sql '/usr/share/postgresql/8.2/contrib'
    /bin/sh ../../config/install-sh -c -m 755 libdict_fr.so.0.0 '/usr/lib/postgresql/8.2/lib/dict_fr.so'

Okay...

- download and install UTF8 french dictionaries from http://www.davidgis.fr/download/tsearch2_french_files.zip and put them in contrib directory
  (the files delivered by debian package ifrench are ISO8859, bleh)

- import french shared libs

    psql -U postgres test < /usr/share/postgresql/8.2/contrib/dict_fr.sql

Then :

    test=# select lexize( 'en_stem', 'flying' );
     lexize
    --------
     {fli}

And :

    test=# select * from pg_ts_dict where dict_name ~ '^(fr|en)';
     dict_name |       dict_init       |   dict_initoption    |              dict_lexize              |        dict_comment
    -----------+-----------------------+----------------------+---------------------------------------+-----------------------------
     en_stem   | snb_en_init(internal) | contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
     fr        | dinit_fr(internal)    |                      | snb_lexize(internal,internal,integer) | Snowball stemmer for French

    test=# select lexize( 'fr', 'voyageur' );
    server closed the connection unexpectedly

BLAM ! Try something else :

    test=# UPDATE pg_ts_dict SET dict_initoption='/usr/share/postgresql/8.2/contrib/french.stop' WHERE dict_name = 'fr';
    UPDATE 1
    test=# select lexize( 'fr', 'voyageur' );
    server closed the connection unexpectedly

Try other options :

    dict_name       | fr_ispell
    dict_init       | spell_init(internal)
    dict_initoption | DictFile="/usr/share/postgresql/8.2/contrib/french.dict",AffFile="/usr/share/postgresql/8.2/contrib/french.aff",StopFile="/usr/share/postgresql/8.2/contrib/french.stop"
    dict_lexize     | spell_lexize(internal,internal,integer)
    dict_comment    |

    test=# select lexize( 'en_stem', 'traveler' ), lexize( 'fr_ispell', 'voyageur' );
    -[ RECORD 1 ]-------
    lexize | {travel}
    lexize | {voyageuse}

Now it works (kinda) but stemming doesn't stem for French (since snowball is out). It should return 'voyage' (=travel) instead of 'voyageuse' (=female traveler).

That's not what I want ; I want to use snowball to stem French words.

I'm going to make a debug build and try to debug it, but if anyone can help, you're really, really welcome.
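
For the debug build mentioned above, here is a rough sketch of how a backtrace could be captured from the crashing backend. It assumes gdb is installed, that the server was rebuilt with ./configure --enable-debug, and that the backend PID was read in the psql session with select pg_backend_pid(); before running the failing lexize() call. backtrace_cmd is a hypothetical helper name made up for this example, not something shipped with PostgreSQL or tsearch2. It only prints the gdb command line, so it can be checked before being run.

```shell
#!/bin/sh
# Sketch only: build the gdb command that attaches to one PostgreSQL backend
# and prints a stack trace once the backend receives a fatal signal.
# Assumption: backtrace_cmd is a made-up helper name for this example.

backtrace_cmd() {
    pid="$1"
    # Accept only a non-empty, all-digit PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: backtrace_cmd <backend pid>" >&2
            return 1
            ;;
    esac
    # -p attaches to the running process, "continue" lets the backend run
    # until it stops on a signal, "bt" then prints the backtrace.
    printf 'gdb -p %s -batch -ex continue -ex bt\n' "$pid"
}

# Example with a placeholder PID; replace 12345 with the real backend PID.
backtrace_cmd 12345
```

The printed command would be run as root or as the user that owns the server process, in a second terminal, and then the crashing query (select lexize( 'fr', 'voyageur' );) would be issued in the psql session that owns that PID.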