Thread: Using ISpell dictionary - headaches...

Using ISpell dictionary - headaches...

From
Daniel Chiaramello
Date:
Hi everybody.

Well... I have a problem when trying to install and use an ISpell dictionary (the Thai one to be more precise) with the tsearch feature.

What I am trying to do

I have a table containing a "title" field, and I want to fill a "vectors" field with the following command:
UPDATE thai_table SET vectors = to_tsvector('thai_utf8', coalesce(title,''));

How I installed the Thai dictionary

I installed the "th_TH.dic" and the "th_TH.aff" files (downloaded from http://wiki.services.openoffice.org/wiki/Dictionaries) in a "/usr/local/share/dicts/ispell/" folder, and I executed the following commands:

SET search_path = public;
BEGIN;

INSERT INTO pg_ts_dict (dict_name, dict_init, dict_initoption, dict_lexize, dict_comment)
VALUES (
        'th_spell_utf8',
        'spell_init(internal)',
        'DictFile="/usr/local/share/dicts/ispell/th_TH.dic",AffFile="/usr/local/share/dicts/ispell/th_TH.aff"',
        'spell_lexize(internal,internal,integer)',
        'Thai ISpell dict utf8 encoding'
    );

INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('thai_utf8', 'default', 'th_TH.utf8');

INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'email', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'url', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'host', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'sfloat', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'version', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'uri', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'file', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'float', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'int', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'uint', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'lword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'nlword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'word', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'part_hword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'nlpart_hword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'lpart_hword', '{th_spell_utf8,simple}');

COMMIT;
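
As a quick sanity check after the COMMIT, the rows can be read back from the same three tables. This is only a sketch that reuses the table, column, and configuration names from the statements above; it confirms what was stored, not that the dictionary files themselves load correctly.

```sql
-- Dictionary entry: check that both file paths are spelled as intended.
SELECT dict_name, dict_initoption
  FROM pg_ts_dict
 WHERE dict_name = 'th_spell_utf8';

-- Configuration entry and its locale.
SELECT ts_name, prs_name, locale
  FROM pg_ts_cfg
 WHERE ts_name = 'thai_utf8';

-- Token-type mappings: 16 rows are expected, one per INSERT above.
SELECT tok_alias, dict_name
  FROM pg_ts_cfgmap
 WHERE ts_name = 'thai_utf8'
 ORDER BY tok_alias;
```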


What my problem is

The problem is that, when I execute the query to fill my "vectors" field, psql loses its connection to the server:

server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

(Translated from the original French output.)

I have no idea what may cause this, nor where to look for a way to solve it.

It *may* be because I'm using psql 8.0.3 and not the latest version (but I'm stuck with that version). I'm just hoping that one of you has met a similar problem and solved it, or knows of a site where an ISpell dictionary installation is detailed step by step, so that I can check whether I did something wrong somewhere...

Many thanks for your attention,
Daniel Chiaramello

Re: Using ISpell dictionary - headaches...

From
Oleg Bartunov
Date:
Daniel,

Early versions of tsearch don't directly support OpenOffice dictionaries.

Oleg
On Tue, 22 Jul 2008, Daniel Chiaramello wrote:


     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: Using ISpell dictionary - headaches...

From
Teodor Sigaev
Date:
> It *may* be because I'm using psql 8.0.3 and not the latest version (but
> I'm stuck with that version). I'm just hoping that one of you has met

Upgrade to 8.0.17; there were several fixes in the ISpell code.
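
After an upgrade, a cautious way to re-test is to exercise the dictionary on a single word before running the full UPDATE again, so that any remaining problem shows up on the smallest possible input. The sketch below assumes the tsearch2 contrib module is installed and provides lexize(dict_name, text); the Thai sample word is an arbitrary placeholder, not taken from the original data.

```sql
-- Confirm which server version is actually running.
SELECT version();

-- Call the ISpell dictionary directly on one word (placeholder input).
-- A NULL result only means the word is not recognized by the dictionary.
SELECT lexize('th_spell_utf8', 'สวัสดี');

-- Run the whole configuration on one short string.
SELECT to_tsvector('thai_utf8', 'สวัสดี');

-- Only then try the real data, starting with a single row.
SELECT to_tsvector('thai_utf8', coalesce(title, ''))
  FROM thai_table
 LIMIT 1;
```

If the direct lexize call already drops the connection, the problem is in loading the dictionary files rather than in the table data.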
--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/