Re: full text search to_tsquery performance with ispell dictionary - Mailing list pgsql-general

From Pavel Stehule
Subject Re: full text search to_tsquery performance with ispell dictionary
Date
Msg-id BANLkTik9pr1B1zy+0kFqmSbFv3rVMYV=Kw@mail.gmail.com
In response to full text search to_tsquery performance with ispell dictionary  (Stanislav Raskin <raskin@livn.de>)
Responses Re: full text search to_tsquery performance with ispell dictionary  (Stanislav Raskin <raskin@livn.de>)
List pgsql-general
Hello

2011/5/11 Stanislav Raskin <raskin@livn.de>:
> Hello everybody,
> I was experimenting with the FTS feature on postgres 8.3.4 lately and
> encountered a weird performance issue when using a custom FTS configuration.
> I use this german ispell dictionary, re-encoded to utf8:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
> With the following configuration:
>
> CREATE TEXT SEARCH CONFIGURATION public.german_de (COPY = pg_catalog.german);
>
> CREATE TEXT SEARCH DICTIONARY german_de_ispell (
>     TEMPLATE = ispell,
>     DictFile = german_de_utf8,
>     AffFile = german_de_utf8,
>     StopWords = german_de_utf8
> );
>
> ALTER TEXT SEARCH CONFIGURATION german_de
>     ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>                       word, hword, hword_part
>     WITH german_de_ispell, german_stem;
>
> So far so good. Indexing and creation of tsvectors works like a charm.
> The problem is that if I open a new connection to the database and do
> something like this
> SELECT to_tsquery('german_de', 'abcd');
> it takes A LOT of time for the query to complete the first time: about
> 1-1.5 s. If I submit the same query a second, third, fourth time and so
> on, it takes only some 10-20 ms, which is what I would expect.
> It almost seems as if the dictionary is somehow analyzed or indexed and the
> results cached for each connection, which seems counter-intuitive to me.
> After all, the dictionaries should not change that often.
> Did I miss something or did I do something wrong?
> I'd be thankful for any advice.
> Kind Regards

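This is expected behavior :( . Loading an ispell dictionary is very slow,
and each new connection loads it again on first use.

Use the German Snowball stemmer instead.

You can also use connection pooling software, so that connections are
reused and the dictionary is loaded less often.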
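
A minimal sketch of the Snowball suggestion, assuming the german_de
configuration from the quoted message and the built-in german_stem
dictionary. It remaps the same token types to the stemmer alone, so no
ispell files have to be loaded on first use. This is one possible setup,
not a tested recommendation; note that a Snowball stemmer does not split
compound words the way the compound ispell dictionary can.

```sql
-- Sketch only: remap the custom configuration from the question so that it
-- uses just the built-in Snowball stemmer (german_stem) and skips ispell.
ALTER TEXT SEARCH CONFIGURATION german_de
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_stem;

-- Same query as in the question; it no longer goes through german_de_ispell.
SELECT to_tsquery('german_de', 'abcd');

-- Alternatively, the stock configuration already maps these token types
-- to german_stem.
SELECT to_tsquery('german', 'abcd');
```

Existing tsvector columns and indexes built with the ispell mapping would
presumably need to be regenerated so that stored lexemes match the new
query lexemes.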

Regards

Pavel Stehule

> --
>
> Stanislav Raskin
>
> livn GmbH
> Campus Freudenberg
> Rainer-Gruenter-Str. 21
> 42119 Wuppertal
>
> +49(0)202-8 50 66 921
> raskin@livn.de
> http://www.livn.de
>
> livn
> local individual video news GmbH
> Registergericht Wuppertal HRB 20086
>
> Geschäftsführer:
> Dr. Stefan Brües
> Alexander Jacob
