Re: Very bad FTS performance with the Polish config - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Very bad FTS performance with the Polish config
Date
Msg-id 162867790911180738u17d1b6e1o3cf43062882b5e20@mail.gmail.com
In response to Re: Very bad FTS performance with the Polish config  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
2009/11/18 Oleg Bartunov <oleg@sai.msu.su>:
> On Wed, 18 Nov 2009, Wojciech Knapik wrote:
>
>>
>>> your polish_english and polish configurations use an ispell dictionary,
>>> which is slow, while the english configuration doesn't contain ispell. So
>>> what is your complaint? Try adding an ispell dictionary to the english
>>> configuration and compare the timings.
>>
>> Oh, so this is not anomalous ? These are the expected speeds for an ispell
>> dictionary ? I didn't realize that. Sorry for the bother then. It just
>> seemed way too slow to be practical.
>
> You can see the real timings by using the ts_lexize() function with
> different dictionaries (try several times to avoid the cold-start problem)
> instead of ts_headline(), which involves other factors.
>
> On my test machine I see no real difference between the very simple
> dictionary and the French ispell and snowball dictionaries:
>

It depends on the language (and on the dictionary size).

For Czech:

postgres=# select ts_lexize('simple','vody');
 ts_lexize
-----------
 {vody}
(1 row)

Time: 0.785 ms

postgres=# select ts_lexize('cspell','vody');
 ts_lexize
-----------
 {voda}
(1 row)

Time: 1.041 ms

I'm afraid that Czech and Polish are very hard languages for ispell (they need long affix files).
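
Single calls like the ones above are dominated by per-statement overhead, so a rough way to compare dictionaries might be to lexize many tokens in one statement. The following is only a sketch under some assumptions: 'cspell' is the Czech ispell dictionary from my installation (substitute your own), the three sample words and the row count are arbitrary, and the token is made to depend on g because a call with constant arguments may be evaluated just once at plan time.

```sql
\timing on

-- Warm-up: the first call in a session has to load the dictionary files,
-- so its timing should be ignored.
SELECT ts_lexize('cspell', 'vody');

-- Baseline: lexize 30000 tokens with the 'simple' dictionary.
-- count(expr) forces ts_lexize() to be evaluated for every row.
SELECT count(ts_lexize('simple', (ARRAY['vody','voda','vodou'])[1 + g % 3]))
FROM generate_series(1, 30000) AS g;

-- The same tokens through the ispell dictionary (hypothetical name 'cspell').
SELECT count(ts_lexize('cspell', (ARRAY['vody','voda','vodou'])[1 + g % 3]))
FROM generate_series(1, 30000) AS g;
```

Dividing each reported time by the number of rows should give an approximate per-token cost for each dictionary. Running both statements a few times would show how stable the numbers are.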

Regards
Pavel

> dev-oleg=# select ts_lexize('simple','voila');
>  ts_lexize
> -----------
>  {voila}
> (1 row)
>
> Time: 0.282 ms
> dev-oleg=# select ts_lexize('simple','voila');
>  ts_lexize
> -----------
>  {voila}
> (1 row)
>
> Time: 0.269 ms
>
> dev-oleg=# select ts_lexize('french_stem','voila');
>  ts_lexize
> -----------
>  {voil}
> (1 row)
>
> Time: 0.187 ms
>
> I see no big difference in ts_headline as well:
>
> dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
>      ts_headline
> -----------------------
>  I can do <b>voila</b>
> (1 row)
>
> Time: 0.265 ms
> dev-oleg=# select ts_headline('nomaofr','I can do voila', 'voila'::tsquery);
>      ts_headline
> -----------------------
>  I can do <b>voila</b>
> (1 row)
>
> Time: 0.299 ms
>
> This is PostgreSQL version 8.4.1.
>
>        Regards,
>                Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83

