Re: FTS performance with the Polish config - Mailing list pgsql-performance

From Oleg Bartunov
Subject Re: FTS performance with the Polish config
Date
Msg-id Pine.LNX.4.64.0911151201360.6801@sn.sai.msu.ru
In response to Re: FTS performance with the Polish config  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: FTS performance with the Polish config
List pgsql-performance
On Sun, 15 Nov 2009, Pavel Stehule wrote:

> 2009/11/15 Oleg Bartunov <oleg@sai.msu.su>:
>> Yes, as stated, the original author uses a Polish ispell dictionary.
>> An ispell dictionary is slow to load the first time. In real life it should
>> be no problem.
>>
>
> It is a problem. People who need fast access use english without
> czech. It drops some features, but it is significantly faster.

Just don't use an ispell dictionary; the czech snowball stemmer is as fast as
the english one.

An ispell dictionary (english or any other language, it doesn't matter) is slow
on the first load and is then cached, so there is no problem if you use
persistent database connections, which are the de facto standard for any
serious project.
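
To illustrate the point about persistent connections, here is a toy cost model in Python. It is only a sketch, not PostgreSQL code: the `Session` class, the cost units, and the one-word dictionary are all made up for illustration. The one assumption taken from the discussion above is that each connection pays the dictionary load once, on first use, and then reuses the cached copy.

```python
# Toy model only: nothing here talks to a real database.

class Session:
    """Hypothetical stand-in for one database connection."""

    def __init__(self, load_cost):
        self.load_cost = load_cost  # assumed cost units to load the ispell files
        self.dictionary = None      # per-session cache, empty until first use
        self.spent = 0              # total cost units spent in this session

    def lexize(self, word):
        if self.dictionary is None:
            # First use in this session: pay the one-time load cost.
            self.spent += self.load_cost
            self.dictionary = {"koty": "kot"}  # tiny made-up dictionary
        self.spent += 1  # assumed cost of one lookup in the cached dictionary
        return self.dictionary.get(word, word)


def cost_persistent(queries, load_cost):
    """All queries share one long-lived session."""
    session = Session(load_cost)
    for _ in range(queries):
        session.lexize("koty")
    return session.spent


def cost_reconnecting(queries, load_cost):
    """Every query opens a fresh session, so the load is paid each time."""
    total = 0
    for _ in range(queries):
        session = Session(load_cost)
        session.lexize("koty")
        total += session.spent
    return total
```

With these assumed numbers, `cost_persistent(100, 1000)` comes to 1100 units, while `cost_reconnecting(100, 1000)` comes to 100100, which is why the first-load cost only hurts when connections are short-lived.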

>
> Pavel
>
>> Oleg
>> On Sat, 14 Nov 2009, Pavel Stehule wrote:
>>
>>> 2009/11/14 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> Kenneth Marshall <ktm@rice.edu> writes:
>>>>> On Sat, Nov 14, 2009 at 12:25:05PM +0100, Wojciech Knapik wrote:
>>>>>> I just finished implementing a "search engine" for my site and found
>>>>>> ts_headline extremely slow when used with a Polish tsearch
>>>>>> configuration,
>>>>>> while fast with English.
>>>>
>>>>> The documentation for ts_headline() states:
>>>>> ts_headline uses the original document, not a tsvector summary, so it
>>>>> can be slow and should be used with care.
>>>>
>>>> That's true but the argument in the docs would apply just as well to
>>>> english or any other config.  So while Wojciech would be well advised
>>>> to try to avoid making a lot of calls to ts_headline, it's still curious
>>>> that it's so much slower in polish than english.  Could we see a
>>>> self-contained test case?
>>>
>>> is it dictionary based or stem based?
>>>
>>> Dictionary-based FTS is very slow (first load). At least czech FTS is
>>> slow.
>>>
>>> regards
>>> Pavel Stehule
>>>
>>>>
>>>>                         regards, tom lane
>>>>
>>>> --
>>>> Sent via pgsql-performance mailing list
>>>> (pgsql-performance@postgresql.org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-performance
>>>>
>>>
>>
>>        Regards,
>>                Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
