Re: Tsearch2 performance on big database - Mailing list pgsql-performance

From Oleg Bartunov
Subject Re: Tsearch2 performance on big database
Date
Msg-id Pine.GSO.4.62.0503241343580.5508@ra.sai.msu.su
Whole thread Raw
In response to Re: Tsearch2 performance on big database  (Rick Jansen <rick@rockingstone.nl>)
Responses Re: Tsearch2 performance on big database  (Rick Jansen <rick@rockingstone.nl>)
List pgsql-performance
On Thu, 24 Mar 2005, Rick Jansen wrote:

> Oleg Bartunov wrote:
>> from my notes
>> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes
>>
>> It's usefull to see words statistics, for example, to check how good
>> your dictionaries work or how did you configure pg_ts_cfgmap. Also, you
>> may notice probable stop words relevant for your collection. Tsearch
>> provides stat() function:
>>
>> .......................
>>
>> Don't hesitate to read it and if you find some bugs or know better wording
>> I'd be glad to improve my notes.
>>
>
> Thanks, but that stat() query takes way too long.. I let it run for like
> 4 hours and still nothing. The database I am testing tsearch2 on is also
> the production database (mysql) server so I have to be careful not to
> use too many resources :o

stat() is indeed a bigdog, it was designed for developers needs,
so we recommend to save results in table.

>
> Anyway, here's my pg_ts_cfgmap now (well the relevant bits):
>
> default_english | lhword       | {en_ispell,en_stem}
> default_english | lpart_hword  | {en_ispell,en_stem}
> default_english | lword        | {en_ispell,en_stem}
>
> Is it normal that queries for single words (or perhaps they are words
> that are common) take a really long time? Like this:
>

'hispanic' isn't common, I see you get only  674 rows and
'buckingham & palace'  returns 185 rows. Did you run 'vacuum analyze' ?
I see a big discrepancy between estimated rows (8041) and actual rows.



> ilab=# explain analyze select count(*) from books where description_fti @@
> to_tsquery('default', 'hispanic');
>                                                                QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate  (cost=20369.81..20369.81 rows=1 width=0) (actual
> time=261512.031..261512.031 rows=1 loops=1)
>   ->  Index Scan using idxfti_idx on books  (cost=0.00..20349.70 rows=8041
> width=0) (actual time=45777.760..261509.288 rows=674 loops=1)
>         Index Cond: (description_fti @@ '\'hispan\''::tsquery)
> Total runtime: 261518.529 ms
> (4 rows)
>
> ilab=# explain analyze select titel from books where description_fti @@
> to_tsquery('default', 'buckingham & palace');
> QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using idxfti_idx on books  (cost=0.00..20349.70 rows=8041
> width=57) (actual time=18992.045..48863.385 rows=185 loops=1)
>   Index Cond: (description_fti @@ '\'buckingham\' & \'palac\''::tsquery)
> Total runtime: 48863.874 ms
> (3 rows)
>
>
> I dont know what happened, these queries were a lot faster 2 days ago..what
> the feck is going on?!
>
> Rick
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

pgsql-performance by date:

Previous
From: Rick Jansen
Date:
Subject: Re: Tsearch2 performance on big database
Next
From: Rick Jansen
Date:
Subject: Re: Tsearch2 performance on big database