Re: Tsearch2 performance on big database - Mailing list pgsql-performance
From | Rick Jansen |
---|---|
Subject | Re: Tsearch2 performance on big database |
Date | |
Msg-id | 42404263.40604@rockingstone.nl Whole thread Raw |
In response to | Re: Tsearch2 performance on big database (Oleg Bartunov <oleg@sai.msu.su>) |
Responses |
Re: Tsearch2 performance on big database
|
List | pgsql-performance |
Oleg Bartunov wrote: > Mike, > > no comments before Rick post tsearch configs and increased buffers ! > Union shouldn't be faster than (term1|term2). > tsearch2 internals description might help you understanding tsearch2 > limitations. > See http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals > Also, don't miss my notes: > http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes > > Oleg Thanks Oleg, i've seen those pages before :) I've set shared_buffers to 45000 now (yes thats probably very much, isn't it?) and it already seems a lot quicker. How do I find out what my tsearch config is? I followed the intro (http://www.sai.msu.su/~megera/oddmuse/index.cgi/tsearch-v2-intro) and applied it to our books table, thats all, didnt change anything else about configs. > On Tue, 22 Mar 2005, Mike Rylander wrote: >> Ahh... I should have qualified my claim. I am creating a google-esqe >> search interface and almost every query uses '&' as the term joiner. >> 'AND' queries and one-term queries are orders of magnitude faster than >> 'OR' queries, and fortunately are the expected default for most users. >> (Think, "I typed in these words, therefore I want to match these >> words"...) An interesting test may be to time multiple queries >> independently, one for each search term, and see if the combined cost >> is less than a single 'OR' search. If so, you could use UNION to join >> the results. Well I just asked my colleges and OR queries arent used by us anyway, so I'll test for AND queries instead. >> However, the example you originally gave ('terminology') should be >> very fast. On a comparable query ("select count(value) from >> metabib.full_rec where index_vector @@ to_tsquery('default','jane');") >> I get 12ms. ilab=# select count(*) from books where idxfti @@ to_tsquery('default', 'jane'); count ------- 4093 (1 row) Time: 217395.820 ms :( ilab=# explain analyze select count(*) from books where idxfti @@ to_tsquery('default', 'jane'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=16591.95..16591.95 rows=1 width=0) (actual time=4634.931..4634.932 rows=1 loops=1) -> Index Scan using idxfti_idx on books (cost=0.00..16581.69 rows=4102 width=0) (actual time=0.395..4631.454 rows=4093 loops=1) Index Cond: (idxfti @@ '\'jane\''::tsquery) Total runtime: 4635.023 ms (4 rows) Time: 4636.028 ms ilab=# >> 16G of RAM on a dedicated machine. >> >> >> shared_buffers = 15000 # min 16, at least max_connections*2, >> 8KB each >> work_mem = 10240 # min 64, size in KB >> maintenance_work_mem = 1000000 # min 1024, size in KB >> # big m_w_m for loading data... >> >> random_page_cost = 2.5 # units are one sequential page fetch >> cost >> # fast drives, and tons of RAM >> Right.. well I'll try copying these settings, see how that works out, thanks :) Rick -- Systems Administrator for Rockingstone IT http://www.rockingstone.com http://www.megabooksearch.com - Search many book listing sites at once
pgsql-performance by date: