Re: tsearch2 poor performance - Mailing list pgsql-admin
From | Kris Kiger |
---|---|
Subject | Re: tsearch2 poor performance |
Date | |
Msg-id | 415498AE.40000@musicrebellion.com Whole thread Raw |
In response to | tsearch2 poor performance (Kris Kiger <kris@musicrebellion.com>) |
List | pgsql-admin |
Here is the explain analyze output, funny thing, after I ran josh's query, mine ran a lot faster....maybe it forced a caching?; search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('oil'); Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual time=19643.372..19643.376 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3001 width=0) (actual time=0.381..18145.917 rows=226357 loops=1) Index Cond: (vector @@ '\'oil\''::tsquery) Filter: (vector @@ '\'oil\''::tsquery) Total runtime: 19643.597 ms search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('hydrogen'); Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual time=19629.766..19629.769 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3001 width=0) (actual time=0.378..18127.573 rows=226868 loops=1) Index Cond: (vector @@ '\'hydrogen\''::tsquery) Filter: (vector @@ '\'hydrogen\''::tsquery) Total runtime: 19629.992 ms Here is Josh's; search_test=# explain analyze SELECT count(q) FROM product, to_tsquery('oil') AS q WHERE vector @@ q; Aggregate (cost=6150597.03..6150597.03 rows=1 width=32) (actual time=21769.526..21769.530 rows=1 loops=1) -> Nested Loop (cost=0.00..6143097.02 rows=3000001 width=32) (actual time=0.424..20450.208 rows=226357 loops=1) -> Function Scan on q (cost=0.00..12.50 rows=1000 width=32) (actual time=0.023..0.031 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3000 width=32) (actual time=0.376..18165.415 rows=226357 loops=1) Index Cond: (product.vector @@ "outer".q) Filter: (product.vector @@ "outer".q) Total runtime: 21769.786 ms Disabling Index usage slowed it down: search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('hydrogen'); Aggregate (cost=347259.51..347259.51 rows=1 width=0) (actual time=24675.933..24675.936 rows=1 loops=1) -> Seq Scan on product (cost=0.00..347252.00 rows=3001 width=0) (actual time=0.320..23164.492 rows=226868 loops=1) Filter: (vector @@ '\'hydrogen\''::tsquery) Total runtime: 24676.091 ms Time: 24678.842 ms search_test=# explain analyze SELECT count(q) FROM product, to_tsquery('oil') AS q WHERE vector @@ q; Aggregate (cost=67847264.50..67847264.50 rows=1 width=32) (actual time=83631.201..83631.204 rows=1 loops=1) -> Nested Loop (cost=12.50..67839764.50 rows=3000001 width=32) (actual time=0.214..82294.710 rows=226357 loops=1) Join Filter: ("outer".vector @@ "inner".q) -> Seq Scan on product (cost=0.00..339752.00 rows=3000000 width=32) (actual time=0.107..27563.952 rows=3000000 loops=1) -> Materialize (cost=12.50..22.50 rows=1000 width=32) (actual time=0.003..0.006 rows=1 loops=3000000) -> Function Scan on q (cost=0.00..12.50 rows=1000 width=32) (actual time=0.019..0.023 rows=1 loops=1) Total runtime: 83631.385 ms Here are the results of stat: search_test=# select * from stat('select vector from product') order by ndoc desc, nentry; word | ndoc | nentry --------------+---------+--------- anoth | 1187386 | 1477442 bear | 696668 | 780963 take | 675319 | 736410 relat | 491469 | 528259 toward | 490653 | 528369 defin | 490572 | 527099 live | 490538 | 527401 beyond | 490124 | 527957 behind | 490087 | 527735 insid | 489530 | 527074 near | 489504 | 527721 around | 489244 | 526870 mean | 478201 | 512699 complex | 440339 | 468669 light | 438685 | 468140 ball | 438567 | 468168 pit | 438293 | 467807 dress | 438128 | 467260 player | 437633 | 466753 secret | 433279 | 457246 love | 423777 | 442694 give | 423691 | 441305 need | 423336 | 434409 peopl | 423336 | 434409 believ | 423336 | 434409 rememb | 423336 | 434409 howev | 421762 | 434194 real | 419906 | 435074 furthermor | 416672 | 434413 indic | 416508 | 434919 exampl | 416508 | 434919 alway | 415543 | 432861 sometim | 415543 | 432861 see | 410706 | 434586 inde | 408379 | 434283 fruit | 363203 | 381862 cook | 362674 | 381112 graduat | 362444 | 381284 chees | 362358 | 381040 hesit | 307431 | 317550 self | 301001 | 312312 hard | 300138 | 310167 spirit | 299310 | 312092 know | 298246 | 309010 laugh | 294136 | 302392 make | 287633 | 295003 find | 287550 | 294770 goe | 279336 | 287025 team | 228000 | 234703 footbal | 228000 | 234703 void | 227914 | 234681 formless | 227914 | 234681 board | 227907 | 234797 chess | 227907 | 234797 submarin | 227869 | 234727 inferior | 227858 | 234357 viper | 227855 | 234865 cylind | 227847 | 234505 suit | 227822 | 234376 class | 227822 | 234376 action | 227822 | 234376 diskett | 227802 | 234786 roller | 227792 | 234524 coaster | 227792 | 234524 mate | 227785 | 234431 ritual | 227785 | 234431 engin | 227784 | 234575 steam | 227784 | 234575 industri | 227780 | 234312 fire | 227775 | 234532 hydrant | 227775 | 234532 briar | 227769 | 234524 patch | 227769 | 234524 mastadon | 227677 | 234665 defend | 227617 | 234410 blade | 227603 | 234356 razor | 227603 | 234356 cab | 227578 | 234554 driver | 227578 | 234554 cough | 227570 | 234324 syrup | 227570 | 234324 cowboy | 227566 | 234663 chop | 227564 | 234437 pork | 227564 | 234437 ceo | 227557 | 234760 rattlesnak | 227554 | 234323 hell | 227540 | 234313 flavor | 227540 | 234313 maelstrom | 227537 | 234404 mulch | 227531 | 234311 cyprus | 227531 | 234311 tack | 227525 | 234462 carpet | 227525 | 234462 movi | 227505 | 234207 theater | 227505 | 234207 spider | 227466 | 234524 cone | 227463 | 234198 pine | 227463 | 234198 garbag | 227459 | 234207 beer | 227443 | 234077 bottl | 227443 | 234077 polygon | 227438 | 234267 judg | 227425 | 234565 blith | 227409 | 233979 traffic | 227403 | 234051 paper | 227397 | 234028 napkin | 227397 | 234028 apart | 227393 | 233911 build | 227393 | 233911 cocker | 227368 | 233926 spaniel | 227368 | 233926 bay | 227358 | 234261 cargo | 227358 | 234261 order | 227357 | 233885 short | 227357 | 233885 polar | 227326 | 234118 demon | 227324 | 234442 minivan | 227317 | 234292 bulb | 227314 | 234089 fundrais | 227308 | 234235 eggplant | 227306 | 234202 cake | 227299 | 234075 bowl | 227299 | 234110 paycheck | 227295 | 234224 sheriff | 227292 | 234313 turkey | 227271 | 234267 turn | 227265 | 234210 signal | 227265 | 234210 chestnut | 227250 | 234104 hole | 227239 | 233975 puncher | 227239 | 233975 tabloid | 227238 | 234341 microscop | 227236 | 234067 reclin | 227234 | 233946 dolphin | 227231 | 234080 pen | 227222 | 234269 pig | 227222 | 234269 wed | 227221 | 233860 bullfrog | 227211 | 234144 truck | 227208 | 233980 pickup | 227208 | 233980 agent | 227201 | 233840 insur | 227201 | 233840 girl | 227201 | 233934 scout | 227201 | 233934 drill | 227200 | 233986 power | 227200 | 233986 ocean | 227187 | 234211 case | 227173 | 233983 crank | 227173 | 233983 squid | 227169 | 234056 senat | 227167 | 234147 fraction | 227161 | 234065 custom | 227152 | 234128 burglar | 227148 | 234014 grizzli | 227133 | 233955 wheel | 227122 | 233813 asteroid | 227108 | 233928 anomali | 227106 | 234156 acceler | 227103 | 233428 particl | 227103 | 233428 saw | 227082 | 233934 chain | 227082 | 233934 reactor | 227035 | 234061 wedg | 227033 | 234143 photon | 227029 | 234017 deficit | 227029 | 234102 vacuum | 227021 | 233760 cleaner | 227021 | 233760 cashier | 227010 | 233858 scyth | 227001 | 233928 cloud | 226981 | 233569 format | 226981 | 233569 tornado | 226968 | 234058 grand | 226936 | 233730 piano | 226936 | 233730 tripod | 226930 | 233755 tomato | 226928 | 233915 sandwich | 226923 | 233786 earring | 226912 | 233665 train | 226912 | 233712 freight | 226912 | 233712 skyscrap | 226901 | 233755 abstract | 226890 | 233658 mortician | 226883 | 233781 warranti | 226876 | 233935 atom | 226868 | 233467 hydrogen | 226868 | 233467 satellit | 226866 | 233680 corpor | 226858 | 233818 globul | 226853 | 233980 cow | 226832 | 233808 jersey | 226832 | 233808 salad | 226830 | 233400 buzzard | 226804 | 233825 lot | 226794 | 233643 park | 226794 | 233643 prime | 226793 | 233325 minist | 226793 | 233325 clot | 226780 | 233380 blood | 226780 | 233380 tuba | 226765 | 233575 tape | 226749 | 233388 record | 226749 | 233388 line | 226747 | 233574 dancer | 226747 | 233574 nation | 226736 | 233796 bartend | 226653 | 233422 hockey | 226645 | 233178 canyon | 226617 | 233699 ski | 226610 | 233451 lodg | 226610 | 233451 stovepip | 226608 | 233489 crane | 226590 | 233578 sand | 226572 | 233270 grain | 226572 | 233270 dust | 226570 | 233391 bunni | 226570 | 233391 lover | 226564 | 233628 fairi | 226554 | 233743 plaintiff | 226537 | 233563 wheelbarrow | 226520 | 233206 food | 226445 | 233228 stamp | 226445 | 233228 umbrella | 226380 | 233273 avocado | 226375 | 232942 oil | 226357 | 233266 filter | 226357 | 233266 financi | 220105 | 225116 complet | 162829 | 164065 ridicul | 162346 | 163592 handl | 162200 | 163390 singl | 162200 | 163390 single-handl | 162200 | 163390 greedili | 162123 | 163379 careless | 162009 | 163193 somewhat | 161979 | 163205 accur | 161975 | 163228 overwhelm | 161946 | 163107 usual | 161930 | 163158 ostens | 161826 | 163020 lazili | 161809 | 163133 slyli | 161803 | 163149 underhand | 161751 | 162955 non | 161585 | 162823 chalant | 161585 | 162823 non-chal | 161585 | 162823 seldom | 161525 | 162739 accident | 161511 | 162676 almost | 161508 | 162782 often | 161488 | 162733 bare | 161401 | 162659 eager | 161278 | 162513 wise | 161073 | 162341 inexor | 161042 | 162265 feverish | 160805 | 162020 thorough | 160611 | 161823 home | 154672 | 155766 return | 154672 | 155766 lost | 154655 | 155567 glori | 154655 | 155567 start | 154655 | 155567 reminisc | 154655 | 155567 rumin | 154577 | 155776 read | 154529 | 155642 magazin | 154529 | 155642 pray | 154478 | 155748 floor | 154396 | 155477 sweep | 154396 | 155477 nag | 154271 | 155259 feel | 154271 | 155259 remors | 154271 | 155259 procrastin | 154256 | 155371 wake | 154220 | 155397 sleep | 154217 | 155353 panic | 154189 | 155346 get | 154168 | 155253 drunk | 154168 | 155253 stink | 154168 | 155253 hibern | 154158 | 155358 die | 153973 | 155223 fli | 153943 | 155056 rage | 153943 | 155056 flagel | 153916 | 155067 self-flagel | 153916 | 155067 daydream | 153864 | 155043 medit | 153816 | 154935 ceas | 153735 | 154815 exist | 153735 | 154815 joy | 153672 | 154754 beam | 153672 | 154754 trembl | 153656 | 154799 loud | 153635 | 154665 hide | 153592 | 154797 break | 153559 | 154599 coffe | 153559 | 154599 earn | 153538 | 154540 mile | 153538 | 154540 flier | 153538 | 154540 frequent | 153538 | 154540 leav | 153535 | 154730 rejoic | 153226 | 154412 sell | 147231 | 148103 plan | 147046 | 147809 escap | 147046 | 147809 throw | 146973 | 147764 negoti | 146905 | 147704 prenupti | 146905 | 147704 agreement | 146905 | 147704 card | 146892 | 147731 trade | 146892 | 147731 basebal | 146892 | 147731 oper | 146888 | 147787 small | 146888 | 147787 stand | 146888 | 147787 drink | 146881 | 147727 night | 146881 | 147727 steal | 146835 | 147847 pencil | 146835 | 147847 seek | 146816 | 148029 figur | 146801 | 147908 write | 146736 | 147720 letter | 146736 | 147720 recogn | 146723 | 147823 truce | 146684 | 147630 eat | 146670 | 147874 compet | 146647 | 147760 buy | 146642 | 147522 gift | 146642 | 147522 expens | 146642 | 147522 big | 146626 | 147717 fan | 146626 | 147717 fall | 146597 | 147601 assist | 146587 | 147589 requir | 146587 | 147589 chang | 146542 | 147479 heart | 146542 | 147479 conquer | 146542 | 147695 money | 146481 | 147450 borrow | 146481 | 147450 ignor | 146475 | 147643 share | 146415 | 147404 shower | 146415 | 147404 fault | 146413 | 147361 subtl | 146413 | 147361 kind | 146402 | 147492 great | 146397 | 147367 upon | 146396 | 147366 honor | 146396 | 147366 bestow | 146396 | 147366 pee | 146394 | 147477 avoid | 146392 | 147388 contact | 146392 | 147388 pink | 146372 | 147347 slip | 146372 | 147347 aid | 146367 | 147225 teach | 146366 | 147516 sanit | 146361 | 147477 lice | 146360 | 147409 buri | 146360 | 147483 cold | 146357 | 147220 pour | 146357 | 147220 freez | 146357 | 147220 water | 146357 | 147220 sea | 146347 | 147217 deep | 146347 | 147217 fish | 146347 | 147217 organ | 146321 | 147476 grit | 146289 | 147227 satiat | 146251 | 147349 assimil | 146251 | 147377 tri | 146188 | 147200 seduc | 146188 | 147200 reach | 146132 | 147008 understand | 146132 | 147008 brainwash | 146068 | 147158 admir | 146050 | 147021 caricatur | 145989 | 147107 deriv | 145941 | 146790 pervers | 145941 | 146790 satisfact | 145941 | 146790 moral | 145854 | 146733 lectur | 145854 | 146733 befriend | 145799 | 146963 learn | 145758 | 146666 lesson | 145758 | 146666 play | 145738 | 146706 pinochl | 145738 | 146706 peek | 145698 | 146737 danc | 145555 | 146637 fashion | 78762 | 79203 muddi | 78750 | 79236 hypnot | 78747 | 79204 childlik | 78579 | 79002 loyal | 78575 | 79056 mysteri | 78554 | 79047 annoy | 78532 | 79032 slow | 78517 | 78996 twist | 78515 | 79016 unstabl | 78510 | 78945 feder | 78501 | 78967 rever | 78501 | 79008 wrinkl | 78495 | 78965 rude | 78495 | 78975 boil | 78493 | 78972 high | 78481 | 78940 paid | 78481 | 78940 geosynchron | 78478 | 78931 greasi | 78476 | 78961 cosmopolitan | 78459 | 78903 fat | 78438 | 78935 inciner | 78429 | 78896 dot | 78426 | 78864 polka | 78426 | 78864 polka-dot | 78426 | 78864 outer | 78415 | 78910 phoni | 78411 | 78895 pathet | 78405 | 78869 purpl | 78405 | 78895 frozen | 78403 | 78886 nearest | 78396 | 78879 statesmanlik | 78386 | 78830 dirt | 78376 | 78828 encrust | 78376 | 78828 dirt-encrust | 78376 | 78828 sur | 78371 | 78895 obsequi | 78369 | 78805 salti | 78360 | 78834 imagin | 78356 | 78808 south | 78325 | 78787 american | 78325 | 78787 load | 78318 | 78832 righteous | 78282 | 78760 fractur | 78281 | 78737 educ | 78278 | 78682 colleg | 78278 | 78682 college-educ | 78278 | 78682 mitochondri | 78269 | 78745 treacher | 78265 | 78697 spartan | 78252 | 78707 felin | 78244 | 78713 ravish | 78242 | 78765 patern | 78241 | 78701 psychot | 78238 | 78693 shabbi | 78228 | 78685 dreamlik | 78224 | 78642 loath | 78221 | 78653 self-loath | 78221 | 78653 world | 78203 | 78658 call | 78183 | 78610 so-cal | 78183 | 78610 radioact | 78182 | 78623 alleg | 78178 | 78664 cantanker | 78159 | 78620 makeshift | 78159 | 78648 gentl | 78156 | 78609 fri | 78143 | 78648 linguist | 78141 | 78586 overrip | 78134 | 78572 varig | 78132 | 78609 vapor | 78105 | 78548 impromptu | 78104 | 78569 actual | 78104 | 78592 self-actu | 78104 | 78592 frighten | 78100 | 78544 molten | 78100 | 78567 gratifi | 78098 | 78528 bur | 78094 | 78563 hairi | 78092 | 78563 foreign | 78083 | 78569 tatter | 78050 | 78518 frustrat | 78044 | 78474 stoic | 78036 | 78503 eurasian | 78033 | 78513 proverbi | 78031 | 78519 green | 78024 | 78450 skinni | 78023 | 78524 familiar | 78016 | 78477 optim | 78006 | 78483 bohemian | 78002 | 78500 overpr | 77983 | 78411 pompous | 77955 | 78460 difficult | 77938 | 78375 raspi | 77924 | 78461 soggi | 77912 | 78381 resplend | 77910 | 78351 blotch | 77910 | 78380 fals | 77908 | 78409 infect | 77907 | 78399 magnific | 77898 | 78350 snooti | 77897 | 78422 moron | 77886 | 78362 moldi | 77865 | 78370 precis | 77860 | 78331 crispi | 77856 | 78324 smelli | 77813 | 78279 tempor | 77810 | 78244 alaskan | 77808 | 78258 elus | 77775 | 78245 miser | 77772 | 78232 flatul | 77761 | 78201 orbit | 77723 | 78157 mean-spirit | 77660 | 78113 flabbi | 77649 | 78110 nuclear | 77609 | 78069 go | 15532 | 15545 made | 1 | 1 america | 1 | 1 If you need anything else, let me know! Kris Oleg Bartunov wrote: >Kris, > >could you post 'explain analyze' output ? >Also, could you disable index usage (set enable_indexscan=off) >and rerun search using tsearch2 ? > >also, could you run 'stat' function to see frequency distribution >of words. See http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes >for details. > >Oleg > > >>Hi all. I am doing some work with tsearch2 and am not sure what to >>expect out of it, performance wise. Here is my setup: >> >> Table "public.product" >> Column | Type | Modifiers >>-------------+----------+------------------------------------------------- >> description | text | >> product_id | integer | default nextval('product_product_id_seq'::text) >> vector | tsvector | >>Indexes: >> "vector_idx" gist (vector) >>Triggers: >> tsvectorupdate BEFORE INSERT OR UPDATE ON product FOR EACH ROW EXECUTE PROCEDURE tsearch2('vector', 'description') >> >>This table has 3,000,000 rows in it. Each description field has roughly 50 characters. There are fewer than ten thousanddistinct words in my 3,000,000 rows. The vector was filled using the description fields values. I ran a vacuumfull analyze before executing any of my queries. >> >>Here are a couple of tests I performed using the tsearch index and like; >> >>search_test=# select count(*) from product where vector @@ to_tsquery('oil'); >> count >>-------- >> 226357 >>(1 row) >> >>Time: 191056.230 ms >> >>search_test=# select count(*) from product where vector @@ to_tsquery('hydrogen'); >> count >>-------- >> 226868 >>(1 row) >> >>Time: 306411.957 ms >> >>search_test=# select count(*) from product where description like '% oil %'; >> count >>-------- >> 226357 >>(1 row) >> >>Time: 38426.851 ms >> >>search_test=# select count(*) from product where description like '% hydrogen %'; >> count >>-------- >> 226868 >>(1 row) >> >>Time: 38265.421 ms >> >> >>Both of the likes are using a sequential scan and both of the tsearch queries use the gist index. Did I miss a configurationparameter, are these queries incorrectly using tsearch2,or is this tsearch2's average performance? Thanks inadvance for the input! >> >>Kris >> >>
pgsql-admin by date: