Re: FTI is really really slow; what am I doing wrong? - Mailing list pgsql-general
From | newsreader@mediaone.net |
---|---|
Subject | Re: FTI is really really slow; what am I doing wrong? |
Date | |
Msg-id | 20010822110300.A21186@dragon.universe Whole thread Raw |
In response to | FTI is really really slow; what am I doing wrong? ("Paul C." <ulive1x@hotmail.com>) |
List | pgsql-general |
Did you vacuum after populating the tables? If not you should do it On Wed, Aug 22, 2001 at 11:08:55AM -0400, Paul C. wrote: > Greetings, > I am trying to test out the performance of the contrib/fulltextindex > package and I am getting horrid performance results. > The Setup: > I created a simple table, ST (id SERIAL, body varchar(1024), which is to be > searched. I created the ST_FTI table, trigger and indices as per > instructions in the FTI readme and C file. To populate the table, I took a > flat text version of 'War and Peace' I found on the net, broke it up into > sentences and inserted each sentence into ST as a row. So I have about > 38,000 sentences and my ST_FTI table is about 2 million rows. > The Test: > There is exactly one sentence (row) that has the strings 'Newton' and > 'Kepler' in it. That is my target. For a straight select on ST: > select * from st where body ~* 'newton' and body ~* 'kepler'; > the cost is 1100.41 > BUT for an query using the FTI indices: > select s.* from st s, st_fti f1, st_fti f2 where f1.string > ~ '^kepler' and f2.string ~ '^newton' and s.oid = f1.id > and s.oid = f2.id; > the cost becomes a staggering 80628.92!!! The plans are pasted at the end > of this message. > Now, I have all the indices created (on id of st_fti, on string of st_fti > and on oid of st). I cannot figure out why this is so much worse than the > straight query. Indeed, the cost to look up a single string in the st_fti > table is way high: > select * from st_fti where string ~ '^kepler'; > costs 36703.40, AND its doing a Seq Scan on st_fti, even though an index > exists. > What am I doing wrong? Is it the sheer size of the st_fti table that is > causing problems? Any help would be greatly appreciated. > Thanks, > Paul C. > > FTI search > NOTICE: QUERY PLAN: > Merge Join (cost=80046.91..80628.92 rows=110 width=28) > -> Sort (cost=41827.54..41827.54 rows=19400 width=24) > -> Hash Join (cost=1992.80..40216.39 rows=19400 width=24) > -> Seq Scan on st_fti f2 (cost=0.00..36703.40 rows=19400 > width=4) > -> Hash (cost=929.94..929.94 rows=34094 width=20) > -> Seq Scan on st s (cost=0.00..929.94 rows=34094 > width=20) > -> Sort (cost=38219.37..38219.37 rows=19400 width=4) > -> Seq Scan on st_fti f1 (cost=0.00..36703.40 rows=19400 width=4) > EXPLAIN > > Plain search: > NOTICE: QUERY PLAN: > Seq Scan on st (cost=0.00..1100.41 rows=1 width=16) > EXPLAIN > > > _________________________________________________________________ > Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
pgsql-general by date: