Re: FTI is really really slow; what am I doing wrong? - Mailing list pgsql-general

From newsreader@mediaone.net
Subject Re: FTI is really really slow; what am I doing wrong?
Date
Msg-id 20010822110300.A21186@dragon.universe
Whole thread Raw
In response to FTI is really really slow; what am I doing wrong?  ("Paul C." <ulive1x@hotmail.com>)
List pgsql-general
Did you vacuum after
populating the tables?
If not you should do it


On Wed, Aug 22, 2001 at 11:08:55AM -0400, Paul C. wrote:
> Greetings,
>   I am trying to test out the performance of the contrib/fulltextindex
> package and I am getting horrid performance results.
> The Setup:
> I created a simple table, ST (id SERIAL, body varchar(1024), which is to be
> searched.  I created the ST_FTI table, trigger and indices as per
> instructions in the FTI readme and C file.  To populate the table, I took a
> flat text version of 'War and Peace' I found on the net, broke it up into
> sentences and inserted each sentence into ST as a row.  So I have about
> 38,000 sentences and my ST_FTI table is about 2 million rows.
> The Test:
> There is exactly one sentence (row) that has the strings 'Newton' and
> 'Kepler' in it.  That is my target.  For a straight select on ST:
>   select * from st where body ~* 'newton' and body ~* 'kepler';
> the cost is 1100.41
> BUT for an query using the FTI indices:
>   select s.* from st s, st_fti f1, st_fti f2 where f1.string
>     ~ '^kepler' and f2.string ~ '^newton' and s.oid = f1.id
>     and s.oid = f2.id;
> the cost becomes a staggering 80628.92!!!  The plans are pasted at the end
> of this message.
> Now, I have all the indices created (on id of st_fti, on string of st_fti
> and on oid of st).  I cannot figure out why this is so much worse than the
> straight query.  Indeed, the cost to look up a single string in the st_fti
> table is way high:
>   select * from st_fti where string ~ '^kepler';
> costs 36703.40, AND its doing a Seq Scan on st_fti, even though an index
> exists.
> What am I doing wrong?  Is it the sheer size of the st_fti table that is
> causing problems?  Any help would be greatly appreciated.
> Thanks,
> Paul C.
>
> FTI search
> NOTICE:  QUERY PLAN:
> Merge Join  (cost=80046.91..80628.92 rows=110 width=28)
>   ->  Sort  (cost=41827.54..41827.54 rows=19400 width=24)
>         ->  Hash Join  (cost=1992.80..40216.39 rows=19400 width=24)
>               ->  Seq Scan on st_fti f2  (cost=0.00..36703.40 rows=19400
> width=4)
>               ->  Hash  (cost=929.94..929.94 rows=34094 width=20)
>                     ->  Seq Scan on st s  (cost=0.00..929.94 rows=34094
> width=20)
>   ->  Sort  (cost=38219.37..38219.37 rows=19400 width=4)
>         ->  Seq Scan on st_fti f1  (cost=0.00..36703.40 rows=19400 width=4)
> EXPLAIN
>
> Plain search:
> NOTICE:  QUERY PLAN:
> Seq Scan on st  (cost=0.00..1100.41 rows=1 width=16)
> EXPLAIN
>
>
> _________________________________________________________________
> Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

pgsql-general by date:

Previous
From: Lamar Owen
Date:
Subject: Re: New RPMS ?
Next
From: "Mitch Vincent"
Date:
Subject: Re: maximum query length