Home > mailing lists

Re: Full Text Search dictionary issues - Mailing list pgsql-general

From	Ivan Voras
Subject	Re: Full Text Search dictionary issues
Date	July 16, 2010 07:23:09
Msg-id	i1pbtr$pej$1@dough.gmane.org Whole thread Raw
In response to	Re: Full Text Search dictionary issues (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-general

Tree view

On 07/16/10 02:23, Tom Lane wrote:
> Howard Rogers <hjr@diznix.com> writes:
>> I have 10 million rows in a table, with full text index created on one
>> of the columns. I submit this query:
>
>> ims=# select count(*) from search_rm
>> ims-# where to_tsvector('english', textsearch)
>> ims-# @@ to_tsquery('english', 'woman & beach & ball');
>>  count
>> -------
>>    646
>> (1 row)
>> Time: 107.570 ms
>
>> ...and those are excellent times. But if I alter the query to read:
>
>> ims=# select count(*) from search_rm
>> where to_tsvector('english', textsearch)
>> @@ to_tsquery('english', 'woman & beach & ftx1');
>>  count
>> -------
>>  38343
>> (1 row)
>> Time: 640.985 ms
>
>> ...then, as you see, it slows the query down by a factor of about 6,
>
> ... um, but it increased the number of matching rows by a factor of
> almost 60.  I think your complaint of poor scaling is misplaced.

This is basically the same question I asked a few days ago and I think
the reason for this (mis)expectation of performance comes from expecting
tsearch2 to behave like external specialized indexers. In such products,
the search result can be returned simply from the index, which can scale
fairly well, but PostgreSQL actually has to lookup all the records
returned and this is where most time is spent.

pgsql-general by date:

From: Thomas Kellerer
Date: 16 July 2010, 05:36:19
Subject: pg_dump and --inserts / --column-inserts

From: Ivan Sergio Borgonovo
Date: 16 July 2010, 08:21:57
Subject: resource management, letting user A use no more than X resource (time, CPU, memory...)

Re: Full Text Search dictionary issues - Mailing list pgsql-general

Previous

Next