Thread: FTS performance issue probably due to wrong planner estimate of detoasting
Hi, I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE I have to check again, if I'm doing something wrong but I'm pretty sure it has to do with de-toasting and (wrong?) cost estimations. I've seen some comments here saying that estimating detoasting costs (especially with operator "@@" and GIN index) is an open issue (since years?). And I found a nice blog here [1] which uses 9.2/9.1 and proposes to disable sequential table scan (SET enable_seqscan off;). But this is no option for me since other queries still need seqscan. Can anyone tell me if is on some agenda here (e.g. as an open item for >9.2)? Yours, Stefan [1] http://palominodb.com/blog/2012/03/06/considerations-about-text-searchs-big-fields-and-planner-costs
Re: FTS performance issue probably due to wrong planner estimate of detoasting
From
Pavel Stehule
Date:
Hello you can try to wrap searching to immutable function and use following trick http://postgres.cz/wiki/PostgreSQL_SQL_Tricks#Using_IMMUTABLE_functions_as_hints_for_the_optimizer Regards Pavel Stehule 2013/2/8 Stefan Keller <sfkeller@gmail.com>: > Hi, > > I have problems with the performance of FTS in a query like this: > > SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ > plainto_tsquery('english', 'good'); > > It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). > The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE > I have to check again, if I'm doing something wrong but I'm pretty > sure it has to do with de-toasting and (wrong?) cost estimations. > > I've seen some comments here saying that estimating detoasting costs > (especially with operator "@@" and GIN index) is an open issue (since > years?). > And I found a nice blog here [1] which uses 9.2/9.1 and proposes to > disable sequential table scan (SET enable_seqscan off;). But this is > no option for me since other queries still need seqscan. > Can anyone tell me if is on some agenda here (e.g. as an open item for >9.2)? > > Yours, Stefan > > [1] http://palominodb.com/blog/2012/03/06/considerations-about-text-searchs-big-fields-and-planner-costs > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Re: FTS performance issue probably due to wrong planner estimate of detoasting
From
Jesper Krogh
Date:
On 08/02/13 01:52, Stefan Keller wrote: > Hi, > > I have problems with the performance of FTS in a query like this: > > SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ > plainto_tsquery('english', 'good'); > > It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). > The planner obviously always chooses table scan: http://explain.depesz.com/s/EEE > I have to check again, if I'm doing something wrong but I'm pretty > sure it has to do with de-toasting and (wrong?) cost estimations. If you havent done it .. bump up statistics target on the column and re-analyze, see what that gives. I have also been playing with the cost-numbers in order to get it to favour an index-scan more often. That is lowering random_page_cost to be close to seq_page_cost, dependent on your system, the amount of memory, etc, then this can have negative side-effects on non-gin-queries. -- Jesper
Re: FTS performance issue probably due to wrong planner estimate of detoasting
From
Stefan Keller
Date:
Hi Jesper and Pavel Thx for your hints. I'm rather reluctant in tuning with unwanted side effects, We'll see. I have to setup my system and db again before I can try out your tricks. Yours, Stefan 2013/2/8 Jesper Krogh <jesper@krogh.cc>: > On 08/02/13 01:52, Stefan Keller wrote: >> >> Hi, >> >> I have problems with the performance of FTS in a query like this: >> >> SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ >> plainto_tsquery('english', 'good'); >> >> It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB >> zipped). >> The planner obviously always chooses table scan: >> http://explain.depesz.com/s/EEE >> I have to check again, if I'm doing something wrong but I'm pretty >> sure it has to do with de-toasting and (wrong?) cost estimations. > > If you havent done it .. bump up statistics target on the column and > re-analyze, see what that gives. > > I have also been playing with the cost-numbers in order to get it to favour > an index-scan more often. That is lowering random_page_cost to be close to > seq_page_cost, dependent on your system, the amount of memory, etc, then > this can have negative side-effects on non-gin-queries. > > -- > Jesper