Re: Re: [GENERAL] FTS query,statistics and planner estimations… - Mailing list pgsql-general

From Pierre Ducroquet
Subject Re: Re: [GENERAL] FTS query,statistics and planner estimations…
Date
Msg-id 4390595.uOdNxpEeCk@laptop-pierred
Whole thread Raw
In response to Re: [GENERAL] Re: [GENERAL] FTS query, statistics and planner estimations…  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-general
On Wednesday, November 9, 2016 10:51:11 AM CET Pavel Stehule wrote:
> 2016-11-09 10:40 GMT+01:00 Francisco Olarte <folarte@peoplecall.com>:
> > Pierre:
> >
> > On Wed, Nov 9, 2016 at 10:22 AM, Pierre Ducroquet
> >
> > <pierre.ducroquet@people-doc.com> wrote:
> > > The query does a few joins «after» running a FTS query on a main table.
> > > The FTS query returns a few thousand rows, but the estimations are
> > > wrong,
> > > leading the optimizer to terrible plans compared to what should happen,
> >
> > and
> >
> > > thus creates a far higher execution time.
> >
> > ....
> >
> > > but the issue remain the same. The table contains about 295,000
> >
> > documents, and
> > ....
> >
> > >  Request                          | Estimated rows | Real rows
> > >
> > > ----------------------------------+----------------+-----------
> > > 'word1'                           | 38050          | 37500
> > > 'word1 word2'                     | 4680           | 32000
> > > 'word1 word2 word3'               | 270            | 12300
> > > 'word1 word2 word3 word4'         | 10             | 9930
> > > 'word1 word2 word3 word4 word5'   | 1              | 9930
> > >
> > > You can see that with more words in query, the estimation falls far
> >
> > behind
> >
> > > reality.
> >
> > I'm not really familiar with FTS but, doing a few division of
> > estimations and rows it seems it estimates as uncorrelated words, and
> > you real rows clearly indicate some of them are clearly correlated (
> > like w1/w2 and w4/s5, and partially w3/w45 ) and very common.
> >
> > > Is that a known limitation of the FTS indexing ? Am I missing something
> > > obvious, or a poor configuration ?
> >
> > Someone more familiar with it needed for that, but what I've found
> > several times is FTS does not mix too well with relational queries at
> > the optimizer level ( as FTS terms can have very diverse degrees of
> > correlation, which is very difficult to store in the statistics a
> > relational optimizer normally uses ).
>
> there is workaround - the FTS query can be wrapped to immutable function -
> then can be executed in planner time, and the estimations can be better
>
>
> http://postgres.cz/wiki/PostgreSQL_SQL_Tricks_II#Using_IMMUTABLE_functions_a
> s_hints_for_the_optimizer

My bad, I used the plainto_tsquery(configuration, query) function that is
immutable, so the planner knows what is being searched.

Attachment

pgsql-general by date:

Previous
From: Pierre Ducroquet
Date:
Subject: Re: FTS query, statistics and planner estimations…
Next
From: Pierre Ducroquet
Date:
Subject: Re: FTS query, statistics and planner estimations…