FTS query, statistics and planner estimations… - Mailing list pgsql-general

From Pierre Ducroquet
Subject FTS query, statistics and planner estimations…
Date
Msg-id 3135605.Z1OdYFN9ee@laptop-pierred
Whole thread Raw
Responses Re: [GENERAL] FTS query, statistics and planner estimations…  (Francisco Olarte <folarte@peoplecall.com>)
Re: FTS query, statistics and planner estimations…  (Artur Zakirov <a.zakirov@postgrespro.ru>)
List pgsql-general
Hello

I recently stumbled on a slow query in my database that showed an odd
behaviour related to the statistics of FTS queries.
The query does a few joins «after» running a FTS query on a main table.
The FTS query returns a few thousand rows, but the estimations are wrong,
leading the optimizer to terrible plans compared to what should happen, and
thus creates a far higher execution time.
I managed to isolate the odd behaviour in a single query, and I would like
your opinion about it.

I have modified the table name, columns and query to hide sensitive values,
but the issue remain the same. The table contains about 295,000 documents, and
all is running under PostgreSQL 9.5.

EXPLAIN ANALYZE
SELECT COUNT(*)
FROM documents
WHERE
    to_tsvector('french', subject || ' ' || body) @@ plainto_tsquery('XXX');

Of course, there is an index on to_tsvector('french', subject || ' ' || body).

That query gives me the following results for several values of XXX :

 Request                          | Estimated rows | Real rows
----------------------------------+----------------+-----------
'word1'                           | 38050          | 37500
'word1 word2'                     | 4680           | 32000
'word1 word2 word3'               | 270            | 12300
'word1 word2 word3 word4'         | 10             | 9930
'word1 word2 word3 word4 word5'   | 1              | 9930

You can see that with more words in query, the estimation falls far behind
reality.

Is that a known limitation of the FTS indexing ? Am I missing something
obvious, or a poor configuration ?

Thanks a lot

Attachment

pgsql-general by date:

Previous
From: Michael Paquier
Date:
Subject: Re: ENABLE ROW LEVEL SECURITY cause huge produce of checkpoints
Next
From: Francisco Olarte
Date:
Subject: Re: [GENERAL] FTS query, statistics and planner estimations…