Wrong number of rows estimation by the planner - Mailing list pgsql-performance

From Yonatan Ben-Nes
Subject Wrong number of rows estimation by the planner
Date
Msg-id 2d0127b80802080412o741fa5a1p3a3ae241e21574f7@mail.gmail.com
Whole thread Raw
List pgsql-performance
Hi all,

When I'm doing an explain analyze to a query of mine I notice that the number of estimated rows by the planner is a lot smaller then the actual number of rows, I'm afraid that this make my queries slower.

A query for example is:

EXPLAIN ANALYZE
SELECT product_id,product_name
FROM product
WHERE product_keywords_vector @@ plainto_tsquery('default', 'black') AND rank(product_keywords_vector, plainto_tsquery('default', 'black')) > 0.4 AND
product_status = TRUE AND product_type = 'comparison'
ORDER BY ((product_buy_number * 4) + product_view_number + 1) * rank(product_keywords_vector, plainto_tsquery('default', 'black')) DESC;
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=10543.67..10544.81 rows=455 width=297) (actual time=1098.188..1104.606 rows=22248 loops=1)
   Sort Key: (((((product_buy_number * 4) + product_view_number) + 1))::double precision * rank(product_keywords_vector, '''black'''::tsquery))
   ->  Bitmap Heap Scan on product  (cost=287.13..10523.59 rows=455 width=297) (actual time=50.496..1071.900 rows=22248 loops=1)
         Recheck Cond: (product_keywords_vector @@ '''black'''::tsquery)
         Filter: ((rank(product_keywords_vector, '''black'''::tsquery) > 0.4::double precision) AND product_status AND (product_type = 'comparison'::text))
         ->  Bitmap Index Scan on product_product_keywords_vector  (cost=0.00..287.02 rows=2688 width=0) (actual time=26.385..26.385 rows=72507 loops=1)
               Index Cond: (product_keywords_vector @@ '''black'''::tsquery)
 Total runtime: 1111.507 ms
(8 rows)

Here as I understand it, at the Bitmap Index Scan on product_product_keywords_vector the planner estimate that it will retrieve 2688 rows but it actually retrieve 72507 rows and later at the Bitmap Heap Scan on product it estimate 455 rows and retrieve 22248 rows.

I increased the statistics of the field which the product_product_keywords_vector index is built on by doing:
ALTER TABLE product ALTER COLUMN product_keywords_vector SET STATISTICS 1000;
ANALYZE;
REINDEX INDEX product_product_keywords_vector;

But it didn't change a thing.

Any ideas?

Thanks in advance,
Yonatan Ben-Nes




pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: index usage on arrays
Next
From: Mark Wong
Date:
Subject: Re: Benchmark Data requested