Re: Inaccurate Rows estimate for "Bitmap And" causes Planner tochoose wrong join - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: Inaccurate Rows estimate for "Bitmap And" causes Planner tochoose wrong join
Date
Msg-id 20200506163457.GM28974@telsasoft.com
Whole thread Raw
In response to Inaccurate Rows estimate for "Bitmap And" causes Planner to choosewrong join  (Steve Pritchard <steve.pritchard@bto.org>)
List pgsql-performance
On Wed, May 06, 2020 at 05:19:48PM +0100, Steve Pritchard wrote:
> Version: Postgres 9.6.3 production system (but also tested on Postgres 12)
> 
> For my query the Planner is sometimes choosing an execution plan that uses
> "Bitmap And" (depending on the parameters):
> 
> The Planner then carries this estimate of "1 row" through the rest of the
> query (which is quite complex), and then makes poor choices about joins.
> e.g. uses "Nested Loop Left Join" because it's only expecting one row,
> whereas in practice it has to do 15636 loops which is very slow.

> Note that in cases where the Planner selects a single Index Scan for this
> query (with different parameters), the Planner makes an accurate estimate
> of the number of rows and then makes sensible selections of joins (i.e.
> quick).
> i.e. the issue seems to be with the "Bitmap And".
> 
> I don't have an index with both user_id & loc_id, as this is one of several
> different combinations that can arise (it would require quite a few indexes
> to cover all the possible combinations). However if I did have such an
> index, the planner would presumably be able to use the statistics for
> user_id and loc_id to estimate the number of rows.
> 
> So why can't it make an accurate estimate of the rows with a "Bitmap And" &
> " Bitmap Heap Scan"? (as above)

It probably *has* statistics for user_id and loc_id, but doesn't have stats for
(user_id,loc_id).

Presumbly the conditions are partially redundant, so loc_id => user_id
(strictly implies or just correlated) or the other way around.

In pg10+ you can use "CREATE STATISTICS (dependencies)" to improve that.
https://www.postgresql.org/docs/devel/sql-createstatistics.html

Otherwise you can use the "CREATE TYPE / CREATE INDEX" trick Tomas described here:
https://www.postgresql.org/message-id/20190424003633.ruvhbv5ro3fawo67%40development

-- 
Justin



pgsql-performance by date:

Previous
From: Steve Pritchard
Date:
Subject: Inaccurate Rows estimate for "Bitmap And" causes Planner to choosewrong join
Next
From: Jeff Janes
Date:
Subject: Re: Inaccurate Rows estimate for "Bitmap And" causes Planner tochoose wrong join