Thread: Bad estimation for "where field not in"
Hello,

We have a table with about 60M records, almost all of which are in one of
two statuses ('done', 'failed'), and a few of them, usually < 1000, in
various transient statuses. We also have a partial index covering the
transient items: where status not in ('done', 'failed').

Stats are about right for the status field:

  stavalues1   |     stanumbers1
---------------+---------------------
 {done,failed} | {0.541767,0.458233}

Accessing only the transient items leads to very bad row estimates and to
the choice of poor query plans, as the estimate is to read some 25% of the
items table (so joins are often performed against full scans of other
large tables instead of using indexes):

explain analyze select count(*) from items where status not in ('done', 'failed');

 Aggregate  (cost=2879903.86..2879903.87 rows=1 width=0) (actual time=0.460..0.461 rows=1 loops=1)
   ->  Bitmap Heap Scan on items  (cost=3674.23..2843184.08 rows=14687908 width=0) (actual time=0.393..0.453 rows=20 loops=1)
         Recheck Cond: (((status)::text <> 'done'::text) AND ((status)::text <> 'failed'::text))
         ->  Bitmap Index Scan on i_items_transient_status  (cost=0.00..2.26 rows=14687908 width=0) (actual time=0.381..0.381 rows=38 loops=1)

Looking at the estimated number of rows in the table (59164756) and the
estimate of the filtered rows (14687908), it looks like the planner is
treating "the status is neither done nor failed" as two independent
events:

=# select 59164555 * (1 - 0.541767) * (1 - 0.458233);
14687927.231665933605

while it is clear (at least in the original query specification, before
the condition is split into two ANDed parts) that the two events are
disjoint, so the probability should be calculated as
1 - (p(done) + p(failed)) instead of (1 - p(done)) * (1 - p(failed)).

Writing the query without the "not", listing the other statuses
explicitly, leads to a correct plan instead (sketched below).

Is this a known planner shortcoming or something unexpected, to be
escalated to -bugs? Server version is 9.0.1.

-- Daniele
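To make the workaround concrete: a positive form of the filter lets the
planner sum the known frequencies of the listed values. The transient
status names are not given in the thread, so 'pending' and 'processing'
below are assumptions for illustration only:

    -- Hypothetical rewrite: 'pending' and 'processing' stand in for the
    -- real transient status names, which the thread does not list.
    explain analyze select count(*) from items
        where status in ('pending', 'processing');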
On Thu, Mar 1, 2012 at 6:40 PM, Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:
> Is this a known planner shortcoming or something unexpected, to be
> escalated to -bugs? Server version is 9.0.1.

The relevant code is in the scalararraysel() function. It assumes that the
element-wise comparisons are completely independent, while the exact
opposite is true. It has been this way since
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=290166f93404d8759f4bf60ef1732c8ba9a52785
introduced it in version 8.2.

At least for the equality and inequality operators it would be good to
rework the logic to aggregate with s1 = s1 + s2 and s1 = s1 + s2 - 1
respectively.

-- 
Ants Aasma
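To put numbers on the proposed aggregation (this calculation is not from
the thread; it reuses the stats and the 59164756 row estimate quoted in
the first message, in the same style as Daniele's own check):

    -- Current behaviour: per-element selectivities treated as independent.
    select 59164756 * (1 - 0.541767) * (1 - 0.458233);          -- ~14.7M rows

    -- Proposed "<> ALL" aggregation (s1 = s1 + s2 - 1), which for disjoint
    -- values collapses to 1 - (p(done) + p(failed)):
    select 59164756 * ((1 - 0.541767) + (1 - 0.458233) - 1);    -- ~0 rows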
Ants Aasma <ants.aasma@eesti.ee> writes:
> On Thu, Mar 1, 2012 at 6:40 PM, Daniele Varrazzo
> <daniele.varrazzo@gmail.com> wrote:
>> Is this a known planner shortcoming or something unexpected, to be
>> escalated to -bugs? Server version is 9.0.1.

> The relevant code is in the scalararraysel() function. It assumes that
> the element-wise comparisons are completely independent, while the
> exact opposite is true. It has been this way since
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=290166f93404d8759f4bf60ef1732c8ba9a52785
> introduced it in version 8.2.

> At least for the equality and inequality operators it would be good to
> rework the logic to aggregate with s1 = s1 + s2 and s1 = s1 + s2 - 1
> respectively.

Yeah, I was about to make a similar proposal. In principle, when working
with a constant array, we could de-dup the array elements and then arrive
at an exact result ... but that seems like expensive overkill, and in
particular it would penalize intelligently written queries (which wouldn't
have dups in the array to start with) to benefit badly written ones.

So it seems like the right thing is for scalararraysel to (1) check
whether the operator is equality or inequality, and if so (2) just assume
the array elements are all different, so that the probabilities sum
directly. If the operator is something else, it's probably best to stick
with the existing logic. We could probably also protect ourselves a bit
more by noting whether the sum gives an impossible result (probability > 1
or < 0) and falling back to the normal calculation in that case.

			regards, tom lane
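A rough sketch of the proposed rule plus the sanity check Tom describes,
written as plain SQL arithmetic rather than actual planner code (the
column names here are made up for illustration):

    -- s_sum is the disjoint-sum estimate for the NOT IN filter, s_indep the
    -- current independence-based one; if the sum fell outside [0, 1] the
    -- idea is to fall back to the existing logic.
    select case when s_sum between 0 and 1 then s_sum else s_indep end
               as selectivity
      from (select (1 - 0.541767) + (1 - 0.458233) - 1 as s_sum,
                   (1 - 0.541767) * (1 - 0.458233)     as s_indep) as s;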