Re: Bogus nestloop rows estimate in 8.4.7 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bogus nestloop rows estimate in 8.4.7
Date
Msg-id 12544.1338236586@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bogus nestloop rows estimate in 8.4.7  (Marti Raudsepp <marti@juffo.org>)
Responses Re: Bogus nestloop rows estimate in 8.4.7
List pgsql-hackers
Marti Raudsepp <marti@juffo.org> writes:
> On Mon, May 28, 2012 at 10:32 PM, Marti Raudsepp <marti@juffo.org> wrote:
>> There was a similar case in 9.0.4 with WHERE i=1, but that has been
>> fixed in 9.0.7

> Oh, it's been fixed in 9.0.7, but apparently not in 8.4.11; the empty
> parent tables are confusing the estimate:

Hmm ... what your test case seems to be exhibiting is related to this:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master [f3ff0433a] 2011-07-14 17:30:57 -0400
Branch: REL9_1_STABLE Release: REL9_1_0 [cf8245285] 2011-07-14 17:31:12 -0400
Branch: REL9_0_STABLE Release: REL9_0_5 [0dd46a776] 2011-07-14 17:31:25 -0400
   In planner, don't assume that empty parent tables aren't really empty.      There's a heuristic in
estimate_rel_size()to clamp the minimum size   estimate for a table to 10 pages, unless we can see that vacuum or
analyze  has been run (and set relpages to something nonzero, so this will always   happen for a table that's actually
empty). However, it would be better   not to do this for inheritance parent tables, which very commonly are   really
emptyand can be expected to stay that way.  Per discussion of a   recent pgsql-performance report from Anish Kejariwal.
Also prevent it   from happening for indexes (although this is more in the nature of   documentation, since CREATE
INDEXnormally initializes relpages to   something nonzero anyway).      Back-patch to 9.0, because the ability to
collectstatistics across a   whole inheritance tree has improved the planner's estimates to the point   where this
relativelysmall error makes a significant difference.  In the   referenced report, merge or hash joins were incorrectly
estimatedas   cheaper than a nestloop with inner indexscan on the inherited table.   That was less likely before 9.0
becausethe lack of inherited stats would   have resulted in a default (and rather pessimistic) estimate of the cost
ofa merge or hash join.
 

However, the error in your original example is far too large to be
explained by that, so I think it was tripping over something different.
When I run your test case in 8.4, I get
Nested Loop  (cost=0.00..1276.12 rows=28168 width=4)  Join Filter: (public.a_parent.i = public.b_parent.i)  ->  Append
(cost=0.00..62.57rows=14 width=4)        ->  Seq Scan on a_parent  (cost=0.00..46.00 rows=12 width=4)
Filter:((i >= 1) AND (i <= 2))        ->  Index Scan using a1_i_idx on a_child1 a_parent  (cost=0.00..8.29 rows=1
width=4)... ->  Append  (cost=0.00..56.63 rows=2404 width=4)        ->  Seq Scan on b_parent  (cost=0.00..34.00
rows=2400width=4)        ->  Index Scan using b1_i_idx on b_child1 b_parent  (cost=0.00..11.34 rows=2 width=4)...
 

and that join size estimate is not too out of line if you accept the
admittedly-bogus numbers for the appendrel sizes.  There seems to be
something else going on in your original example.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: pg_basebackup --xlog compatibility break
Next
From: Magnus Hagander
Date:
Subject: Re: pg_basebackup --xlog compatibility break