Re: Bogus nestloop rows estimate in 8.4.7 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Bogus nestloop rows estimate in 8.4.7
Date
Msg-id CA+TgmoagquAwVpk=mYUaLnQ27guYpz4naUp04_OqKuOz=PFFJw@mail.gmail.com
Whole thread Raw
In response to Re: Bogus nestloop rows estimate in 8.4.7  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bogus nestloop rows estimate in 8.4.7
List pgsql-hackers
On Mon, May 28, 2012 at 6:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marti Raudsepp <marti@juffo.org> writes:
>> On Mon, May 28, 2012 at 11:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> However, the error in your original example is far too large to be
>>> explained by that, so I think it was tripping over something different.
>
>> Good point. But I generated a bigger data set with the above test case
>> and it gets progressively worse with more rows and partitions. (The
>> original database has 2x4 billion rows in over 100 partitions)
>
>> Here's a bigger test case, 2GB total (will take a few minutes to
>> generate). It gives a total estimate of 3900158, even though the
>> Append nodes suggest 13x2406 rows.
>
> On reflection I think this is an artifact of the lack of
> inheritance-tree stats in 8.4.  The estimated size of the join does
> *not* come from the product of the two appendrel sizes shown in EXPLAIN,
> because the inner one is a inner indexscan using a parameter from the
> outer side (what we would now call a parameterized path).  Rather, the
> estimated size is join selectivity times outer relation size times inner
> relation size.  The outer relation size, after applying its restriction
> clause, is indeed only 13 rows, but the inner relation size is 60e6 rows
> because it has no restriction clause.  If we had an accurate join
> selectivity estimate that'd be fine, but for lack of any stats about the
> inheritance tree eqjoinsel just punts and returns DEFAULT_EQ_SEL, ie
> 0.005.  And that works out to your result.

Hmm, but isn't this a case of the left hand not knowing what the right
hand is doing?  I mean, somehow we have enough information to estimate
that the index scans on b{1,2,3} are going to produce 2 rows per
execution, but having figured that out (correctly) we then proceed to
ignore it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Uh, I change my mind about commit_delay + commit_siblings (sort of)
Next
From: Tom Lane
Date:
Subject: Re: Bogus nestloop rows estimate in 8.4.7