Re: BUG #5885: Strange rows estimation for left join - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #5885: Strange rows estimation for left join
Date
Msg-id 23761.1297790287@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #5885: Strange rows estimation for left join  (Maxim Boguk <maxim.boguk@gmail.com>)
Responses Re: BUG #5885: Strange rows estimation for left join
List pgsql-bugs
Maxim Boguk <maxim.boguk@gmail.com> writes:
> Test case look like:

> create table "references" ( attr_id integer, reference integer,
> object_id integer );
> insert into "references" select *100**(random()),
> *100000**(random()^*10*), *1000000**(random()) from
> generate_series(*1*,*10000000*);
> create index xif01references on "references" ( reference, attr_id );
> create index xif02references on "references" ( object_id, attr_id, reference );

> analyze "references";

> explain select * from "references" rs left join "references" vm on
> vm.reference = rs.reference and vm.attr_id = *10* where rs.object_id =
> *1000*;

I don't believe there's actually anything very wrong here.  The
large-looking estimate for the join size is not out of line: if you try
different values for object_id you will find that some produce more rows
than that and some produce less.  If we had cross-column stats we could
maybe derive a better estimate, but as-is you're getting an estimate
that is probably about right on the average, depending on whether the
particular object_id matches to more common or less common reference
values.

The thing that looks funny is that the inner indexscan rowcount estimate
is so small, which is because that's being done on the assumption that
the passed-in rs.reference value is random.  It's not really --- it's
more likely to be one of the more common reference values --- which is
something that's correctly accounted for in the join size estimate but
not in the inner indexscan estimate.

            regards, tom lane

pgsql-bugs by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: BUG #5886: pgbench usage
Next
From: "Nick Baryshnikov"
Date:
Subject: BUG #5887: EXPLAIN returns more than 1000 rows when table is empty