Re: planer picks a bad plan (seq-scan instead of index) when adding an additional join - Mailing list pgsql-general

From Tom Lane
Subject Re: planer picks a bad plan (seq-scan instead of index) when adding an additional join
Date
Msg-id 13288.1163119194@sss.pgh.pa.us
Whole thread Raw
In response to Re: planer picks a bad plan (seq-scan instead of index) when adding an additional join  ("Thomas H." <me@alternize.com>)
Responses Re: planer picks a bad plan (seq-scan instead of index) when adding an additional join
List pgsql-general
"Thomas H." <me@alternize.com> writes:
>> Um, what's the datatype of sm_info_ean and dvd_ean exactly?

> varchar(15) and varchar(14)

OK.  I was wondering if you'd tried to use the new contrib/isn code and
it was messing up the estimates somehow.  Seems like a red herring.

After looking more closely, I think the issue is that the size of the
dvds/movies join is estimated at 39900 rows when it's really only 2,
and this estimate discourages the planner from using a nestloop join of
that join against data_soundmedia.  We were thinking this was entirely
because of poor estimation of the ~~ operators, but I suddenly realize
that what's happening is that the size of the left join is being clamped
to be at least the size of its left input --- that is, the planner is
failing to distinguish JOIN/ON clauses (which can't suppress left-side
rows) from WHERE clauses (which can).  Per comment in
set_joinrel_size_estimates:

     * Basically, we multiply size of Cartesian product by selectivity.
     *
     * If we are doing an outer join, take that into account: the output must
     * be at least as large as the non-nullable input.  (Is there any chance
     * of being even smarter?)  (XXX this is not really right, because it
     * assumes all the restriction clauses are join clauses; we should figure
     * pushed-down clauses separately.)

This didn't matter a whole lot back when the planner couldn't reorder
outer joins, but now that it can, it's more important that the estimates
be accurate.

I'm not sure if this is feasible to fix before 8.2, but I'll take a
look.

            regards, tom lane

pgsql-general by date:

Previous
From: Cornelia Boenigk
Date:
Subject: zero values in statistic views
Next
From: Glen Parker
Date:
Subject: AutoVacuum on demand?