Re: On disable_cost - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: On disable_cost
Msg-id: 2930629.1714841852@sss.pgh.pa.us
In response to: Re: On disable_cost (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-hackers

David Rowley <dgrowleyml@gmail.com> writes:
> I don't think you'd need to wait longer than where we do set_cheapest
> and find no paths to find out that there's going to be a problem.

At a base relation, yes, but that doesn't work for joins: it may be
that a particular join cannot be formed, yet other join sequences
will work.  We have that all the time from outer-join ordering
restrictions, never mind enable_xxxjoin flags.  So I'm not sure
that we can usefully declare early failure for joins.

> I think the int Path.disabledness idea is worth coding up to try it.
> I imagine that a Path will incur the maximum of its subpaths'
> disabledness values; then add_path() just needs to prefer Paths
> with lower disabledness.

I would think sum not maximum, but that's a detail.
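
To make the sum-versus-maximum point concrete, here is a minimal sketch of
the idea. It is not code from this thread or from PostgreSQL: SketchPath,
sketch_join() and sketch_path_is_better() are hypothetical names, and the
real Path node and add_path() carry far more state (pathkeys, startup
cost, parameterization). The sketch assumes disabledness is a plain
counter that is summed up the tree and compared before cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the planner's Path node. */
typedef struct SketchPath
{
    int     disabledness;   /* number of disabled plan nodes in this subtree */
    double  total_cost;     /* ordinary cost estimate, with no penalty added */
} SketchPath;

/*
 * Build a join path from two inputs.  Disabledness is the SUM of the
 * inputs' counts, plus one if the join method itself is disabled.
 */
static SketchPath
sketch_join(const SketchPath *outer, const SketchPath *inner,
            bool join_type_disabled, double join_cost)
{
    SketchPath  result;

    assert(outer != NULL && inner != NULL);
    result.disabledness = outer->disabledness + inner->disabledness +
        (join_type_disabled ? 1 : 0);
    result.total_cost = outer->total_cost + inner->total_cost + join_cost;
    return result;
}

/*
 * Return true if path a should be preferred over path b: fewer disabled
 * nodes always wins, and cost only breaks ties.
 */
static bool
sketch_path_is_better(const SketchPath *a, const SketchPath *b)
{
    if (a->disabledness != b->disabledness)
        return a->disabledness < b->disabledness;
    return a->total_cost < b->total_cost;
}
```

With a sum, a plan containing three disabled nodes compares as worse than
a plan containing one. With a maximum, the two would tie on disabledness
and be separated only by cost.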

> That doesn't get you the benefits of fewer CPU cycles, but where did
> that come from as a motive to change this? There's no shortage of
> other ways to make the planner faster if that's an issue.

The concern was to not *add* CPU cycles in order to make this area
better.  But I do tend to agree that we've exhausted all the other
options.

BTW, I looked through costsize.c just now to see exactly what we are
using disable_cost for, and it seemed like a majority of the cases are
just wrong.  Where possible, we should implement a plan-type-disable
flag by not generating the associated Path in the first place, not by
applying disable_cost to it.  But it looks like a lot of people have
erroneously copied the wrong logic.  I would say that only these plan
types should use the disable_cost method:

    seqscan
    nestloop join
    sort

as those are the only ones where we risk not being able to make a
plan at all for lack of other alternatives.

There is also some weirdness around needing to force use of tidscan
if we have WHERE CURRENT OF.  But perhaps a different hack could be
used for that.

We also have this for hashjoin:

     * If the bucket holding the inner MCV would exceed hash_mem, we don't
     * want to hash unless there is really no other alternative, so apply
     * disable_cost.

I'm content to leave that be, if we can't remove disable_cost
entirely.

What I'm wondering at this point is whether we need to trouble with
implementing the separate-disabledness-count method, if we trim back
the number of places using disable_cost to the absolute minimum.

            regards, tom lane


