Thread: The logic behind comparing generic vs. custom plan costs

The logic behind comparing generic vs. custom plan costs

From
Justin Blank
Date:
I've been looking into the way that postgres decides whether to use a
custom or generic plan. I believe I understand how, but I have
confused myself about the why.

In plancache.c, the code always executes a custom plan for the first
five iterations, and afterwards, it compares the average cost of the
custom plan (including the planning cost) to the cost of a generic
plan (not including planning), and choose the option with the lower
cost.

My idea had been that even if the custom plans average higher cost
than the generic plan, it is still worth considering custom plans. If
1 time in 5, the custom plan is 10% of the cost of the generic plan,
it doesn't matter if the average custom plan is worse, you should at
least determine the cost of the custom plan before deciding whether to
use it or the generic plan. Even if 80% of the time, you don't use the
custom plan, the payoff in the good case may be worth it.

However, after starting to implement that idea, I became confused, and
the underlying logic of comparing custom and generic plans stopped
making sense to me.

I can think of two different reasons the cost estimates for the
generic plan and custom plan can differ:

1. The two plans substantively differ: they use a different join
   order, different join strategy, or different indexes.
2. They do not fundamentally differ, but the custom plan makes
   different estimates of the selectivity of predicates because it
   knows their precise values.

For the sake of deciding between custom and generic plans, it seems
like only the first type of difference matters.

So my first question is whether there's any value in comparing the
cost estimates for custom plans that have the same access plan as the
generic plan.

Continuing to cases where the plans differ meaningfully, I'm not sure
that comparison makes sense either.

If the custom plan beats the generic plan, it seems reasonable to take
that as a real improvement.

However, if the custom plan has a higher cost than the generic plan,
this may or may not count against the custom plan. The planner used
the bind parameters of the specific query. So it may be that the
custom plan really is worse (the planner has made a mistake) but it
may also be that it considered the access plan of the generic plan,
and it was worse for those particular bind parameters. In this case,
the custom plan appears worse than the generic plan, but would ideally
be preferred.

Does my reasoning make sense? Or have I misunderstood something about
this process and how to reason about these cost estimates? I'm very
unsure, and worrying I've completely misunderstood, as if I'm right,
the logic in plancache.c doesn't make much sense.

Justin Blank



Re: The logic behind comparing generic vs. custom plan costs

From
"David G. Johnston"
Date:
On Sat, Mar 15, 2025 at 10:42 AM Justin Blank <justin.blank@gmail.com> wrote:
My idea had been that even if the custom plans average higher cost
than the generic plan, it is still worth considering custom plans. If
1 time in 5, the custom plan is 10% of the cost of the generic plan,
it doesn't matter if the average custom plan is worse, you should at
least determine the cost of the custom plan before deciding whether to
use it or the generic plan. Even if 80% of the time, you don't use the
custom plan, the payoff in the good case may be worth it.


In theory, while ignoring planning costs, a generic plan will never outperform a custom plan (the custom plan establishes a floor any custom plan could fall back on).  Thus any algorithm that requires computing the custom plan unconditionally amounts to simply setting the GUC to infinity (instead of 5).

The assumption is that at moderate to high reuse volumes it is quite probable that a generic plan will win or at least be acceptable in 999 in 1000 or more executions, not 4 in 5.

David J.

Re: The logic behind comparing generic vs. custom plan costs

From
Tom Lane
Date:
Justin Blank <justin.blank@gmail.com> writes:
> My idea had been that even if the custom plans average higher cost
> than the generic plan, it is still worth considering custom plans. If
> 1 time in 5, the custom plan is 10% of the cost of the generic plan,
> it doesn't matter if the average custom plan is worse, you should at
> least determine the cost of the custom plan before deciding whether to
> use it or the generic plan. Even if 80% of the time, you don't use the
> custom plan, the payoff in the good case may be worth it.

What you'd have to argue is that the average cost of the custom plans
(including the planning time) is less than the average cost of the
generic plan (including planning time to make a custom plan that's
then not used).  I'm not exactly convinced that that's any better
than "always use a custom plan" mode: once you've gone to the
trouble of making a custom plan, why not use it?

I'd be the first to agree that this logic is squishy as heck, and
I'd love to find a better way.  There are a couple of very nasty
problems to think about, though:

* The estimate of the cost of planning is pretty laughable.  It's
not a very detailed estimate, and even if it were there's a scale
problem in comparing it to estimated execution costs.  So I'm
hesitant to put any great weight on that estimate.

* Even comparing the estimated costs of custom and generic plans
is full of pitfalls, because they are based on different estimates
of the selectivity of WHERE clauses.  We've had reports for instance
of the cache picking a generic plan that was awful for the actual
parameter values, because the generic plan's cost was based on an
assumed-average case, while the custom plans' cost reflected reality
more accurately and hence were higher than the generic estimate even
though in reality they were the better choice.  (I don't recall
precise details, but I think it was something like the actual
parameter matching a very common value so that the number of rows
involved would be much greater than the generic plan assumed.)

Without some way to tame those problems, I'm not sure that marginal
tweaks in the plan-selection rule will accomplish a lot.

One idea perhaps could be to pay attention to how much the custom
plans' estimated rowcounts move around, and only switch to a generic
plan if there's not too much change and the generic plan's estimated
rowcounts are in the same range.  Or in other words, rather than
believing a cost comparison, what we'd look for is whether we get
basically the same plan in generic mode as in custom mode.  I'm
not sure how to mechanize that, though.

            regards, tom lane