Thread: Fuzzy cost comparison to eliminate redundant planning work


From: Tom Lane
I've been looking at the planner performance problem exhibited by
Eric Brown:
http://archives.postgresql.org/pgsql-performance/2004-03/msg00273.php

While a nine-way join is inherently going to take some time to plan
(if you don't constrain the search space with explicit JOIN syntax), it seemed to me
that this particular query was taking even longer than I'd expect.

I dug into it a little bit and found that the problem is that the
planner keeps many more paths than it really ought to.  For example,
at one random stopping point I found a list of paths that included
these three unordered paths:
  {NESTPATH  :startup_cost 0.01   :total_cost 1634.63  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.005  :total_cost 1642.65  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.00   :total_cost 1650.66  :pathkeys <>  }

There were also a lot of ordered paths that had similarly
trivially-different cost estimates.

The reason these are all being kept is that add_path() will keep paths
of the same ordering (same pathkeys) if they rank differently on startup
cost than they do on total cost.  Now that's a good rule in the
abstract, but a human looking at these numbers would certainly say that
it's being taken to extremes.  There isn't enough difference in these
startup costs to justify treating them as different ... especially given
the inherent inaccuracy of the estimates.

As an experiment, I made the attached patch that converts add_path's
cost comparisons to "fuzzy" comparisons --- two paths are considered to
have the same cost if they differ by less than 1% of the smaller
total_cost.  I found that this reduced the planning time of Eric's
query by about 40%, without changing the resulting plan.  On more
typical queries where the relations all have different statistics,
I doubt it would have as much effect, but I'm inclined to apply the
change anyway.

Comments?
        regards, tom lane

*** src/backend/optimizer/util/pathnode.c.orig    Tue Mar  2 11:42:20 2004
--- src/backend/optimizer/util/pathnode.c    Sun Mar 28 11:53:33 2004
***************
*** 85,90 ****
--- 85,160 ----
  }
  
  /*
+  * compare_fuzzy_path_costs
+  *      Return -1, 0, or +1 according as path1 is cheaper, the same cost,
+  *      or more expensive than path2 for the specified criterion.
+  *
+  * This differs from compare_path_costs in that we consider the costs the
+  * same if they agree to within a "fuzz factor".  This is used by add_path
+  * to avoid keeping both of a pair of paths that really have insignificantly
+  * different cost.
+  */
+ static int
+ compare_fuzzy_path_costs(Path *path1, Path *path2, CostSelector criterion)
+ {
+     Cost        fuzz;
+ 
+     /*
+      * The fuzz factor is set at one percent of the smaller total_cost, but
+      * not less than 0.01 cost units (just in case total cost is zero).
+      *
+      * XXX does this percentage need to be user-configurable?
+      */
+     fuzz = Min(path1->total_cost, path2->total_cost) * 0.01;
+     fuzz = Max(fuzz, 0.01);
+ 
+     if (criterion == STARTUP_COST)
+     {
+         if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
+         {
+             if (path1->startup_cost < path2->startup_cost)
+                 return -1;
+             else
+                 return +1;
+         }
+ 
+         /*
+          * If paths have the same startup cost (not at all unlikely),
+          * order them by total cost.
+          */
+         if (Abs(path1->total_cost - path2->total_cost) > fuzz)
+         {
+             if (path1->total_cost < path2->total_cost)
+                 return -1;
+             else
+                 return +1;
+         }
+     }
+     else
+     {
+         if (Abs(path1->total_cost - path2->total_cost) > fuzz)
+         {
+             if (path1->total_cost < path2->total_cost)
+                 return -1;
+             else
+                 return +1;
+         }
+ 
+         /*
+          * If paths have the same total cost, order them by startup cost.
+          */
+         if (Abs(path1->startup_cost - path2->startup_cost) > fuzz)
+         {
+             if (path1->startup_cost < path2->startup_cost)
+                 return -1;
+             else
+                 return +1;
+         }
+     }
+     return 0;
+ }
+ 
+ /*
   * compare_path_fractional_costs
   *      Return -1, 0, or +1 according as path1 is cheaper, the same cost,
   *      or more expensive than path2 for fetching the specified fraction
***************
*** 215,243 ****
          bool        remove_old = false; /* unless new proves superior */
          int         costcmp;
  
!         costcmp = compare_path_costs(new_path, old_path, TOTAL_COST);
  
          /*
           * If the two paths compare differently for startup and total
           * cost, then we want to keep both, and we can skip the (much
           * slower) comparison of pathkeys.  If they compare the same,
           * proceed with the pathkeys comparison.  Note: this test relies
!          * on the fact that compare_path_costs will only return 0 if both
!          * costs are equal (and, therefore, there's no need to call it
!          * twice in that case).
           */
          if (costcmp == 0 ||
!             costcmp == compare_path_costs(new_path, old_path,
!                                           STARTUP_COST))
          {
              switch (compare_pathkeys(new_path->pathkeys, old_path->pathkeys))
              {
                  case PATHKEYS_EQUAL:
                      if (costcmp < 0)
                          remove_old = true;      /* new dominates old */
                      else
!                         accept_new = false;     /* old equals or dominates
                                                   * new */
                      break;
                  case PATHKEYS_BETTER1:
                      if (costcmp <= 0)
--- 285,331 ----
          bool        remove_old = false; /* unless new proves superior */
          int         costcmp;
  
!         /*
!          * As of Postgres 7.5, we use fuzzy cost comparison to avoid wasting
!          * cycles keeping paths that are really not significantly different
!          * in cost.
!          */
!         costcmp = compare_fuzzy_path_costs(new_path, old_path, TOTAL_COST);
  
          /*
           * If the two paths compare differently for startup and total
           * cost, then we want to keep both, and we can skip the (much
           * slower) comparison of pathkeys.  If they compare the same,
           * proceed with the pathkeys comparison.  Note: this test relies
!          * on the fact that compare_fuzzy_path_costs will only return 0 if
!          * both costs are effectively equal (and, therefore, there's no need
!          * to call it twice in that case).
           */
          if (costcmp == 0 ||
!             costcmp == compare_fuzzy_path_costs(new_path, old_path,
!                                                 STARTUP_COST))
          {
              switch (compare_pathkeys(new_path->pathkeys, old_path->pathkeys))
              {
                  case PATHKEYS_EQUAL:
                      if (costcmp < 0)
                          remove_old = true;      /* new dominates old */
+                     else if (costcmp > 0)
+                         accept_new = false;     /* old dominates new */
                      else
!                     {
!                         /*
!                          * Same pathkeys, and fuzzily the same cost, so
!                          * keep just one --- but we'll do an exact cost
!                          * comparison to decide which.
!                          */
!                         if (compare_path_costs(new_path, old_path,
!                                                TOTAL_COST) < 0)
!                             remove_old = true;  /* new dominates old */
!                         else
!                             accept_new = false; /* old equals or dominates
                                                   * new */
+                     }
                      break;
                  case PATHKEYS_BETTER1:
                      if (costcmp <= 0)

Re: Fuzzy cost comparison to eliminate redundant planning work

From: Mike Mascari
Tom Lane wrote:
> I've been looking at the planner performance problem exhibited by
> Eric Brown:
> http://archives.postgresql.org/pgsql-performance/2004-03/msg00273.php
> 
> While a nine-way join is inherently going to take some time to plan
> (if you don't constrain the search space with explicit JOIN syntax), it seemed to me
> that this particular query was taking even longer than I'd expect.

...
> I found that this reduced the planning time of Eric's
> query by about 40%, without changing the resulting plan.

More great news, as always. IIRC you recently bumped the default 
GEQO threshold from eleven to twelve. With your new fuzzy comparison 
patch is twelve still the appropriate number? Or does the fuzzy 
comparison scale all planning time down and therefore the default 
threshold should remain where it is?

Mike Mascari





Re: Fuzzy cost comparison to eliminate redundant planning work

From: Tom Lane
Mike Mascari <mascarm@mascari.com> writes:
> Tom Lane wrote:
>> I found that this reduced the planning time of Eric's
>> query by about 40%, without changing the resulting plan.

> More great news, as always. IIRC you recently bumped the default 
> GEQO threshold from eleven to twelve. With your new fuzzy comparison 
> patch is twelve still the appropriate number?

I'm not sure that this change will make much difference at all to
"typical" queries.  Eric's query is something of an outlier because it
is a nine-way self-join, which means that all the base tables have
exactly the same statistics and so there are a lot more join paths with
identical cost estimates than you'd normally expect.

However, the paths being gotten rid of by the patch aren't quite
identical in cost, so maybe there will be some effect for more typical
queries.  If you have some expensive-to-plan queries, please do try it
and report back.  AFAIK the given patch should drop into 7.4 or even
earlier without much problem.
        regards, tom lane


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Joe Conway
Tom Lane wrote:
> As an experiment, I made the attached patch that converts add_path's
> cost comparisons to "fuzzy" comparisons --- two paths are considered to
> have the same cost if they differ by less than 1% of the smaller
> total_cost.  I found that this reduced the planning time of Eric's
> query by about 40%, without changing the resulting plan.  On more
> typical queries where the relations all have different statistics,
> I doubt it would have as much effect, but I'm inclined to apply the
> change anyway.

It looks like a great idea to me...
> +      * XXX does this percentage need to be user-configurable?

...but I think the answer to the above is yes.

Joe


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Tom Lane
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> +      * XXX does this percentage need to be user-configurable?

> ...but I think the answer to the above is yes.

I'd want to see some evidence that there's use in twiddling it ...
in my initial test, setting it to 10% instead of 1% didn't seem to
do much.
        regards, tom lane


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Neil Conway
On 28-Mar-04, at 8:32 PM, Joe Conway wrote:
> It looks like a great idea to me...
>
> > +      * XXX does this percentage need to be user-configurable?
>
> ...but I think the answer to the above is yes.

Is this really a parameter that we can expect administrators to be able 
to understand and tune very effectively?

-Neil



Re: Fuzzy cost comparison to eliminate redundant planning work

From: Joe Conway
Neil Conway wrote:
> On 28-Mar-04, at 8:32 PM, Joe Conway wrote:
>> > +      * XXX does this percentage need to be user-configurable?
>>
>> ...but I think the answer to the above is yes.
> 
> Is this really a parameter that we can expect administrators to be able 
> to understand and tune very effectively?

I think so. For one thing, if it were user-controllable, then it could 
be turned off for backward compatible behavior (i.e. 0 means no fuzz).

Joe



Re: Fuzzy cost comparison to eliminate redundant planning work

From: Bruce Momjian
Tom Lane wrote:
> The reason these are all being kept is that add_path() will keep paths
> of the same ordering (same pathkeys) if they rank differently on startup
> cost than they do on total cost.  Now that's a good rule in the
> abstract, but a human looking at these numbers would certainly say that
> it's being taken to extremes.  There isn't enough difference in these
> startup costs to justify treating them as different ... especially given
> the inherent inaccuracy of the estimates.
> 
> As an experiment, I made the attached patch that converts add_path's
> cost comparisons to "fuzzy" comparisons --- two paths are considered to
> have the same cost if they differ by less than 1% of the smaller
> total_cost.  I found that this reduced the planning time of Eric's
> query by about 40%, without changing the resulting plan.  On more
> typical queries where the relations all have different statistics,
> I doubt it would have as much effect, but I'm inclined to apply the
> change anyway.

I like the 1% idea.  It is sort of like a cheap GEQO --- throwing out
plans that have very similar costs.  We still test all plans (unlike
GEQO) but we trim more aggressively than we do now.

Do we know in the optimizer whether we will be needing cheapest startup
or not?  Could we eliminate cheapest startup plans when they aren't
needed?

In the pre-cheapest path code, we really only kept cheapest cost, plus
plans with pathkeys.  Now that we have cheapest, do we need to rethink
what plans we keep?

Let's look at the plans you listed:
  {NESTPATH  :startup_cost 0.01  :total_cost 1634.63  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.005  :total_cost 1642.65  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.00  :total_cost 1650.66  :pathkeys <>  }

They are listed in order of increasing total cost, and decreasing
startup cost.  Why is the middle one kept?

You state:

> The reason these are all being kept is that add_path() will keep paths
> of the same ordering (same pathkeys) if they rank differently on startup
> cost than they do on total cost.  Now that's a good rule in the

Is the middle one kept because the optimizer has to mix the startup plus
some percentage of the total cost for queries using LIMIT?

Should we require that kept plans have a greater difference in
startup cost relative to total cost?  For example:
  {NESTPATH  :startup_cost 0.01  :total_cost 1634.63  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.005  :total_cost 1642.65  :pathkeys <>  }

In this case, the total cost difference is +8, but the startup cost
difference is only -0.005.

Or looking at these plans:

  {NESTPATH  :startup_cost 0.01  :total_cost 1634.63  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.005  :total_cost 1642.65  :pathkeys <>  }
  {NESTPATH  :startup_cost 0.00  :total_cost 1650.66  :pathkeys <>  }


If you are selecting the whole table, you pick the first plan, and
if you are selecting only one row, you pick the last plan.  Looking at
some computations:

  1634.63 * 0.0001 + 0.01   =  0.173463
  1642.65 * 0.0001 + 0.005  =  0.169265
  1650.66 * 0.0001          =  0.165066

Pulling 0.01% of the table has us preferring the last plan, while

  1634.63 * 0.001 + 0.01    =  1.64463
  1642.65 * 0.001 + 0.005   =  1.64765
  1650.66 * 0.001           =  1.65066

has us preferring the first plan, so the useful range of the second
plan is between 0.01% and 0.1% of the table.  That's a pretty tiny
area of usefulness for the second plan.

However, I can't think of how to measure the range of percentages over
which a plan is useful.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Tom Lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Do we know in the optimizer whether we will be needing cheapest startup
> or not?

No.  Higher levels might want either.

> Is the middle one kept because the optimizer has to mix the startup plus
> some percentage of the total cost for queries using LIMIT?

Right.  There are potentially some ranges of LIMIT for which it could
win, I believe.  Maybe with some math you could prove there is no range
in which the other two don't dominate it, but I suspect the extra logic
would slow down add_path more than it's worth.
        regards, tom lane


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Do we know in the optimizer whether we will be needing cheapest startup
> > or not?
> 
> No.  Higher levels might want either.
> 
> > Is the middle one kept because the optimizer has to mix the startup plus
> > some percentage of the total cost for queries using LIMIT?
> 
> Right.  There are potentially some ranges of LIMIT for which it could
> win, I believe.  Maybe with some math you could prove there is no range
> in which the other two don't dominate it, but I suspect the extra logic
> would slow down add_path more than it's worth.

What if we take the total cost and divide it by the number of rows
returned --- then we have a per-row cost for each plan.  Then we subtract
the two, and that difference compared to the difference in startup costs
tells us how many rows could potentially use this plan.



Re: Fuzzy cost comparison to eliminate redundant planning work

From: Tom Lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Right.  There are potentially some ranges of LIMIT for which it could
>> win, I believe.

> What if we take the total cost and divide it by the number of rows
> returned --- then we have a per-row cost for each plan.  Then we subtract
> the two, and that difference compared to the difference in startup costs
> tells us how many rows could potentially use this plan.

You're missing the point.  We are comparing plans where one has a higher
start cost and a lower total cost than the other.  For example (pardon
the crummy ASCII art):
  [ASCII chart: estimated cost (Y-axis) plotted against the percentage of
   tuples fetched (X-axis); the cost lines for A and B cross, so each plan
   is cheaper over part of the range]

where the X-axis is the percentage of tuples we expect to fetch, and the
Y-axis is the estimated cost.  We have to keep both plans since upper
levels might want to fetch different percentages depending on what plan
is being considered up there; so either A or B might be the thing to use.

Now if we consider *three* plans:
  [ASCII chart: same axes, now with three cost lines; B lies above A at
   lower percentages and above C at higher percentages, so B is never the
   cheapest]

Here, plan B loses everywhere: to A at lower percentages and to C at
higher ones.  But I don't see how to eliminate B on the basis of
one-at-a-time comparisons.  It seems like getting rid of B would require
turning add_path into an O(N^3) instead of O(N^2) algorithm ... which
would almost certainly eat more cycles than it'd save.
        regards, tom lane


Re: Fuzzy cost comparison to eliminate redundant planning work

From: Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> Right.  There are potentially some ranges of LIMIT for which it could
> >> win, I believe.
> 
> > What if we take the total cost and divide it by the number of rows
> > returned --- then we have a per-row cost for each plan.  Then we subtract
> > the two, and that difference compared to the difference in startup costs
> > tells us how many rows could potentially use this plan.
> 
> Here, plan B loses everywhere: to A at lower percentages and to C at
> higher ones.  But I don't see how to eliminate B on the basis of
> one-at-a-time comparisons.  It seems like getting rid of B would require
> turning add_path into an O(N^3) instead of O(N^2) algorithm ... which
> would almost certainly eat more cycles than it'd save.

Nice charts.  :-)

I agree we don't want anything that is O(high), but I was thinking of
something that would be more aggressive than 1%, which works well for
lots of self-joins, but I am not sure how well for other cases.
Consider these plans that return 10 rows:
           total    startup    total per row retrieved
  plan1      1        1              .1
  plan2      5        0.5            .5
  plan3     10        0              1

Now, the difference between plan1's and plan2's totals is 500%, yet
plan2 is a useless plan.  If you want to retrieve one row, you pick
plan3; if you want 2 or more rows, you pick plan1.

If one plan's per-row total cost plus startup cost is less than
another's, we can throw the other away.  In fact, when we go to check
for the cheapest startup cost to keep, do we at least assume we have
one row fetched?

What if instead of doing total cost 1% difference, we compute
total-per-row + startup being 1% different?  Does that catch more
similar cost plans?  Seems it would.

In your example, how many rows were returned?  I can see how this would
have handled that case.



Re: Fuzzy cost comparison to eliminate redundant planning work

From: Tom Lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I agree we don't want anything that is O(high), but I was thinking of
> something that would be more aggressive than 1%, which works well for
> lots of self-joins, but I am not sure how well for other cases.

That assumption is without foundation.  The particular case we are
looking at in Eric's example has a problem only because there is one
cpu_operator_cost more or less in the estimated startup costs.
I believe that the problem was actually created recently (7.4 or
possibly 7.3) by planner changes that account for expression evaluation
costs more completely than we used to do.  This is important when an
expression involves an expensive sub-select, but for typical cases it
simply results in very small deltas between startup costs of
otherwise-similar plans.  1% fuzz is plenty to fix that.

Before asserting that we need more flexibility, please point to some
real cases where it's needed.  Your argument depends on numbers pulled
out of the air that don't necessarily have anything to do with the
planner's actual behavior.

> What if instead of doing total cost 1% difference, we compute
> total-per-row + startup being 1% different?

Doesn't seem to me to have useful properties...
        regards, tom lane