Thread: Optimizer speed and GEQO (was: nested loops in joins)

From: Tom Lane
I have been looking at optimizer runtime using Charles Hornberger's
example case, and what I find is that the number of tables involved in
the query is a very inadequate predictor of the optimizer's runtime.
Thus, it's not surprising that we are getting poor results from using
number of tables as the GEQO threshold measure.

I started with Charles' test case as presented, then simplified it by
removing indexes from the database.  This didn't change the final
plan, which was a straight nested loop over sequential scans anyway
(because the tables were so small).  But it drastically reduces the
number of cases that the optimizer looks at.

Original test case: query involves seven tables with a total of twelve
indexes (six on the primary table and one on each other table).

7-index test case: I removed all but one of the indexes on the primary
table, leaving one index per table.

No-index test case: I removed all indexes.

It took a certain amount of patience to run these cases with profiling
turned on :-(, but here are the results:
                          7T+12I      7T+7I         7T

Runtime, sec                1800        457         59
equal() calls            585 mil    157 mil   21.7 mil
better_path() calls        72546      37418      14025
bp->path_is_cheaper          668        668        668
create_hashjoin_path         198        198        198
create_mergejoin_path       2358        723        198
create_nestloop_path       38227      20297       7765

Next, I removed the last table from Charles' query, producing these
cases:
                          6T+11I      6T+6I         6T

Runtime, sec                  34         12        2.3
equal() calls           14.3 mil    4.7 mil   0.65 mil
better_path() calls        10721       6172       2443
bp->path_is_cheaper          225        225        225
create_hashjoin_path          85         85         85
create_mergejoin_path        500        236         85
create_nestloop_path        5684       3344       1354

The 5T+10I case is down to a couple of seconds of runtime, so I
didn't bother to do profiles for 5 tables.

A fairly decent approximation is that the runtime varies as the
square of the number of create_nestloop_path calls.  That number
in turn seems to vary as the factorial of the number of tables,
with a weaker but still strong term involving the number of indexes.
I understand the square and the factorial terms, but the form of
the dependency on the index count isn't real clear.
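
A rough check from the seven-table column: (38227/20297)^2 is about 3.5
against a runtime ratio of 1800/457 = 3.9, and (20297/7765)^2 is about
6.8 against 457/59 = 7.7, so the square law holds to within the noise.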

It might be worthwhile to try a GEQO threshold based on the number of
tables plus half the number of indexes on those tables.  I have no idea
where to find the number of indexes, so I'll just offer the idea for
someone else to try.
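
To make the idea concrete, the test might look something like this
(a sketch only; num_tables and num_indexes are hypothetical inputs,
standing in for counts the optimizer would have to gather somewhere):

    /* Sketch of the proposed threshold: tables plus half their indexes.
     * Where to collect num_indexes is exactly the open question above. */
    static bool
    should_use_geqo(int num_tables, int num_indexes, int threshold)
    {
        return (num_tables + num_indexes / 2) >= threshold;
    }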


The main thing that jumps out from the profiles is that on these complex
searches, the optimizer spends practically *all* of its time down inside
better_path, scanning the list of already known access paths to see if a
proposed new path is a duplicate of one already known.  (Most of the time
it is not, as shown by the small number of times path_is_cheaper gets
called from better_path.)  This scanning is O(N^2) in the number of paths
considered, whereas it seems that all the other work is no worse than O(N)
in the number of paths.  The bottom of the duplicate check is equal(),
which as you can see is getting called a horrendous number of times.
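
In outline, the expensive part looks like this (a simplified sketch of
the behavior described above, not the actual source):

    /* Each proposed path is deep-compared, via equal(), against every
     * path already kept, so adding N paths costs O(N^2) equal() calls. */
    static bool
    already_known(RelOptInfo *rel, Path *new_path)
    {
        List       *temp;

        foreach(temp, rel->pathlist)
        {
            Path       *old = (Path *) lfirst(temp);

            if (equal((Node *) old, (Node *) new_path))
                return true;    /* duplicate of a known path */
        }
        return false;           /* genuinely new path */
    }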

It would be worth our while to try to eliminate this mostly-unsuccessful
comparison operation.

I wonder whether it would work to simply always add the new path to the
list, and rely on the later "prune" step to get rid of duplicates?
The prune code is evidently far more efficient at getting rid of useless
entries than better_path is, because it's nowhere near the top of the
profile even though it's processing nearly as many paths.
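
If that works, the insertion side collapses to something like this
(hypothetical shape; prune_pathlist() is a stand-in name for whatever
the prune step's entry point would be):

    /* Append unconditionally; duplicates get folded by a single prune
     * pass after all paths for the relation have been generated. */
    static void
    add_path_unchecked(RelOptInfo *rel, Path *new_path)
    {
        rel->pathlist = lappend(rel->pathlist, new_path);
    }

    static void
    finish_rel_paths(RelOptInfo *rel)
    {
        rel->pathlist = prune_pathlist(rel->pathlist);
    }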

Anybody here know much about how the optimizer works?  I'm hesitant to
hack on it myself.
        regards, tom lane


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> It took a certain amount of patience to run these cases with profiling
> turned on :-(, but here are the results:

Interesting numbers.

> 
>                           7T+12I      7T+7I         7T
>
> Runtime, sec                1800        457         59
> equal() calls            585 mil    157 mil   21.7 mil

> A fairly decent approximation is that the runtime varies as the
> square of the number of create_nestloop_path calls.  That number
> in turn seems to vary as the factorial of the number of tables,
> with a weaker but still strong term involving the number of indexes.
> I understand the square and the factorial terms, but the form of
> the dependency on the index count isn't real clear.

This is exactly what I would expect.  See optimizer/README.  The
factorial makes a lot of sense.  It evaluates all the joins, removes
the cheapest from the list, and goes on to find the next best one:
8 compares * 7 compares * 6 compares ...

Here is a good illustration of what you are seeing:
>    6!      720
>    8!    40320

That's why I reduced GEQO from 8 to 6.
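
Dropping the threshold from 8 to 6 caps the exhaustive search at
6! = 720 join orderings instead of 8! = 40320, a factor of 56.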

> 
> It might be worthwhile to try a GEQO threshold based on the number of
> tables plus half the number of indexes on those tables.  I have no idea
> where to find the number of indexes, so I'll just offer the idea for
> someone else to try.

Sounds like a good idea.  I can easily get that information.  The
optimizer does that lower in the code.  Perhaps we can just move the
GEQO test to after the index stuff is created.  I will be able to look
at it after the TEMP stuff.

> The main thing that jumps out from the profiles is that on these complex
> searches, the optimizer spends practically *all* of its time down inside
> better_path, scanning the list of already known access paths to see if a
> proposed new path is a duplicate of one already known.  (Most of the time
> it is not, as shown by the small number of times path_is_cheaper gets
> called from better_path.)  This scanning is O(N^2) in the number of paths
> considered, whereas it seems that all the other work is no worse than O(N)
> in the number of paths.  The bottom of the duplicate check is equal(),
> which as you can see is getting called a horrendous number of times.
> 
> It would be worth our while to try to eliminate this mostly-unsuccessful
> comparison operation.
> 
> I wonder whether it would work to simply always add the new path to the
> list, and rely on the later "prune" step to get rid of duplicates?
> The prune code is evidently far more efficient at getting rid of useless
> entries than better_path is, because it's nowhere near the top of the
> profile even though it's processing nearly as many paths.

I think I optimized prune.c quite a bit around 6.1.  I broke it during
the optimization, and Vadim fixed it for me.

One of my TODO items has been to 100% understand the optimizer.  Haven't
had time to do that yet.  Been on my list for a while.


> 
> Anybody here know much about how the optimizer works?  I'm hesitant to
> hack on it myself.

Let me help you.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
OK, I have modified the optimizer to count not only the tables, but also
the indexes.  Patch is applied.  The code is:

    {
        List       *temp;
        int         paths_to_consider = 0;

        foreach(temp, outer_rels)
        {
            RelOptInfo *rel = (RelOptInfo *) lfirst(temp);

            paths_to_consider += length(rel->pathlist);
        }

        if ((_use_geqo_) && paths_to_consider >= _use_geqo_rels_)
            /* returns _one_ RelOptInfo, so lcons it */
            return lcons(geqo(root), NIL);
    }


It is my understanding that it is the size of the pathlist that
determines the complexity of the optimization. (Wow, I actually
understood that sentence.)

Tom, can you tell me if this looks good, and also give a suggestion for
a possible default value for GEQO optimization.  My guess is that 6 now
is too low.  Sounds like you have a good testbed to do this.

Old posting attached.  I will look at the functions you mentioned for
possible improvement.


> I have been looking at optimizer runtime using Charles Hornberger's
> example case, and what I find is that the number of tables involved in
> the query is a very inadequate predictor of the optimizer's runtime.
> Thus, it's not surprising that we are getting poor results from using
> number of tables as the GEQO threshold measure.
> 
> I started with Charles' test case as presented, then simplified it by
> removing indexes from the database.  This didn't change the final
> plan, which was a straight nested loop over sequential scans anyway
> (because the tables were so small).  But it drastically reduces the
> number of cases that the optimizer looks at.
> 
> Original test case: query involves seven tables with a total of twelve
> indexes (six on the primary table and one on each other table).
> 
> 7-index test case: I removed all but one of the indexes on the primary
> table, leaving one index per table.
> 
> No-index test case: I removed all indexes.
> 
> It took a certain amount of patience to run these cases with profiling
> turned on :-(, but here are the results:
> 
>                           7T+12I      7T+7I         7T
>
> Runtime, sec                1800        457         59
> equal() calls            585 mil    157 mil   21.7 mil
> better_path() calls        72546      37418      14025
> bp->path_is_cheaper          668        668        668
> create_hashjoin_path         198        198        198
> create_mergejoin_path       2358        723        198
> create_nestloop_path       38227      20297       7765
> 
> Next, I removed the last table from Charles' query, producing these
> cases:
> 
>                           6T+11I      6T+6I         6T
>
> Runtime, sec                  34         12        2.3
> equal() calls           14.3 mil    4.7 mil   0.65 mil
> better_path() calls        10721       6172       2443
> bp->path_is_cheaper          225        225        225
> create_hashjoin_path          85         85         85
> create_mergejoin_path        500        236         85
> create_nestloop_path        5684       3344       1354
> 
> The 5T+10I case is down to a couple of seconds of runtime, so I
> didn't bother to do profiles for 5 tables.
> 
> A fairly decent approximation is that the runtime varies as the
> square of the number of create_nestloop_path calls.  That number
> in turn seems to vary as the factorial of the number of tables,
> with a weaker but still strong term involving the number of indexes.
> I understand the square and the factorial terms, but the form of
> the dependency on the index count isn't real clear.
> 
> It might be worthwhile to try a GEQO threshold based on the number of
> tables plus half the number of indexes on those tables.  I have no idea
> where to find the number of indexes, so I'll just offer the idea for
> someone else to try.
> 
> 
> The main thing that jumps out from the profiles is that on these complex
> searches, the optimizer spends practically *all* of its time down inside
> better_path, scanning the list of already known access paths to see if a
> proposed new path is a duplicate of one already known.  (Most of the time
> it is not, as shown by the small number of times path_is_cheaper gets
> called from better_path.)  This scanning is O(N^2) in the number of paths
> considered, whereas it seems that all the other work is no worse than O(N)
> in the number of paths.  The bottom of the duplicate check is equal(),
> which as you can see is getting called a horrendous number of times.
> 
> It would be worth our while to try to eliminate this mostly-unsuccessful
> comparison operation.
> 
> I wonder whether it would work to simply always add the new path to the
> list, and rely on the later "prune" step to get rid of duplicates?
> The prune code is evidently far more efficient at getting rid of useless
> entries than better_path is, because it's nowhere near the top of the
> profile even though it's processing nearly as many paths.
> 
> Anybody here know much about how the optimizer works?  I'm hesitant to
> hack on it myself.
> 
>             regards, tom lane
> 
> 


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> The main thing that jumps out from the profiles is that on these complex
> searches, the optimizer spends practically *all* of its time down inside
> better_path, scanning the list of already known access paths to see if a
> proposed new path is a duplicate of one already known.  (Most of the time
> it is not, as shown by the small number of times path_is_cheaper gets
> called from better_path.)  This scanning is O(N^2) in the number of paths
> considered, whereas it seems that all the other work is no worse than O(N)
> in the number of paths.  The bottom of the duplicate check is equal(),
> which as you can see is getting called a horrendous number of times.
> 
> It would be worth our while to try to eliminate this mostly-unsuccessful
> comparison operation.
> 
> I wonder whether it would work to simply always add the new path to the
> list, and rely on the later "prune" step to get rid of duplicates?
> The prune code is evidently far more efficient at getting rid of useless
> entries than better_path is, because it's nowhere near the top of the
> profile even though it's processing nearly as many paths.


You are definitely on to something here.  I have added comments and more
to the optimizer README as I continue exploring options.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> The main thing that jumps out from the profiles is that on these complex
> searches, the optimizer spends practically *all* of its time down inside
> better_path, scanning the list of already known access paths to see if a
> proposed new path is a duplicate of one already known.  (Most of the time
> it is not, as shown by the small number of times path_is_cheaper gets
> called from better_path.)  This scanning is O(N^2) in the number of paths
> considered, whereas it seems that all the other work is no worse than O(N)
> in the number of paths.  The bottom of the duplicate check is equal(),
> which as you can see is getting called a horrendous number of times.

The ironic thing about add_pathlist() is that the slowness is due to the
code using a nested loop to merge duplicate RelOptInfo paths, rather
than some sort of mergejoin.

I will keep studying it, but from your comments, I can see you
understood the code much better than I had.  I am only now coming to
understand your conclusions.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> I have been looking at optimizer runtime using Charles Hornberger's
> example case, and what I find is that the number of tables involved in
> the query is a very inadequate predictor of the optimizer's runtime.
> Thus, it's not surprising that we are getting poor results from using
> number of tables as the GEQO threshold measure.

Still digging into the optimizer, but if you want some real eye-opening
stuff, set OPTIMIZER_DEBUG and look in the postmaster log.  A six-table
join generates 55k lines of debug info, very nicely formatted.  It shows
what we are up against.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> I have been looking at optimizer runtime using Charles Hornberger's
> example case, and what I find is that the number of tables involved in
> the query is a very inadequate predictor of the optimizer's runtime.
> Thus, it's not surprising that we are getting poor results from using
> number of tables as the GEQO threshold measure.

OK, now I have a specific optimizer question.  I looked at all
references to RelOptInfo.pathlist, and though it gets very long and hard
to check for uniqueness, the only thing I see it used for is finding
the cheapest path.

Why are we maintaining this huge Path List for every RelOptInfo
structure if we only need the cheapest?  Why not store only the cheapest
plan, instead of generating all unique plans, then using only the
cheapest?
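
Something like this is what I have in mind (a sketch only; it assumes
the existing cheapestpath field and path_is_cheaper()):

    /* Instead of appending every path to rel->pathlist, remember only
     * the cheapest path seen so far and drop the rest immediately. */
    static void
    remember_if_cheapest(RelOptInfo *rel, Path *new_path)
    {
        if (rel->cheapestpath == NULL ||
            path_is_cheaper(new_path, rel->cheapestpath))
            rel->cheapestpath = new_path;
    }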

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: "Thomas G. Lockhart"
> Why are we maintaining this huge Path List for every RelOptInfo
> structure if we only need the cheapest?  Why not store only the 
> cheapest plan, instead of generating all unique plans, then using only 
> the cheapest?

Just guessing here: does it use this same list to determine if a new
plan is a duplicate of a previously generated plan? Of course, maybe
that is not important, since any *cheaper* plan should be different from
any existing plan, and any plan of the same cost or higher could be
rejected.

Perhaps having the entire list of plans available makes it easier to
debug, especially when the stuff was in Lisp (since in that environment
it is easy to traverse and manipulate these lists interactively)...
                      - Tom


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Tom Lane
>> Why are we maintaining this huge Path List for every RelOptInfo
>> structure if we only need the cheapest?

I think Bruce is onto something here... keep only the best-so-far,
not everything ever generated.  That'd get rid of the O(N^2)
comparison behavior.

The only thing I can think of that we might possibly *need* the
whole list for is if it is used as a recursion stopper.
("Oh, I already investigated that alternative, no need to go down
that path again.")  It did not look to me like the optimizer does
such a thing, but I don't claim to understand the code.

It seems to me that the search space of possible paths is
well-structured and can be enumerated without generation of duplicates.
You've got a known set of tables involved in a query, a fixed set of
possible access methods for each table, and only so many ways to
combine them.  So the real question here is why does the optimizer
even need to check for duplicates --- should it not never generate
any to begin with?  The profiles I ran a few days ago show that indeed
most of the generated paths are not duplicates, but a small fraction
are duplicates.  Why is that?  I'd feel a lot closer to understanding
what's going on if we could explain where the duplicates come from.
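
As a toy illustration of what I mean by well-structured (a standalone
program, not optimizer code): table subsets can be enumerated so that
each way of splitting a join comes up exactly once, with no duplicate
check anywhere.

    #include <stdio.h>

    /* Enumerate every subset of n tables once, and every split of each
     * subset into two nonempty halves once.  Nothing is generated twice,
     * so nothing ever needs to be checked for duplication. */
    int
    main(void)
    {
        int         n = 6;      /* number of tables in the join */
        long        combos = 0;
        int         s,
                    left;

        for (s = 1; s < (1 << n); s++)  /* each subset of tables once */
            for (left = (s - 1) & s; left > 0; left = (left - 1) & s)
                combos++;               /* join "left" with "s ^ left" */

        printf("%d tables: %ld distinct join splits\n", n, combos);
        return 0;
    }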

"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> Perhaps having the entire list of plans available makes it easier to
> debug, especially when the stuff was in lisp (since in that environment
> it is easy to traverse and manipulate these lists interactively)...

The optimizer's Lisp heritage is pretty obvious, and in that culture
building a list of everything you're interested in is just The Way To
Do Things.  I doubt anyone realized that keeping the whole list would
turn out to be a performance bottleneck.
        regards, tom lane


Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From: Bruce Momjian
> >> Why are we maintaining this huge Path List for every RelOptInfo
> >> structure if we only need the cheapest?
> 
> I think Bruce is onto something here... keep only the best-so-far
> not everything ever generated.  That'd get rid of the O(N^2)
> comparison behavior.
> 
> The only thing I can think of that we might possibly *need* the
> whole list for is if it is used as a recursion stopper.
> ("Oh, I already investigated that alternative, no need to go down
> that path again.")  It did not look to me like the optimizer does
> such a thing, but I don't claim to understand the code.
> 
> It seems to me that the search space of possible paths is
> well-structured and can be enumerated without generation of duplicates.
> You've got a known set of tables involved in a query, a fixed set of
> possible access methods for each table, and only so many ways to
> combine them.  So the real question here is why does the optimizer
> even need to check for duplicates --- should it not be able to avoid
> generating any to begin with?  The profiles I ran a few days ago show
> that indeed most of the generated paths are not duplicates, but a small
> fraction are.  Why is that?  I'd feel a lot closer to understanding
> what's going on if we could explain where the duplicates come from.

Here is my guess, and I think it is valid.  There are cases where a join
of two tables may be more expensive than another plan, but the ordering
of its result may make a later join cheaper than the cheaper plan would.
That is why I suspect they are doing this in better_path():

    if (samekeys(path->keys, new_path->keys) &&
        equal_path_ordering(&path->p_ordering,
                            &new_path->p_ordering))
    {
        old_path = path;
        break;
    }

So I don't think we can grab the cheapest path right away.  The system
seems to keep, for each distinct result ordering, the cheapest path that
produces it, even when that path is not the cheapest overall.  I also
would like to throw away plans that are so expensive that sorting
another plan into the desired ordering would be cheaper than using the
plan.

I am only guessing on this.  How often does this 'keep non-cheapest path
around' really help?
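
If that guess is right, the list only ever needs one entry per distinct
result ordering, plus the sort-cost cutoff.  Something like this
(hypothetical sketch; same_ordering() and sort_cost() are stand-in
names, not existing routines):

    /* Keep at most one path per distinct result ordering, and none
     * whose cost exceeds the cheapest path plus an explicit sort. */
    static bool
    worth_keeping(RelOptInfo *rel, Path *new_path)
    {
        List       *temp;

        if (rel->cheapestpath != NULL &&
            new_path->path_cost > rel->cheapestpath->path_cost +
                                  sort_cost(rel, new_path->keys))
            return false;       /* sorting the cheap plan would beat it */

        foreach(temp, rel->pathlist)
        {
            Path       *old = (Path *) lfirst(temp);

            if (same_ordering(old, new_path))
                return path_is_cheaper(new_path, old);
        }
        return true;            /* a new ordering: keep it */
    }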

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026