Optimizer speed and GEQO (was: nested loops in joins) - Mailing list pgsql-hackers

I have been looking at optimizer runtime using Charles Hornberger's
example case, and what I find is that the number of tables involved in
the query is a very inadequate predictor of the optimizer's runtime.
Thus, it's not surprising that we are getting poor results from using
number of tables as the GEQO threshold measure.

I started with Charles' test case as presented, then simplified it by
removing indexes from the database.  This didn't change the final
plan, which was a straight nested loop over sequential scans anyway
(because the tables were so small).  But it drastically reduces the
number of cases that the optimizer looks at.

Original test case: query involves seven tables with a total of twelve
indexes (six on the primary table and one on each other table).

7-index test case: I removed all but one of the indexes on the primary
table, leaving one index per table.

No-index test case: I removed all indexes.

It took a certain amount of patience to run these cases with profiling
turned on :-(, but here are the results:
        7T+12I        7T+7I        7T

Runtime, sec        1800        457        59
equal() calls        585 mil        157 mil        21.7 mil
better_path() calls    72546        37418        14025
bp->path_is_cheaper    668        668        668
create_hashjoin_path    198        198        198
create_mergejoin_path    2358        723        198
create_nestloop_path    38227        20297        7765

Next, I removed the last table from Charles' query, producing these
cases:
        6T+11I        6T+6I        6T

Runtime, sec        34        12        2.3
equal() calls        14.3 mil    4.7 mil        0.65 mil
better_path() calls    10721        6172        2443
bp->path_is_cheaper    225        225        225
create_hashjoin_path    85        85        85
create_mergejoin_path    500        236        85
create_nestloop_path    5684        3344        1354

The 5T+10I case is down to a couple of seconds of runtime, so I
didn't bother to do profiles for 5 tables.

A fairly decent approximation is that the runtime varies as the
square of the number of create_nestloop_path calls.  That number
in turn seems to vary as the factorial of the number of tables,
with a weaker but still strong term involving the number of indexes.
I understand the square and the factorial terms, but the form of
the dependency on the index count isn't real clear.

It might be worthwhile to try a GEQO threshold based on the number of
tables plus half the number of indexes on those tables.  I have no idea
where to find the number of indexes, so I'll just offer the idea for
someone else to try.


The main thing that jumps out from the profiles is that on these complex
searches, the optimizer spends practically *all* of its time down inside
better_path, scanning the list of already known access paths to see if a
proposed new path is a duplicate of one already known.  (Most of the time
it is not, as shown by the small number of times path_is_cheaper gets
called from better_path.)  This scanning is O(N^2) in the number of paths
considered, whereas it seems that all the other work is no worse than O(N)
in the number of paths.  The bottom of the duplicate check is equal(),
which as you can see is getting called a horrendous number of times.

It would be worth our while to try to eliminate this mostly-unsuccessful
comparison operation.

I wonder whether it would work to simply always add the new path to the
list, and rely on the later "prune" step to get rid of duplicates?
The prune code is evidently far more efficient at getting rid of useless
entries than better_path is, because it's nowhere near the top of the
profile even though it's processing nearly as many paths.

Anybody here know much about how the optimizer works?  I'm hesitant to
hack on it myself.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Patches
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Small patches in copy.c and trigger.c