PROPOSAL: geqo improvement

From marcin mank
Subject PROPOSAL: geqo improvement
Date
Msg-id b1b9fac60901041755o2d84d354w9ba0871e03f56600@mail.gmail.com
Responses Re: PROPOSAL: geqo improvement  ("Robert Haas" <robertmhaas@gmail.com>)
List pgsql-hackers
Hello, List.

There are cases where GEQO returns a very bad plan in some rare
executions of a query. To decrease the likelihood of this happening, I
propose:

When GEQO detects that the plan it has found is in fact miserable, it
restarts the search. Simple math shows that if the probability of
finding a bad plan in one 'go' is p, the overall probability of ending
up with a bad plan after N independent attempts is p^N.
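For illustration (the figure is made up): if a single run produces a
miserable plan with probability p = 0.05, then allowing up to three
attempts drops the chance that all of them are bad to 0.05^3 =
0.000125, i.e. roughly one query in 8000.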

GEQO would decide that a plan is bad when its estimated cost exceeds
the cost-equivalent of the time spent planning so far by a fixed
factor (100? a configurable parameter?). I think the function
inferring cost from time spent could be derived from cpu_operator_cost
- or is there a better way?

As a safety measure, I would limit the number of replannings to a
fixed value (10? 20? a configurable parameter?); see the sketch below.
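To make the mechanism concrete, here is a rough standalone sketch of
the loop I have in mind. All names, the time-to-cost conversion
factor, and the thresholds are made up for illustration; this is not
actual GEQO code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Stand-in for one GEQO search; burns some CPU to simulate
     * planning work and occasionally returns a miserable plan. */
    static double run_geqo_once(void)
    {
        volatile double sink = 0;
        for (long i = 0; i < 20000000L; i++)
            sink += 1.0;
        return (double) rand() / RAND_MAX < 0.05 ? 1.0e9 : 1.0e4;
    }

    /* Turn elapsed planning time into a cost ceiling.  The conversion
     * factor here is a placeholder; in the planner it would have to be
     * derived from cpu_operator_cost somehow. */
    static double cost_ceiling(double elapsed_sec, double multiplier)
    {
        const double cost_units_per_sec = 1.0e6;    /* assumed */
        return elapsed_sec * cost_units_per_sec * multiplier;
    }

    int main(void)
    {
        const int    max_restarts = 10;     /* safety cap, could be a GUC */
        const double multiplier = 100.0;    /* "fixed number of times" */
        clock_t start = clock();
        double  best_cost = 0.0;

        srand((unsigned) time(NULL));
        for (int attempt = 1; attempt <= max_restarts; attempt++)
        {
            double elapsed;

            best_cost = run_geqo_once();
            elapsed = (double) (clock() - start) / CLOCKS_PER_SEC;

            /* Keep the plan unless its cost dwarfs the planning time
             * already spent. */
            if (best_cost <= cost_ceiling(elapsed, multiplier))
                break;
            printf("attempt %d: cost %.0f looks miserable, restarting\n",
                   attempt, best_cost);
        }
        printf("final plan cost: %.0f\n", best_cost);
        return 0;
    }

A real version would presumably keep the best plan seen across
attempts rather than just the last one, and would measure planning
time the way the backend already does.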

If I introduce configuration variables, I plan to derive their
defaults from geqo_effort (I have no concrete plan for this yet).

An alternative to restarting the search might be simply extending it -
running the main loop of the geqo() function for longer. I plan on
restarting because I'm afraid the real reason for getting bad plans is
that the algorithm falls into some local minimum and can't get out. I
will explore that further.

If there is agreement to do this, it looks simple enough that I
volunteer to implement it. Please tell me what the deadline is for
this to make it into 8.4.

What I lack are good test cases to verify the solution.

Greetings
Marcin Mańk
