Re: Multi-pass planner - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Multi-pass planner
Date
Msg-id 603c8f070908200923s6e4e73c6j133d505718236803@mail.gmail.com
In response to Multi-pass planner  (decibel <decibel@decibel.org>)
Responses Re: Multi-pass planner  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Thu, Aug 20, 2009 at 11:15 AM, decibel <decibel@decibel.org> wrote:
> There have been a number of planner improvement ideas that have been thrown
> out because of the overhead they would add to the planning process,
> specifically for queries that would otherwise be quite fast. Other databases
> seem to have dealt with this by creating plan caches (which might be worth
> doing for Postgres), but what if we could determine when we need a fast
> planning time vs when it won't matter?
>
> What I'm thinking is that on the first pass through the planner, we only
> estimate things that we can do quickly. If the plan that falls out of that
> is below a certain cost/row threshold, we just run with that plan. If not,
> we go back and do a more detailed estimate.
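
A minimal standalone C sketch of that control flow (every name, type, and
number here is hypothetical, not PostgreSQL's actual planner API):

#include <stdio.h>

/* Hypothetical stand-ins for planner data structures and passes. */
typedef struct Plan { double total_cost; } Plan;

static Plan plan_quickly(const char *query)      { (void) query; Plan p = { 50.0 }; return p; }
static Plan plan_exhaustively(const char *query) { (void) query; Plan p = { 40.0 }; return p; }

/* Plans estimated to cost less than this are run as-is. */
static const double fast_plan_threshold = 100.0;

static Plan
multi_pass_plan(const char *query)
{
    Plan quick = plan_quickly(query);   /* pass 1: cheap estimates only */

    if (quick.total_cost < fast_plan_threshold)
        return quick;                   /* fast query: planning time would dominate */

    return plan_exhaustively(query);    /* slow query: extra planning may pay off */
}

int
main(void)
{
    Plan p = multi_pass_plan("SELECT ...");
    printf("chosen plan cost: %.1f\n", p.total_cost);
    return 0;
}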

It's not a dumb idea, but it might be hard to set that threshold
correctly.  You might have some queries that you know take 4 seconds
(and you don't want to spend another 200 ms planning them every time
for no benefit) and other queries that only take 500 ms (but you know
that they can be squashed down to 150 ms with a bit more planning).
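
To make that concrete (numbers invented to match the two cases above), any
single threshold misfires on one of them:

#include <stdio.h>

/* Hypothetical numbers: estimated cost stands in for expected runtime. */
struct example
{
    const char *name;
    double      est_cost;       /* quick pass's cost estimate         */
    double      replan_gain_ms; /* runtime a detailed pass would save */
};

int
main(void)
{
    struct example q[] = {
        { "4 s query, already optimal",   4000.0,   0.0 },  /* re-plan wastes ~200 ms */
        { "500 ms query, improvable",      500.0, 350.0 },  /* re-plan saves 350 ms   */
    };
    double threshold = 1000.0;  /* "re-plan anything estimated above this" */
    int    i;

    for (i = 0; i < 2; i++)
        printf("%s: %s\n", q[i].name,
               q[i].est_cost > threshold ? "re-planned" : "run as-is");
    return 0;
}

With this threshold, the 4-second query pays the extra 200 ms of planning
for nothing, while the 500 ms query runs as-is and forfeits the 350 ms it
could have saved; moving the threshold just swaps which mistake you make.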

I think one of the problems with the planner is that all decisions are
made on the basis of cost.  Honestly, it works amazingly well in a
wide variety of situations, but it can't handle things like "we might
as well materialize here, because it doesn't cost much and there's a
big upside if our estimates are off".  The estimates are the world,
and you live and die by them.
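
To illustrate what a risk-aware comparison might look like (again, purely
invented names; nothing like this exists in the planner today):

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical: track both the expected cost and the cost if the
 * row estimates turn out to be badly wrong. */
typedef struct PathCost
{
    double expected;    /* cost under the planner's estimates */
    double worst_case;  /* cost if the estimates are way off  */
} PathCost;

/* Today's rule: strictly cheaper expected cost wins. */
static bool
cheaper(PathCost a, PathCost b)
{
    return a.expected < b.expected;
}

/* Risk-aware variant: accept a slightly higher expected cost (within
 * 'slack', e.g. 0.05 for 5%) when it buys a much better worst case,
 * such as materializing an inner relation as cheap insurance. */
static bool
prefer_insurance(PathCost plain, PathCost with_material, double slack)
{
    return with_material.expected <= plain.expected * (1.0 + slack)
        && with_material.worst_case < plain.worst_case;
}

int
main(void)
{
    PathCost plain    = { 100.0, 5000.0 };  /* cheap, but fragile   */
    PathCost material = { 103.0,  400.0 };  /* 3% dearer, far safer */

    printf("cost-only picks plain: %d\n", cheaper(plain, material));
    printf("risk-aware picks material: %d\n",
           prefer_insurance(plain, material, 0.05));
    return 0;
}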

...Robert

