Greg Stark wrote:
> So in principle I agree with this idea. I think a conservative value
> for the constant would be more like 100x though. If I told you we had
> an easy way to speed all your queries up by 10% by caching queries but
> were just choosing not to then I think you would be unhappy. Whereas
> if I told you we were spending 1% of the run-time planning queries I
> think most people would not be concerned.
Makes sense. The main thing is that there be an order-of-magnitude
difference to hide the potential extra planning cost in. If overall
response time includes the setup cost of an SSL connection, 10% of
execution is probably reasonable, since planning is then a much smaller
fraction of what the client sees--but on a local connection it's a
liability.
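To make the threshold idea concrete, here is a minimal sketch (names and
the 100x constant are illustrative only, not actual PostgreSQL code): only
spend the extra planning time when it can hide inside the execution time.

```python
# Illustrative only: re-plan when expected execution time dwarfs the
# planning cost, per Greg's conservative 100x suggestion.
REPLAN_FACTOR = 100

def worth_replanning(planning_time_ms: float, execution_time_ms: float) -> bool:
    """True when planning cost is a small enough fraction of execution
    time that re-planning every call would go unnoticed."""
    return execution_time_ms >= REPLAN_FACTOR * planning_time_ms
```

With a 1 ms plan and a 150 ms execution this says re-plan; with a 1 ms
plan and a 50 ms execution it says keep the cached plan.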
> There's a second problem though. We don't actually know how long any
> given query is going to take to plan or execute. We could just
> remember how long it took to plan and execute last time or how long it
> took to plan last time and the average execution time since we cached
> that plan. Perhaps we should track the stddev of the execution plan,
> or the max execution time of the plan? Ie there are still unanswered
> questions about the precise heuristic to use but I bet we can come up
> with something reasonable.
I may have cut this out of my original email for brevity... my
impression is that the planner's estimate is likely to err on the side
of scalability rather than best-case response time, and that this kind
of overly cautious plan is more common than an optimistic plan going bad
at runtime.
If that is true, then the cost estimate is at least a useful predictor
of statements that deserve re-planning. If it's not true (or for cases
where it's not true), actual execution time would be a useful backup, at
the cost of an occasional slow execution.
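The bookkeeping Greg describes could look something like this sketch (all
names and thresholds are hypothetical, not PostgreSQL internals): keep the
last planning time and running execution statistics per cached plan, and
use either the cost estimate or the observed execution time to flag
statements worth re-planning.

```python
class CachedPlanStats:
    """Per-plan bookkeeping: last planning time, planner's cost estimate,
    and a running mean / max of observed execution times."""

    def __init__(self, plan_time_ms: float, estimated_cost: float):
        self.plan_time_ms = plan_time_ms
        self.estimated_cost = estimated_cost
        self.n = 0
        self.mean_exec_ms = 0.0
        self.max_exec_ms = 0.0

    def record_execution(self, exec_ms: float) -> None:
        # Incremental running mean; also track the worst case seen.
        self.n += 1
        self.mean_exec_ms += (exec_ms - self.mean_exec_ms) / self.n
        self.max_exec_ms = max(self.max_exec_ms, exec_ms)

def deserves_replanning(stats: CachedPlanStats,
                        cost_threshold: float,
                        exec_threshold_ms: float) -> bool:
    # Cost estimate as the primary predictor; actual execution time as
    # the backup for cases where the estimate is off.
    return (stats.estimated_cost >= cost_threshold
            or stats.mean_exec_ms >= exec_threshold_ms)
```

The thresholds would of course need tuning; the point is only that both
signals are cheap to maintain per cached plan.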
Yeb points out a devil in the details, though: the cost estimate is
unitless. We'd need some order-of-magnitude notion of how the estimates
map onto real performance.
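One (purely hypothetical) way to get that notion: track recent
(estimated cost, actual time) pairs and fit a crude conversion factor
from cost units to milliseconds.

```python
def ms_per_cost_unit(samples):
    """samples: iterable of (estimated_cost, exec_ms) pairs.
    Least-squares fit through the origin gives a rough cost-unit ->
    milliseconds conversion; returns 0.0 with no usable samples."""
    num = sum(cost * ms for cost, ms in samples)
    den = sum(cost * cost for cost, _ in samples)
    return num / den if den else 0.0
```

It would drift with hardware and cache state, but an order-of-magnitude
figure is all the heuristic above needs.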
Jeroen