On Mon, 2006-06-05 at 17:06 -0400, Tom Lane wrote:
> Andrew Sullivan <ajs@crankycanuck.ca> writes:
> > On Mon, Jun 05, 2006 at 04:07:19PM -0400, Tom Lane wrote:
> >> I'm wondering about out-of-date or nonexistent ANALYZE stats, missing
> >> custom adjustments of statistics target settings, etc.
>
> > But even the nested loop shouldn't be a "never returns" case, should
> > it? For 1800 rows?
>
> Well, it's a big query. If it ought to take a second or two, and
> instead is taking an hour or two (1800 times the expected runtime), that
> might be close enough to "never" to exhaust Chris' patience. Besides,
> we don't know whether the 1800 might itself be an underestimate (too bad
> Chris didn't provide EXPLAIN ANALYZE results).
This is a good example of a case where the overhead of EXPLAIN
ANALYZE contributes to its not actually being available for
diagnosing a problem.
Maybe we need something even more drastic than the recently proposed
changes to EXPLAIN ANALYZE?
Perhaps we could annotate the query tree with individual limits, so
that a node that was expected to deal with 1 row would simply stop
executing the EXPLAIN ANALYZE when it hit N times as many rows
(default = no limit). We would still be able to see a bad plan without
waiting for the whole query to execute - just stop at the point where
the plan is far enough off track. That would give us what we need:
pinpoint exactly which part of the plan is off track, and see how far
off it is. If the limits were configurable, we'd be able to opt for
faster-but-less-accurate or slower-but-100%-accurate behaviour. We
wouldn't need to worry about timing overhead either then.
e.g. EXPLAIN ANALYZE ERRLIMIT 10 SELECT ...
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com