Jim,
> Well, that raises an interesting issue, because AFAIK none of the cost
> estimate functions currently do that. Heck, AFAIK even the piggyback
> seqscan code doesn't take other seqscans into account.
Sure. But you're striving for greater accuracy, no?
Actually, all that's really needed in the way of concurrent activity is a
calculated factor that lets us know how likely a particular object is to be
cached, either in the fs cache or the pg cache (with different factors for
each presumably) based on history. Right now, that's based on
estimated_cache_size, which is rather innacurate: a table which is queried
once a month has the exact same cost factors as one which is queried every
2.1 seconds. This would mean an extra column in pg_stats I suppose.
> But ultimately, I'm not sure if this is really required or not, because
> I don't see that we need to use explain when running queries. In fact,
> it's possibly desireable that we don't, because of the overhead it
> incurs. We would want to log an explain (maybe analyze) just to make
> sure we knew what the optimizer was doing, but I think we shouldn't need
> the info to produce cost estimates.
Well, the problem is that you need to know how much time the index scan took
vs. other query steps. I don't see a way to do this other than an anayze.
--
__Aglio Database Solutions_______________
Josh Berkus Consultant
josh@agliodbs.com www.agliodbs.com
Ph: 415-752-2500 Fax: 415-752-2387
2166 Hayes Suite 200 San Francisco, CA