Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > I'm not about to run analyze in the middle of the data generation
> > (which wouldn't work anyways since it's in a transaction).
>
> Since 7.3 or 7.4, you *can* run ANALYZE in the middle of a transaction.
> The cached-plan business is a problem, I agree, but I think it's
> orthogonal to this particular discussion (and you can always use EXECUTE
> if you have to).
It's orthogonal. My point was that I have a bigger problem, but even if I
address it by switching away from plpgsql, or I guess by using EXECUTE, I
would still have a problem. I didn't realize you could run analyze in a
transaction, but even being able to I wouldn't really want to have to do that
repeatedly during the job.
This new approach would actually complete the fix, a perl or plpgsql EXECUTE
implementation would gradually shift statistics during the job.
Except that the first thing the job does is delete all the old records. This
is inside a transaction. So an estimate based on the heap size would be off by
a factor of two by the time the job is done.
> but separate ANALYZE could definitely make an estimate of the fraction of
> dead tuples.
With analyze in a transaction I'm not clear what the semantics should be
though. I suppose it should only count tuples visible to the transaction
analyze?
> Nope, you aren't. The above seems to me to be a recipe for degradation
> of performance over time, precisely because the plans wouldn't change in
> the face of changes in the situation.
A gradual degradation is ok. A gradual degradation means I can schedule a
nightly analyze and report on any changed plans and either automatically
accept them or manually approve them individually.
A sudden degradation is much more dangerous. Even if it's rare, a sudden
degradation means an outage in prime time.
As I said, it doesn't matter to me if every query is 10% slower than possible,
as long as no query takes 1000% as long as necessary even if it's a 1 in 1000
occurrence.
> I've resisted adding "planner hints" to the language for this reason, and
> I'm certainly not eager to offer any hard guarantees about plans not
> changing.
I just want to control _when_ they change. Eventually you'll come around. I
think it'll be a slow gradual change in thinking as the user-base changes
though. Not something I'll change with a single argument in one day.
--
greg