On Wed, Jul 22, 2015 at 3:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > For me, the user workflow looks like these.... > > Worried: "Task X is taking ages? When is it expected to finish?" > Ops: 13:50 > <sometime later, about 14:00> > Worried: "Task X is still running? But I thought its ETA was 13:50?" > Ops: Now says 14:30 > Worried: "Is it stuck, or is it making progress?" > Ops: Looks like its making progress > Worried: "Can we have a look at it and find out what its doing?"
How does Ops know that it is making progress? Just because the completion percentage is changing?
You could, but that is not the way I suggested.
We need
* Some measure of actual progress (the definition of which may vary from action to action, e.g. blocks scanned)
* Some estimate of the total work required
* An estimate of the estimated time of completion - I liked your view that this prediction may be costly to request
> In terms of VACUUM specifically: VACUUM should be able to assess beforehand > whether it will scan the indexes, or it can just assume that it will need to > scan the indexes. Perhaps VACUUM can pre-scan the VM to decide how big a > task it has before it starts.
Well, we can assume that it will scan the indexes exactly once, but the actual number may be more or less; and the cost of rescanning the heap in phase 2 is also hard to estimate.
Maybe I'm worrying over nothing, but I have a feeling that if we try to do what you're proposing here, we're gonna end up with this:
Most of the progress estimators I have seen over the ~30 years that I've been playing with computers have been unreliable, and many of those have been unreliable to the point of being annoying. I think that's likely to happen with what you are proposing too, though of course like all predictions of the future it could turn out to be wrong.
Almost like an Optimizer then. Important, often annoyingly wrong, needs more work.
I'm not proposing this feature, I'm merely asking for it to be defined in a way that makes it work for more than just VACUUM. Once we have a way of reporting useful information, other processes can be made to follow that mechanism, like REINDEX, ALTER TABLE etc.. I believe those things are important, even if we never get such information for user queries. But I hope we do.
I won't get in the way of your search for detailed information in more complex forms. Both things are needed.
--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services