Re: Query progress indication - an implementation - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Query progress indication - an implementation
Msg-id 407d949e0906291634j3262922dg38b5774b2a84e31c@mail.gmail.com
In response to Re: Query progress indication - an implementation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Query progress indication - an implementation  (Ron Mayer <rm_pg@cheapcomplexdevices.com>)
Re: Query progress indication - an implementation  (Dimitri Fontaine <dfontaine@hi-media.com>)
List pgsql-hackers
>> On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:
>>> I think this is pretty much nonsense --- most queries run all their plan
>>> nodes concurrently to some extent.  You can't usefully say that a query
>>> is "on" some node, nor measure progress by whether some node is "done".

Right, that was why my proposed interface was to dump out the explain
plan with the number of loops, row counts seen so far, and approximate
percentage progress.

My thinking was that a human could interpret that to understand where
the bottleneck is. If, say, you're still on the first row for the top
few nodes, but all the nodes below a certain sort have run to
completion, then the query is busy running the sort.

But a tool like psql or pgadmin would receive that and just display
the top-level percent progress. pgadmin might actually be able to
display its graphical explain with some graphical representation of
the percent progress of each node.
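
To make that interface concrete, here is a rough sketch in Python of
what such a dump might carry. The field names, the output layout, and
the table name "t" are all assumptions made up for illustration; nothing
here is an existing PostgreSQL structure or output format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNodeProgress:
    """One plan node with hypothetical progress counters."""
    label: str
    loops: int            # number of times the node has been started
    rows_seen: int        # rows emitted so far
    fraction_done: float  # approximate progress in [0, 1]
    children: List["PlanNodeProgress"] = field(default_factory=list)


def render(node: PlanNodeProgress, depth: int = 0) -> List[str]:
    """Render the plan tree as indented text lines, one per node."""
    indent = "  " * depth
    arrow = "" if depth == 0 else "-> "
    lines = [
        f"{indent}{arrow}{node.label} "
        f"(loops={node.loops} rows={node.rows_seen} "
        f"done={node.fraction_done:.0%})"
    ]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def top_level_percent(node: PlanNodeProgress) -> int:
    """What a simple client would show: only the root node's percentage."""
    return int(node.fraction_done * 100)


# Hypothetical snapshot: the scan below the sort has finished, but the
# nodes above the sort have not produced a row yet.
plan = PlanNodeProgress("Aggregate", 1, 0, 0.0, [
    PlanNodeProgress("Sort", 1, 0, 0.0, [
        PlanNodeProgress("Seq Scan on t", 1, 100000, 1.0),
    ]),
])

for line in render(plan):
    print(line)
```

A human reading the three lines can see the sort is where the time is
going, while a client calling top_level_percent() just gets one number.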

We can actually do *very* well for percent progress for a lot of
nodes. Sequential scans or bitmap scans, for example, can display
their actual percent done in terms of disk blocks.
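
As a minimal sketch of the block-based idea (in Python, with a made-up
function name and parameters, not anything taken from the executor):

```python
def scan_progress(blocks_read: int, total_blocks: int) -> float:
    """Fraction of a scan completed, measured in disk blocks.

    Both arguments are hypothetical counters: blocks already visited and
    the relation's size in blocks when the scan started.
    """
    if total_blocks <= 0:
        # Empty relation: there is nothing to read, so treat it as done.
        return 1.0
    # Clamp in case the relation grew after total_blocks was sampled.
    return min(blocks_read, total_blocks) / total_blocks
```

For example, scan_progress(25, 100) gives 0.25.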

The gotcha I ran into was what to do with a nested loop join. The safe
thing to do would be to report just the outer child's percentage
directly. But that would perform poorly in the not uncommon case where
there's one expected outer tuple. If we could trust the outer estimate
we could report (outer-percentage + (1/outer-estimate *
inner-percentage)), but that will get weird quickly if the
outer estimate turns out to be too low.
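
A small Python sketch of that formula, using fractions in [0, 1] rather
than percentages, shows both the good case and the weird one. The
function name and the sample numbers are invented for illustration.

```python
def nested_loop_progress(outer_fraction: float,
                         inner_fraction: float,
                         outer_estimate: float) -> float:
    """outer + (1 / outer_estimate) * inner, all as fractions of 1.0."""
    if outer_estimate <= 0:
        # No usable estimate: fall back to the outer child's progress alone.
        return outer_fraction
    return outer_fraction + inner_fraction / outer_estimate


# Estimate is right (4 outer tuples): halfway through the outer side and
# halfway through the current inner scan.
print(nested_loop_progress(0.5, 0.5, 4))    # 0.625

# Estimate says 1 outer tuple but there are really 4.
first = nested_loop_progress(0.25, 1.0, 1)  # end of first inner scan
second = nested_loop_progress(0.5, 0.0, 1)  # start of the next inner scan
print(first, second)                        # 1.25 0.5
```

With the bad estimate the result first overshoots 100% and then drops
back when the inner side is rescanned.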

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward, though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows down
as it gets closer to 100% and never seems to make it to 100%.
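
One way to get that behaviour, sketched in Python: keep a high-water
mark so the displayed value never goes backward, and squash the raw
estimate through a saturating curve so it never shows 100% or more. The
exponential curve, the steepness of 3.0, and the 0.999 cap are arbitrary
choices for illustration, not something proposed in this thread.

```python
import math


class MonotonicProgress:
    """Displayed progress that never decreases and stays below 100%."""

    def __init__(self, steepness: float = 3.0, cap: float = 0.999):
        self.steepness = steepness  # arbitrary: larger means faster early rise
        self.cap = cap              # guards against rounding up to exactly 1.0
        self.shown = 0.0

    def update(self, raw_fraction: float) -> float:
        """Feed a raw estimate (may exceed 1.0 or drop); return the display value."""
        raw = max(raw_fraction, 0.0)
        # Saturating curve: approaches 1.0 as raw grows, ever more slowly.
        squashed = min(1.0 - math.exp(-self.steepness * raw), self.cap)
        # High-water mark: ignore any estimate lower than what was shown.
        self.shown = max(self.shown, squashed)
        return self.shown
```

The cost is that the display is distorted even when the raw estimate is
accurate: with these settings a raw value of 1.0 shows as about 95%.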

--
greg
http://mit.edu/~gsstark/resume.pdf