Thread: Query progress indication - an implementation
Hi all, following the link in http://wiki.postgresql.org/wiki/Query_progress_indication but mostly: http://www.postech.ac.kr/~swhwang/progress2.pdf [1] I'm trying to write an implementation of the "dne" method in postgresql.

I added another column to the pg_stat_get_activity function to report the percentage of work done for the query (of course, any other method could be used... the way the percentage is reported to the user can be easily changed).

I attached a first patch (just to see if anyone is interested, the work is by no means finished). I guess I made a lot of mistakes, since I don't know anything about postgresql code...

1) the progress indicator can be eliminated at runtime; this could be done with another runtime flag (at the moment it is always on)

2) I added a new structure (Progress) to PlanState to keep all the info about execution progress

3) I needed a pointer to the root of the PlanStates, to be able to calculate the total progress of the query tree (I bet this pointer was already available somewhere, but I couldn't find where...)

4) sub-plans are not included yet (well, just to be honest, I don't really know what postgresql means by those... :) )

5) the percentage is updated at most every second (can be easily changed)

6) the methods to adjust upper/lower bounds in [1] are not implemented yet (but that shouldn't be a problem)

7) the "spilled tuples" handling in [1] is not supported yet

8) only hash join, loop join, aggregate, sequence scans are implemented at the moment

9) I added another flag (EXEC_FLAG_DRIVER_BRANCH) in executor.h to signal to the sub-nodes if they are part of a branch that will contain a driver node (for example, the inner subtree of a Nested Loop join is not a driver branch). I guess this could be done better at Plan level (instead of PlanState), but this way less code has to be changed

10) at the moment all driver nodes have the same "work_per_tuple=1", but this could be changed (for example, CPU-intensive driver nodes could have a smaller work_per_tuple value)

Well, some (very early) tests on a tpcd db showed it works as expected (well, I only did very few tests...)

Hope someone is interested
On Fri, Jun 26, 2009 at 3:37 AM, Scara Maccai<m_lists@yahoo.it> wrote:
> Hi all,
>
> following the link in
> http://wiki.postgresql.org/wiki/Query_progress_indication
> but mostly:
> http://www.postech.ac.kr/~swhwang/progress2.pdf [1]
>
> I'm trying to write an implementation of the "dne" method in postgresql.
>
> [ full proposal quoted above ]

You might want to take a look at this:

http://wiki.postgresql.org/wiki/Submitting_a_Patch

The project style is not to use C++-style comments, and you should eliminate all of the unnecessary diff hunks from your patch (like files that have only comment or whitespace changes). Also, it is requested that patches be submitted in context diff format and added to the CommitFest wiki here:

http://wiki.postgresql.org/wiki/CommitFest_2009-First

As to the content of the patch, I think that what you are doing is comparing the actual number of "operations" with the expected number of operations. If that's correct, I'm not sure it's really all that useful, because it will only give you accurate percentage-of-completion information when the estimates are correct. But when the estimates are correct, you probably have a pretty good idea how long the query will take to run anyway. When the estimates are off, you'll find that the actual number of operations is more than the expected number of operations, but that won't really tell you how far you have to go.

The only other use case I can think of for functionality of this type is some kind of dashboard view on a system with very long-running queries, where you want to see how far you have yet to go on each one (maybe to approximate when you can submit the next one) without having detailed knowledge of how expensive each individual query was projected to be. But that's a pretty narrow use case, and I'm not sure it really justifies the overhead of instrumenting every query in this way.
For a fraction of the run-time cost, you could include the estimated total cost of the query in the pg_stat_activity output, which would let the user do much the same thing, presuming that they have some knowledge of the usual ratio between costs and execution times.

Greg Stark was (is?) working on a way to get EXPLAIN-ANALYZE type statistics on running queries; you might want to take a look at some of that work and see what you think.

http://wiki.postgresql.org/wiki/Greg_Stark%27s_Development_Projects#EXPLAIN_queries_in_progress

...Robert
On Sunday 28 June 2009 20:38:59 Robert Haas wrote:
> The only other use case I can think of for functionality of this type
> is some kind of dashboard view on a system with very long-running
> queries, where you want to see how far you have yet to go on each one
> (maybe to approximate when you can submit the next one) without having
> detailed knowledge of how expensive each individual query was projected
> to be. But that's a pretty narrow use case, and I'm not sure it
> really justifies the overhead of instrumenting every query in this
> way.

Well, progress information is probably only going to be useful for long-running processes anyway, and then only when the admin is too bored and wonders what he can usefully cancel. So it seems important to figure out exactly when to turn this on and when not to without causing too much overhead elsewhere.
> You might want to take a look at this:
> http://wiki.postgresql.org/wiki/Submitting_a_Patch

I will; I'm sorry it wasn't in the proper format. It was just a proof of concept, I guess I should have talked about it before even sending the patch.

> As to the content of the patch, I think that what you are doing is
> comparing the actual number of "operations" with the expected number
> of operations. If that's correct, I'm not sure it's really all that
> useful, because it will only give you accurate
> percentage-of-completion information when the estimates are correct.
> But when the estimates are correct, you probably have a pretty good
> idea how long the query will take to run anyway.

That would be a first step. Having an idea of how much a query "progressed" is very important for long-running queries. It's like copying files: even if you have a rough idea of how much time a copy will take, having an interface that tells you the percentage done is quite useful (IMHO).

> When the estimates are off, you'll find that the actual number of
> operations is more than the expected number of operations, but that
> won't really tell you how far you have to go.

The second phase would be using histograms to help refine the statistics at runtime.

> The only other use case I can think of for functionality of this type
> is some kind of dashboard view on a system with very long-running
> queries, where you want to see how far you have yet to go on each one
> (maybe to approximate when you can submit the next one) without having
> detailed knowledge of how expensive each individual query was projected
> to be. But that's a pretty narrow use case

I don't think it's that narrow: it is important, for long running queries, to know how far the query has processed (the user wants to know how much of a query has yet to be executed). That's why you find so many papers on query progress indicators. The real problem is that they don't give you a solution :)

> Greg Stark was (is?) working on a way to get EXPLAIN-ANALYZE type
> statistics on running queries; you might want to take a look at some
> of that work and see what you think.
>
> http://wiki.postgresql.org/wiki/Greg_Stark%27s_Development_Projects#EXPLAIN_queries_in_progress

That's interesting. I'll take a look!

Thank you very much for your comments.

Is anyone interested in such a progress indicator???
On Mon, Jun 29, 2009 at 3:47 AM, Scara Maccai<m_lists@yahoo.it> wrote:
>
> Is anyone interested in such a progress indicator???
>

I'm relatively new to Postgres and just starting to look at what we might do with it for handling large genomic datasets. I've used Toad for Oracle to have a look at what's going on inside long running Oracle queries. Knowing that a particular step is doing a particular activity is useful for diagnostics as well as being assured that you actually are making forward progress.

IMO any diagnostics you can provide for a low cost are useful. The more detail, the better. "Step 1 of 10" is good, "80% complete on step 1 of 10" is better. "80% complete on step 1, 10% complete on 10 steps" is even better.

--
Peter Hunsberger
On Mon, Jun 29, 2009 at 4:47 AM, Scara Maccai<m_lists@yahoo.it> wrote:
>> As to the content of the patch, I think that what you are doing is
>> comparing the actual number of "operations" with the expected number
>> of operations. If that's correct, I'm not sure it's really all that
>> useful, because it will only give you accurate
>> percentage-of-completion information when the estimates are
>> correct. But when the estimates are correct, you probably have a
>> pretty good idea how long the query will take to run anyway.
>
> That would be a first step. Having an idea of how much a query "progressed" is very important for long-running queries. It's like
> copying files: even if you have a rough idea of how much time a copy will take, having an interface that tells you the percentage
> done is quite useful (IMHO).

I am all in favor of a query progress indicator; the trick is constructing one that will actually be useful. It's easy to have estimates that are off by a factor of two or three, though, so I think you'd frequently have situations when the query completed when the progress estimator was at 40% or 250%. Those kinds of progress indicators tend to annoy users, and for good reason. File copying is the sort of thing where it's pretty easy to estimate percentage of completion by files and bytes; query execution is not.

So, I'm all in favor of what you're trying to do conceptually; I just don't like your proposed implementation.

...Robert
> It's easy to have estimates that are off by a factor of two or three,
> though, so I think you'd frequently have situations when the query
> completed when the progress estimator was at 40% or 250%.

I thought about implementing a "given perfect estimates" indicator at first; then, as a second step, using histograms to improve the indicator's precision at run time. Of course, this doesn't mean the user wouldn't see the query completed at 40% or "slowing down" in a lot of cases...

I started this patch after reading the papers in http://wiki.postgresql.org/wiki/Query_progress_indication

Apparently they were able to predict query execution remaining time (in case of a "perfect estimates" query) with a very simple algorithm. Given that:

1) The algorithm ("driver node hypothesis") is so easy
2) My project fits in the category of "perfect estimates" queries

I thought "I'll give it a try". Well: I have no idea how they got their results. IMHO it's not possible to get max 10% error on query remaining time on most of the tpcd queries using that method, since the "driver nodes" all have the same "importance". I had to introduce a lot of complexity (not in the patch that I posted) to have it "somehow" working, giving the nodes different work per tuple according to the node type (example: in a loop join the time it takes to read a row of the outer relation can't be compared to, say, the time it takes to read a row from a table scan: but the driver node hypothesis says they will take the same time...).

So the code that I have right now works "pretty well" for the 10 queries of my project, but I guess it won't work for general queries :(

> So, I'm all in favor of what you're trying to do conceptually; I just
> don't like your proposed implementation.

What kind of implementation would you propose?

Thank you very much for your comments.
> IMO any diagnostics you can provide for a low cost are useful. The
> more detail, the better. "Step 1 of 10" is good, "80% complete on
> step 1 of 10" is better. "80% complete on step 1, 10% complete on
> 10 steps" is even better.

Well, I guess "Step 1 of 10" would be pretty trivial to implement (given the tree plan, how many branches have finished executing). This doesn't tell you anything about the remaining time though. That is: "Step 1 of 10" doesn't mean that you are at 10%, nor that you are at 50%, or 99%. In fact, I'm afraid it doesn't tell you anything... I don't understand how useful such an implementation would be... "being assured that you actually are making forward progress"... in which cases could you not be making forward progress?

Still, this won't take much time: if it's needed by someone else I guess it could be easily done.
All, Actually, an indicator of even just what step of the query was being executed would be very useful for checking on stuck queries. If a DBA checks once that the query is on "bitmapscan on table_x(index_y)", and it's still on that 15 minutes later, he/she can guess that the query is thrashing due to HW or bad plan issues and kill it. If the query is on "sort rowset by col1" then the DBA knows not to kill it because it's almost done. So, while an actual % completed indicator would be perfect, a "query steps completed, current step =" would still be very useful and a large improvement over what we have now. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > So, while an actual % completed indicator would be perfect, a "query > steps completed, current step =" would still be very useful and a large > improvement over what we have now. I think this is pretty much nonsense --- most queries run all their plan nodes concurrently to some extent. You can't usefully say that a query is "on" some node, nor measure progress by whether some node is "done". regards, tom lane
On Mon, Jun 29, 2009 at 11:15 AM, <m_lists@yahoo.it> wrote:
> So the code that I have right now works "pretty well" for the 10 queries of my project, but I guess it won't work for general queries :(

I think that's probably right.

>> So, I'm all in favor of what you're trying to do conceptually; I just
>> don't like your proposed implementation.
>
> What kind of implementation would you propose?

I don't really have a proposed implementation in mind; I think it's a hard problem. That's why I suggested looking at the EXPLAIN-ANALYZE-in-progress stuff. By providing a lot more detail, a human being can take a look at that output and make a judgment about what's going on. That's not as easy to use as what you're trying to do, but I suspect it's more useful in practice. It might be that after reading a few hundred of those someone could propose some rules of thumb to estimate the percentage of completion, which we could then incorporate back into the system. If not, we lose nothing by implementing that feature first, since it is independently useful.

> Thank you very much for your comments.

No problem.

...Robert
On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > > So, while an actual % completed indicator would be perfect, a "query > > steps completed, current step =" would still be very useful and a large > > improvement over what we have now. > > I think this is pretty much nonsense --- most queries run all their plan > nodes concurrently to some extent. You can't usefully say that a query > is "on" some node, nor measure progress by whether some node is "done". What about showing the outermost node where work has started? -- Josh / eggyknap End Point Corp. www.endpoint.com
Joshua Tolley <eggyknap@gmail.com> writes: > On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote: >> I think this is pretty much nonsense --- most queries run all their plan >> nodes concurrently to some extent. You can't usefully say that a query >> is "on" some node, nor measure progress by whether some node is "done". > What about showing the outermost node where work has started? That's always the outermost node; what would it tell you? regards, tom lane
On Mon, Jun 29, 2009 at 1:07 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >> So, while an actual % completed indicator would be perfect, a "query >> steps completed, current step =" would still be very useful and a large >> improvement over what we have now. > > I think this is pretty much nonsense --- most queries run all their plan > nodes concurrently to some extent. You can't usefully say that a query > is "on" some node, nor measure progress by whether some node is "done". What you get in Toad for Oracle is the ability to see long running processes with multiple lines, one per process currently underway. If I recall correctly, the returned information includes what operation is underway (eg, physical reads), the % complete, start time, time remaining and elapsed time. Time remaining has been mostly useless every time I've had to drill down to this level, but otherwise this has been relatively useful information. -- Peter Hunsberger
On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > So, while an actual % completed indicator would be perfect, a "query
> > steps completed, current step =" would still be very useful and a large
> > improvement over what we have now.
>
> I think this is pretty much nonsense --- most queries run all their plan
> nodes concurrently to some extent. You can't usefully say that a query
> is "on" some node, nor measure progress by whether some node is "done".

The requirement is not nonsense, even if the detail was slightly off.

We can regard plans as acting in phases with each blocking node separating the plan. We know which nodes those are, so we can report that.

For each phase, it may be very hard to say what percentage is truly complete, but we could at least report how much work has been done and provide a percentage against planned numbers.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndQuadrant.com> writes: > On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote: >> I think this is pretty much nonsense --- most queries run all their plan >> nodes concurrently to some extent. You can't usefully say that a query >> is "on" some node, nor measure progress by whether some node is "done". > The requirement is not nonsense, even if the detail was slightly off. I was applying the word "nonsense" to the proposed implementation, not the desire to have query progress indications ... > We can regard plans as acting in phases with each blocking node > separating the plan. We know which nodes those are, so we can report > that. [ shrug... ] You can regard them that way, but you won't get particularly helpful results for a large fraction of real queries. The system is generally set up to prefer "streaming" evaluation as much as it can. Even in nominally blocking nodes like Sort and Hash, there are operational modes that look more like streaming, or at least chunking. regards, tom lane
>> On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:
>>> I think this is pretty much nonsense --- most queries run all their plan
>>> nodes concurrently to some extent. You can't usefully say that a query
>>> is "on" some node, nor measure progress by whether some node is "done".

Right, that was why my proposed interface was to dump out the explain plan with the number of loops, row counts seen so far, and approximate percentage progress.

My thinking was that a human could interpret that to understand where the bottleneck is: if, say, you're still on the first row for the top few nodes but all the nodes below a certain sort have run to completion, then the query is busy running the sort...

But a tool like psql or pgadmin would receive that and just display the top-level percent progress. pgadmin might actually be able to display its graphical explain with some graphical representation of the percent progress of each node.

We can actually do *very* well for percent progress for a lot of nodes. Sequential scans or bitmap scans, for example, can display their actual percent done in terms of disk blocks.

The gotcha I ran into was what to do with a nested loop join. The safe thing to do would be to report just the outer child's percentage directly. But that would perform poorly in the not uncommon case where there's one expected outer tuple. If we could trust the outer estimate we could report (outer-percentage + (1/outer-estimate * inner-percentage)) but that will get weird quickly if the outer-percentage turns out to be underestimated.

Basically I disagree that imperfect progress reports annoy users. I think we can do better than reporting 250% done or having a percentage that goes backward, though. It would be quite tolerable (though perhaps for no logical reason) to have a progress indicator which slows down as it gets closer to 100% and never seems to make it to 100%.

--
greg
http://mit.edu/~gsstark/resume.pdf
Greg Stark wrote:
> Right, that was why my proposed interface was to dump out the explain
> plan with the number of loops, row counts seen so far, and approximate
> percentage progress.
>
> My thinking was that a human could interpret that to understand where
> the bottleneck is: if, say, you're still on the first row for the top
> few nodes but all the nodes below a certain sort have run to
> completion, then the query is busy running the sort...

+1. Especially if I run it a few times and I can see which counters are still moving.

> Basically I disagree that imperfect progress reports annoy users. I
> think we can do better than reporting 250% done or having a percentage
> that goes backward though. It would be quite tolerable (though perhaps
> for no logical reason) to have a progress indicator which slows down
> as it gets closer to 100% and never seems to make it to 100%.

-1. A counter that slowly goes from 99% to 99.5% done is much worse than a counter that takes the same amount of time going from "1000% of estimated rows done" to "2000% of estimated rows done".

The former just tells me that it lies about how much is done. The latter tells me that it's processing each row quickly but that the estimate was way off.
On Mon, Jun 29, 2009 at 8:15 PM, Ron Mayer<rm_pg@cheapcomplexdevices.com> wrote:
> Greg Stark wrote:
>>
>> Right, that was why my proposed interface was to dump out the explain
>> plan with the number of loops, row counts seen so far, and approximate
>> percentage progress.
>>
>> My thinking was that a human could interpret that to understand where
>> the bottleneck is: if, say, you're still on the first row for the top
>> few nodes but all the nodes below a certain sort have run to
>> completion, then the query is busy running the sort...
>
> +1. Especially if I run it a few times and I can see which counters
> are still moving.

+1 from me, too, as I said upthread.

>> Basically I disagree that imperfect progress reports annoy users. I
>> think we can do better than reporting 250% done or having a percentage
>> that goes backward though. It would be quite tolerable (though perhaps
>> for no logical reason) to have a progress indicator which slows down
>> as it gets closer to 100% and never seems to make it to 100%.
>
> -1. A counter that slowly goes from 99% to 99.5% done is
> much worse than a counter that takes the same amount of time
> going from "1000% of estimated rows done" to "2000% of
> estimated rows done".
>
> The former just tells me that it lies about how much is done.
> The latter tells me that it's processing each row quickly but
> that the estimate was way off.

I think both of those options are a little wacky. Maybe 800% **of estimated rows done** is not so bad, since the tag line provides some context, but what does it mean exactly? Rows for the toplevel plan node? That doesn't seem like it would always be too useful. I keep coming back to thinking you need to see the whole tree.

...Robert
On June 30, 2009, at 01:34, Greg Stark <gsstark@mit.edu> wrote:
> Basically I disagree that imperfect progress reports annoy users. I
> think we can do better than reporting 250% done or having a percentage
> that goes backward though. It would be quite tolerable (though perhaps
> for no logical reason) to have a progress indicator which slows down
> as it gets closer to 100% and never seems to make it to 100%.

I guess bad stats are such an important problem in planning queries that a 250% progress is doing more good than harm in showing users how badly they need to review their analyze related settings.

Regards,
--
dim
> +1. Especially if I run it a few times and I can see which counters
> are still moving.

Per-node percentage is easy to do (given the perfect estimates, of course). The problem comes when you want to give an "overall" percentage.

I wouldn't know where to put that "explain-like" output though: in a column in pg_stat_get_activity??? (and it would be available only if the proper variable was "on" before sending the query)

> -1. A counter that slowly goes from 99% to 99.5% done is
> much worse than a counter that takes the same amount of time
> going from "1000% of estimated rows done" to "2000% of
> estimated rows done".

It's not just about estimates. Even with 100% correct estimates, IMHO there's no way to get the perfect amount of work done so far. And this is even without considering multiple queries running at the same time...

If someone has some time to read those papers let me know what he thinks about them... because I think their methods couldn't give them those results...
On Tue, 2009-06-30 at 07:04 +0200, Dimitri Fontaine wrote:
> On June 30, 2009, at 01:34, Greg Stark <gsstark@mit.edu> wrote:
> > Basically I disagree that imperfect progress reports annoy users. I
> > think we can do better than reporting 250% done or having a percentage
> > that goes backward though. It would be quite tolerable (though perhaps
> > for no logical reason) to have a progress indicator which slows down
> > as it gets closer to 100% and never seems to make it to 100%.
>
> I guess bad stats are such an important problem in planning queries
> that a 250% progress is doing more good than harm in showing users how
> badly they need to review their analyze related settings.

Yeh, I agree. We can define it as "planned work", rather than actual. So if the progress bar says 250% and the query is still going, at least you know it is doing more work, rather than just being slow at doing the planned work.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Mon, 2009-06-29 at 18:49 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:
> >> I think this is pretty much nonsense --- most queries run all their plan
> >> nodes concurrently to some extent. You can't usefully say that a query
> >> is "on" some node, nor measure progress by whether some node is "done".
>
> > The requirement is not nonsense, even if the detail was slightly off.
>
> I was applying the word "nonsense" to the proposed implementation,
> not the desire to have query progress indications ...

Understood, just trying to limit the blast radius.

> > We can regard plans as acting in phases with each blocking node
> > separating the plan. We know which nodes those are, so we can report
> > that.
>
> [ shrug... ] You can regard them that way, but you won't get
> particularly helpful results for a large fraction of real queries.
> The system is generally set up to prefer "streaming" evaluation
> as much as it can. Even in nominally blocking nodes like Sort and Hash,
> there are operational modes that look more like streaming, or at least
> chunking.

It's not always useful, though many large queries do have multiple phases. The concept and the name come from ETL tools and it is of real practical use in those environments.

We can put the phase number on the EXPLAIN easily, and it is very simple to calculate the total number of phases and the current phase - e.g. 2 of 5 phases complete.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Tom Lane wrote:
> Joshua Tolley <eggyknap@gmail.com> writes:
> > On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote:
> >> I think this is pretty much nonsense --- most queries run all their plan
> >> nodes concurrently to some extent. You can't usefully say that a query
> >> is "on" some node, nor measure progress by whether some node is "done".
>
> > What about showing the outermost node where work has started?
>
> That's always the outermost node; what would it tell you?

[ Repost ]

I think the only reasonable solution would be to consider the estimated cost of each node and then compute what percentage complete each node is.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
On Thu, Jul 2, 2009 at 2:32 AM, Bruce Momjian<bruce@momjian.us> wrote:
> I think the only reasonable solution would be to consider the estimated
> cost of each node and then compute what percentage complete each node
> is.

Well, you can do better for some nodes. A sequential scan for example can tell you exactly what percentage of the way through its scan it is. A sort node that's finished the sort can produce a value based on both the estimate of the relative costs of the sort vs reading the results and the actual percentage progress reading the results.

So I think it has to come down to another ExecProcNode method the way I had it arranged in my patch that actually implemented this.

I was partly waiting for the other patch which multiplexed signals onto fewer actual unix signals to go through. And for XML explain plans to go through. Once we have those then I think my patch is actually nearly there, it just needs some additional tweaking of the heuristics for more plan types.

Then comes the fun part of figuring out a useful UI for psql and pgadmin. Personally I'm happy for psql to just print the plan whenever the user hits siginfo. I think an apt-style curses progress bar would be unnecessarily heavyweight for the lightweight vision I have for psql. But I know others have more ambitious visions for psql.

--
greg
http://mit.edu/~gsstark/resume.pdf
On Thu, Jul 2, 2009 at 12:48 PM, Euler Taveira de Oliveira<euler@timbira.com> wrote: > I know that it didn't solve the estimation problem but ... IMHO the > [under|over]estimation should be treated by an external tool (autoexplain?). > So when we enable the query progress and some node reports a difference > between estimated and real more than x%, log the plan. Doing it, we will be > helping DBAs to investigate the bad plans. Keep in mind that it is frequently the case that the estimates are substantially off but the plan still works OK. I just put a dirty hack into one of my apps to improve the selectivity estimates by a factor of 200, but they're still off by a factor of 5. Even when they were off by 1000x the bad plan happened only intermittently. You notice the cases where the estimates are off and it makes for a bad plan, but there are lots of other cases where the estimates are off but the plan is still OK. ...Robert