Thread: EXPLAIN ANALYZE
My understanding from 8.2 development was that EXPLAIN ANALYZE had been altered so that if you issued a particular keystroke while it was executing you'd get a partial results-so-far version of the EXPLAIN. AFAICS that wasn't implemented. Or at least I can't find it? The problem raised in 8.2dev was that EXPLAIN ANALYZE takes forever. It still does, which means analyzing queries that run for too long is almost impossible. Those are the ones we really need to analyze... Thoughts? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Fri, Dec 08, 2006 at 10:36:12AM +0000, Simon Riggs wrote: > My understanding from 8.2 development was that EXPLAIN ANALYZE had been > altered so that if you issued a particular keystroke while it was > executing you'd get a partial results-so-far version of the EXPLAIN. Not as I recall. There were some attempts to reduce the overhead, but that's about it. There were some attempts to get an estimate of how far the query had progressed, but that's something else. Not quite sure what you mean. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
"Simon Riggs" <simon@2ndquadrant.com> writes: > My understanding from 8.2 development was that EXPLAIN ANALYZE had been > altered so that if you issued a particular keystroke while it was > executing you'd get a partial results-so-far version of the EXPLAIN. Ways to get partial results from EXPLAIN were speculated about, but AFAIR no specific design was agreed to, much less implemented. regards, tom lane
On Fri, 2006-12-08 at 10:42 -0500, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > My understanding from 8.2 development was that EXPLAIN ANALYZE had been > > altered so that if you issued a particular keystroke while it was > > executing you'd get a partial results-so-far version of the EXPLAIN. > > Ways to get partial results from EXPLAIN were speculated about, but > AFAIR no specific design was agreed to, much less implemented. Well, I'd like a way of making EXPLAIN ANALYZE return something useful within a reasonable amount of time. We can define that as the amount of time that the user considers is their goal for the query. Having an EA that goes on forever and ever doesn't really help anyone diagnose problems with long running queries. With that requirement in mind, how about something simple like: explain_analyze_timeout = n When timeout is reached the EA returns results as if it had finished executing normally, though with some additional text to make it very clear that the displayed results are partial execution only. Implemented over the top of statement_timeout, but without throwing an error. I'm not very sure about that idea, but it meets the requirement. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Sun, 2006-12-10 at 13:00 +0000, Simon Riggs wrote: > With that requirement in mind, how about something simple like: > > explain_analyze_timeout = n > > When timeout is reached the EA returns results as if it had finished > executing normally, though with some additional text to make it very > clear that the displayed results are partial execution only. Sounds like a bit of a kludge, IMHO. EXPLAIN ANALYZE isn't designed to provide online feedback about the execution of a query, which I think is the real feature here. There has been prior discussion about how to do online progress indication for queries in general[1] -- I think some sort of "online EXPLAIN ANALYZE" would be a natural part of such a feature. -Neil [1] http://developer.postgresql.org/index.php/Query_progress_indication
On Sun, 2006-12-10 at 09:28 -0500, Neil Conway wrote: > On Sun, 2006-12-10 at 13:00 +0000, Simon Riggs wrote: > > With that requirement in mind, how about something simple like: > > > > explain_analyze_timeout = n > > > > When timeout is reached the EA returns results as if it had finished > > executing normally, though with some additional text to make it very > > clear that the displayed results are partial execution only. > > Sounds like a bit of a kludge, IMHO. Without a doubt. I was hoping for some further inspiration... How about any of these? EXPLAIN [ ANALYZE [TIME LIMIT n]] [ VERBOSE ] statement EXPLAIN [ ANALYZE [SHOW STATISTICS AFTER n]] [ VERBOSE ] statement Neither of which need new keywords. > EXPLAIN ANALYZE isn't designed to > provide online feedback about the execution of a query, which I think is > the real feature here. There has been prior discussion about how to do > online progress indication for queries in general[1] -- I think some > sort of "online EXPLAIN ANALYZE" would be a natural part of such a > feature. > > -Neil > > [1] http://developer.postgresql.org/index.php/Query_progress_indication Well, that all looks good and can see I'd want all of those things. The EA case is pretty straightforward though; we don't really need regular feedback as much as *any* feedback. So, I'm seeing it as a very simple case of the overall requirement. It's also a special case in that the output from a partial EA needs to be the same shape as a normal EA. Query progress data would likely be a different shape, no? You'd need some estimates of eventual time based upon a re-evaluation of the original estimates based upon things learned so far during execution. i.e. at start we thought this loop would be called X times (planner estimate), so far its been called Y times (EA actual) and based upon that we now think it will be called Z times. So I'm thinking of enhancing EA to provide partial results rather than go for the Full Monty just yet. BTW, can anybody edit the Wiki? I didn't realise it existed. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > The EA case is pretty straightforward though; Well, no its not, as you'll recall if you re-read the prior discussions. The killer problem is that it's unclear whether the early termination of the query represents an error condition or not. If it's not an error then you've got a serious problem for non-SELECT queries (which EA actually executes, remember) --- you'll have allowed an incompletely executed update to become committed, which is as good a definition of "data corruption" as I can come up with offhand. On the other hand, if it is an error then delivering some results along with the error requires serious contortion of the FE/BE protocol, libpq's response to errors, etc. To say nothing of what it might take to do it inside the backend, which generally does not like doing anything interesting in an already-aborted transaction. We might be able to finesse the protocol problem by teaching EA to respond to query cancel by emitting the data-so-far as a NOTICE (like it used to do many moons ago), rather than a standard query result, then allowing the query to error out. However this'd be fairly unfriendly for client-side tools that are expecting a query result. regards, tom lane
Simon Riggs wrote: > Well, I'd like a way of making EXPLAIN ANALYZE return something > useful within a reasonable amount of time. We can define that as the > amount of time that the user considers is their goal for the query. What sort of "useful" results would you expect to be able to see from such an aborted EXPLAIN ANALYZE? I cannot quite imagine what instructive value a partially executed plan output would have. It's not like we can somehow ensure executing an equal proportion of each plan node or something. Do you have a specific case in mind? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Tom Lane wrote: > > We might be able to finesse the protocol problem by teaching EA to > respond to query cancel by emitting the data-so-far as a NOTICE (like it > used to do many moons ago), rather than a standard query result, then > allowing the query to error out. However this'd be fairly unfriendly > for client-side tools that are expecting a query result. > wouldn't the above be ok - given that the primary use for this EA extension is troublesome query debugging anyway ? Cheers Mark
On Sun, 2006-12-10 at 18:09 -0500, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > The EA case is pretty straightforward though; > > Well, no its not, as you'll recall if you re-read the prior discussions. > The killer problem is that it's unclear whether the early termination of > the query represents an error condition or not. If it's not an error > then you've got a serious problem for non-SELECT queries (which EA > actually executes, remember) I was expecting to take the non-ERROR route, realising all of the problems you mention make the issue so costly to solve that way. Most big queries are SELECTs without any updates and its those we care about here. Anybody doing huge updates needs other assistance, IMHO. Intermediate results are always better than none at all. I do understand what a partial execution would look like - frequently it is the preparatory stages that slow a query down - costly sorts, underestimated hash joins etc. Other times it is loop underestimation, which can usually be seen fairly quickly. I foresaw that it would be possible to enforce EA as a read-only transaction, or throw an error (with no output) when issued in time-limited form. Maybe that isn't possible. Yes, I am looking for a fix that can be accomplished without major work and/or change. This topic is a pain, but not such a priority feature. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > We might be able to finesse the protocol problem by teaching EA to > respond to query cancel by emitting the data-so-far as a NOTICE (like it > used to do many moons ago), rather than a standard query result, then > allowing the query to error out. However this'd be fairly unfriendly > for client-side tools that are expecting a query result. What I suggested was introducing a new FE/BE message type for analyze query plans. Then clients that recognize it can use it to display the query plan without interfering with the query results. Clients that don't know what to do with it would have to just ignore it. Then we could introduce as many ways of triggering these messages as we like. A GUC to trigger one every n seconds, a FE/BE message like QueryCancel, say, QueryProbe which triggers one when the user presses a button in pgadmin or C-t (SIGINFO) in psql, etc. I was thinking that it should be more structured than the current block of text that clients receive. I had in mind to make it equivalent to a PGResult so the various bits of data would be in different named columns. This would let GUI clients like pgadmin interpret the results more effectively and make it easier for us to add data without worrying about information overload on the user's side. And the query would keep operating. Canceling the query and statement_timeout would both be entirely orthogonal to requesting analyze results. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > Intermediate results are always better than none at all. I do understand > what a partial execution would look like - frequently it is the > preparatory stages that slow a query down - costly sorts, underestimated > hash joins etc. Other times it is loop underestimation, which can > usually be seen fairly quickly. Surely all you're interested in is where the actual plan differs from the expected plan? Could you not just have a mode that issues NOTICEs when expected/actual number of rows differ by more than a set amount? You'd probably want two NOTICEs - one when the threshold is exceeded, one when the node completes. -- Richard Huxton Archonet Ltd
On Mon, 2006-12-11 at 11:00 +0000, Gregory Stark wrote: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: > > > We might be able to finesse the protocol problem by teaching EA to > > respond to query cancel by emitting the data-so-far as a NOTICE (like it > > used to do many moons ago), rather than a standard query result, then > > allowing the query to error out. However this'd be fairly unfriendly > > for client-side tools that are expecting a query result. > > What I suggested was introducing a new FE/BE message type for analyze query > plans. Then clients that recognize it can use it to display the query plan > without interfering with the query results. Clients that don't know what to do > with it would have to just ignore it. > > Then we could introduce as many ways of triggering these messages as we like. > A GUC to trigger one every n seconds, a FE/BE message like QueryCancel, say, > QueryProbe which triggers one when the user presses a button in pgadmin or C-t > (SIGINFO) in psql, etc. > > I was thinking that it should be more structured than the current block of > text that clients receive. I had in mind to make it equivalent to a PGResult > so the various bits of data would be in different named columns. This would > let GUI clients like pgadmin interpret the results more effectively and make > it easier for us to add data without worrying about information overload on > the user's side. > > And the query would keep operating. Canceling the query and statement_timeout > would both be entirely orthogonal to requesting analyze results. I like the idea, but its more work than I really wanted to get into right now. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Mon, 2006-12-11 at 11:00 +0000, Gregory Stark wrote: >> What I suggested was introducing a new FE/BE message type for analyze query >> plans. > I like the idea, but its more work than I really wanted to get into > right now. Yeah ... a protocol change is *painful*, especially if you really want clients to behave in a significantly new way. regards, tom lane
Tom Lane wrote: > Yeah ... a protocol change is *painful*, especially if you really want > clients to behave in a significantly new way. A backward-incompatible protocol change is painful, sure, but ISTM we could implement what Greg describes as a straightforward extension to the V3 protocol. Then the backend could just avoid sending the query progress information to < V4 protocol clients. -Neil
Simon Riggs wrote: > I like the idea, but its more work than I really wanted to get into > right now. Well, from another point of view: do we need this feature so urgently that there is not enough time to do it properly? IMHO, no. -Neil
Neil Conway <neilc@samurai.com> writes: > Tom Lane wrote: >> Yeah ... a protocol change is *painful*, especially if you really want >> clients to behave in a significantly new way. > A backward-incompatible protocol change is painful, sure, but ISTM we > could implement what Greg describes as a straightforward extension to > the V3 protocol. Then the backend could just avoid sending the query > progress information to < V4 protocol clients. You're dodging the point though. If you want the new message type to do anything useful in V4 clients, you still have to define an API for libpq, update psql, try to figure out what the heck JDBC and ODBC are going to do with it, etc etc. All doable, but it's a lot more work than just a quick hack in the backend. regards, tom lane
Richard Huxton wrote: > Simon Riggs wrote: > > Intermediate results are always better than none at all. I do understand > > what a partial execution would look like - frequently it is the > > preparatory stages that slow a query down - costly sorts, underestimated > > hash joins etc. Other times it is loop underestimation, which can > > usually be seen fairly quickly. > > Surely all you're interested in is where the actual plan differs from > the expected plan? Could you not just have a mode that issues NOTICEs > when expected/actual number of rows differ by more than a set amount? > You'd probably want two NOTICEs - one when the threshold is exceeded, > one when the node completes. Right, we already have a TODO: * Have EXPLAIN ANALYZE highlight poor optimizer estimates I was thinking we could issue NOTICE when the estimates differed from the actual by a specified percentage, and that NOTICE could be issued while the query is still processing, assuming the stage completes before the query does. This seems much easier than doing protocol changes. TODO updated: * Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Tue, 2006-12-12 at 17:30 -0500, Bruce Momjian wrote: > * Have EXPLAIN ANALYZE highlight poor optimizer estimates > TODO updated: > > * Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and > actual row counts differ by a specified percentage I don't think this is an improvement. The old wording describes a broad set of possible improvements. Your new text describes one way of implementing a subset of the former TODO wording. -Neil
Neil Conway wrote: > On Tue, 2006-12-12 at 17:30 -0500, Bruce Momjian wrote: > > * Have EXPLAIN ANALYZE highlight poor optimizer estimates > > > TODO updated: > > > > * Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and > > actual row counts differ by a specified percentage > > I don't think this is an improvement. The old wording describes a broad > set of possible improvements. Your new text describes one way of > implementing a subset of the former TODO wording. Well, we can still do a broader implementation if we want it. Do you have any suggestions? Basically, the more specific, the more likely we will get someone to do it, and we can always add more details. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Mon, Dec 11, 2006 at 12:24:12AM +0100, Peter Eisentraut wrote: > Simon Riggs wrote: > > Well, I'd like a way of making EXPLAIN ANALYZE return something > > useful within a reasonable amount of time. We can define that as the > > amount of time that the user considers is their goal for the query. > > What sort of "useful" results would you expect to be able to see from > such an aborted EXPLAIN ANALYZE? I cannot quite imagine what > instructive value a partially executed plan output would have. It's > not like we can somehow ensure executing an equal proportion of each > plan node or something. Do you have a specific case in mind? The query is most likely to get canceled while it is working on whatever node in the plan is the bottleneck, and it's likely going to be easy to spot since nodes above it wouldn't have gotten much done. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Thumbs up on this from a lurker. I recall a previous post about some sort of "progress bar" hack that would show you where in a plan a currently executing query was at. Has any work been done on this? Josh Reich Jim C. Nasby wrote: > On Mon, Dec 11, 2006 at 12:24:12AM +0100, Peter Eisentraut wrote: > >> Simon Riggs wrote: >> >>> Well, I'd like a way of making EXPLAIN ANALYZE return something >>> useful within a reasonable amount of time. We can define that as the >>> amount of time that the user considers is their goal for the query. >>> >> What sort of "useful" results would you expect to be able to see from >> such an aborted EXPLAIN ANALYZE? I cannot quite imagine what >> instructive value a partially executed plan output would have. It's >> not like we can somehow ensure executing an equal proportion of each >> plan node or something. Do you have a specific case in mind? >> > > The query is most likely to get canceled while it is working on whatever > node in the plan is the bottleneck, and it's likely going to be easy to > spot since nodes above it wouldn't have gotten much done. >