Thread: ToDo: log plans of cancelled queries
Hello

We use auto_explain successfully. What we miss is support for cancelled queries - we need the same information for queries that we cancel after x minutes. A subtask of this feature could be dumping the currently executed query, with its plan and the complete query string, to the log after receiving some signal - maybe SIGTRAP.

What do you think about this feature?

Regards

Pavel Stehule
Pavel Stehule <pavel.stehule@gmail.com> writes:
> What do you think about this feature?

The idea of expecting an add-on module to execute operations in an already-failed transaction seems pretty dubious to me. I also think it's not a great idea to add partial executions into a query's stats. For instance, suppose query X has been done 3 times and took a minute each time. The fourth time, the user cancels it after one second. If we now report that the query's average execution time is 45 seconds or so, that seems pretty misleading to me.

        regards, tom lane
2013/1/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> What do you think about this feature?
>
> The idea of expecting an add-on module to execute operations in an already-failed transaction seems pretty dubious to me. I also think it's not a great idea to add partial executions into a query's stats. For instance, suppose query X has been done 3 times and took a minute each time. The fourth time, the user cancels it after one second. If we now report that the query's average execution time is 45 seconds or so, that seems pretty misleading to me.

I don't propose logging the query time for cancelled queries - although we have a patch that does that, it is our specific issue. My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.

Regards

Pavel

> regards, tom lane
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.

Why not an option to auto_explain (or whatever) to log an execution plan right before actually executing it? If that were something which could be set with a GUC or similar, you could just do that before running whatever queries you're interested in capturing the plans for.

Thanks,

Stephen
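To make this "log the plan right before executing" idea concrete, here is a minimal sketch of what such an extension could look like, written against a reasonably modern PostgreSQL tree (the executor-hook and ExplainState APIs have shifted a bit across releases). The module name, the GUC name, and the function names below are hypothetical, not from any posted patch; the executor hook and the ExplainPrintPlan/ExplainQueryText calls are the usual auto_explain-style machinery, just invoked after ExecutorStart instead of at ExecutorEnd:

    /* sketch: log every plan just before execution, gated by a custom GUC */
    #include "postgres.h"
    #include "fmgr.h"
    #include "commands/explain.h"
    #include "executor/executor.h"
    #include "utils/guc.h"

    PG_MODULE_MAGIC;

    static bool preexplain_enabled = false;              /* hypothetical GUC */
    static ExecutorStart_hook_type prev_ExecutorStart = NULL;

    static void
    preexplain_ExecutorStart(QueryDesc *queryDesc, int eflags)
    {
        if (prev_ExecutorStart)
            prev_ExecutorStart(queryDesc, eflags);
        else
            standard_ExecutorStart(queryDesc, eflags);

        /* planstate is set up now, so the plan can be printed before any row is produced */
        if (preexplain_enabled)
        {
            ExplainState *es = NewExplainState();

            ExplainBeginOutput(es);
            ExplainQueryText(es, queryDesc);
            ExplainPrintPlan(es, queryDesc);
            ExplainEndOutput(es);

            ereport(LOG,
                    (errmsg("plan before execution:\n%s", es->str->data),
                     errhidestmt(true)));
        }
    }

    void
    _PG_init(void)
    {
        DefineCustomBoolVariable("preexplain.enabled",
                                 "Log each plan right before it starts executing.",
                                 NULL,
                                 &preexplain_enabled,
                                 false,
                                 PGC_SUSET,
                                 0,
                                 NULL, NULL, NULL);

        prev_ExecutorStart = ExecutorStart_hook;
        ExecutorStart_hook = preexplain_ExecutorStart;
    }

With something along these lines, the GUC could be switched on per session or per role, which matches the later point in the thread about enabling it only where plan capture is actually wanted.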
Pavel Stehule <pavel.stehule@gmail.com> writes:
> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.

Oh, sorry, not enough caffeine yet --- somehow I was thinking about pg_stat_statements not auto_explain.

However, auto_explain is even worse on the other problem. You flat out cannot do catalog lookups in a failed transaction, but there's no way to print a decompiled plan without such lookups. So it won't work. (It would also be appropriate to be suspicious of whether the executor's plan state tree is even fully set up at the time the error is thrown...)

        regards, tom lane
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.
>
> Why not an option to auto_explain (or whatever) to log an execution plan right before actually executing it? If that were something which could be set with a GUC or similar, you could just do that before running whatever queries you're interested in capturing the plans for.

For our OLAP usage it is probably possible, but it could be slow for OLTP usage.

Regards

Pavel

> Thanks,
>
> Stephen
Tom Lane wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.
>
> Oh, sorry, not enough caffeine yet --- somehow I was thinking about pg_stat_statements not auto_explain.
>
> However, auto_explain is even worse on the other problem. You flat out cannot do catalog lookups in a failed transaction, but there's no way to print a decompiled plan without such lookups. So it won't work. (It would also be appropriate to be suspicious of whether the executor's plan state tree is even fully set up at the time the error is thrown...)

Maybe it'd work to save the query source text and parameter values somewhere and log an explain in a different session.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2013/1/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.
>
> Oh, sorry, not enough caffeine yet --- somehow I was thinking about pg_stat_statements not auto_explain.
>
> However, auto_explain is even worse on the other problem. You flat out cannot do catalog lookups in a failed transaction, but there's no way to print a decompiled plan without such lookups. So it won't work. (It would also be appropriate to be suspicious of whether the executor's plan state tree is even fully set up at the time the error is thrown...)

Yes, it is - I have a few ideas:

1) Use a signal handler - we don't currently use SIGTRAP, so we could take SIGTRAP (for example) and arrange for a dump of the plan from inside the signal handler. This has one advantage - we could get the plan (and maybe more) without cancelling the query. Somebody may have a significantly higher limit than ours but would still like to know the plan.

2) Create a hook called before the query is actually cancelled - it would be invoked before the exception is raised, so at that moment we can still materialise the plan without problems. This hook could be used by auto_explain, because we don't want to print plans for queries that were cancelled sooner than the auto_explain limit. Maybe this hook could also be called before raising other kinds of exceptions - temp file limits, for instance. We are not interested in queries that finished with other errors, because those errors usually come quickly.

From my perspective some integration with auto_explain would be nice; I would prefer not to have to set limits in more places.

Regards

Pavel

> regards, tom lane
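To make idea 2) a bit more concrete: no such hook exists in PostgreSQL, and the name and placement below are purely hypothetical. The one property that matters is that the hook would run just before the cancel error is raised, while the transaction is still in a good state, so catalog lookups (and therefore plan decompilation) are still allowed at that point:

    /* hypothetical core hook - sketch only, nothing like this exists today */
    typedef void (*query_cancel_hook_type) (void);
    extern PGDLLIMPORT query_cancel_hook_type query_cancel_hook;

    /*
     * ProcessInterrupts() in tcop/postgres.c currently raises
     *   ereport(ERROR, ... "canceling statement due to user request")
     * when a cancel request arrives.  Immediately before that ereport(),
     * a module such as auto_explain could be given a last chance to dump
     * the plan of the statement that is about to be cancelled:
     */
    if (query_cancel_hook)
        query_cancel_hook();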
On 11 January 2013 15:54, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Tom Lane wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.
>>
>> Oh, sorry, not enough caffeine yet --- somehow I was thinking about pg_stat_statements not auto_explain.
>>
>> However, auto_explain is even worse on the other problem. You flat out cannot do catalog lookups in a failed transaction, but there's no way to print a decompiled plan without such lookups. So it won't work. (It would also be appropriate to be suspicious of whether the executor's plan state tree is even fully set up at the time the error is thrown...)
>
> Maybe it'd work to save the query source text and parameter values somewhere and log an explain in a different session.

I think this would be an important feature. But then I also want to be able to kill a query without it doing 50 pushups and a backflip before it dies, since that will inevitably go wrong. Perhaps we can have a new signal that means "exit gracefully, with info if requested". That way we can keep "kill" to mean "kill".

An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it. That way we don't need to overload the kill signal at all. That's the most useful part of a "progress indicator" tool without the complexity.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> However, auto_explain is even worse on the other problem. You flat out cannot do catalog lookups in a failed transaction, but there's no way to print a decompiled plan without such lookups. So it won't work. (It would also be appropriate to be suspicious of whether the executor's plan state tree is even fully set up at the time the error is thrown...)

> Maybe it'd work to save the query source text and parameter values somewhere and log an explain in a different session.

There wouldn't be a lot of certainty that you got the same plan. AFAICS the only thing you could do is what Stephen suggested: run EXPLAIN *before* starting the query. You could stash the text somewhere and only print it on failure, which would prevent useless log bloat. But it'd still be awfully expensive.

        regards, tom lane
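A sketch of this "stash the text and only print it on failure" variant, again hypothetical extension code and not from any posted patch: the plan text is built while the transaction is still healthy, kept in TopMemoryContext, and only written out from a transaction-abort callback, where emitting an already-formatted string needs no catalog access. The function names are invented; RegisterXactCallback(), XACT_EVENT_ABORT, MemoryContextStrdup() and the Explain* calls are the real APIs involved:

    /* sketch: build the plan text up front, log it only if the query fails */
    #include "postgres.h"
    #include "access/xact.h"
    #include "commands/explain.h"
    #include "executor/executor.h"
    #include "utils/memutils.h"

    static char *stashed_plan = NULL;       /* survives in TopMemoryContext */

    /* call from an ExecutorStart hook, as in the earlier sketch */
    static void
    stash_plan(QueryDesc *queryDesc)
    {
        ExplainState *es = NewExplainState();

        ExplainBeginOutput(es);
        ExplainQueryText(es, queryDesc);
        ExplainPrintPlan(es, queryDesc);
        ExplainEndOutput(es);

        if (stashed_plan)
            pfree(stashed_plan);
        stashed_plan = MemoryContextStrdup(TopMemoryContext, es->str->data);
    }

    /* call from an ExecutorEnd hook: the query finished normally, forget it */
    static void
    discard_stashed_plan(void)
    {
        if (stashed_plan)
        {
            pfree(stashed_plan);
            stashed_plan = NULL;
        }
    }

    /* registered once in _PG_init() with RegisterXactCallback(plan_abort_callback, NULL) */
    static void
    plan_abort_callback(XactEvent event, void *arg)
    {
        if (event == XACT_EVENT_ABORT && stashed_plan != NULL)
        {
            /* plain string output only: safe even though the transaction has failed */
            ereport(LOG, (errmsg("plan of failed statement:\n%s", stashed_plan)));
            discard_stashed_plan();
        }
    }

This has exactly the cost Tom points out: every statement pays for a full EXPLAIN whether or not it ever fails.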
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> Why not an option to auto_explain (or whatever) to log an execution plan right before actually executing it? If that were something which could be set with a GUC or similar, you could just do that before running whatever queries you're interested in capturing the plans for.
>
> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.

If it were a GUC, you could turn it on/off at appropriate places in the OLTP scenario.

Thanks,

Stephen
Simon Riggs <simon@2ndQuadrant.com> writes:
> An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it.

That could conceivably work. At least it wouldn't require running EXPLAIN in a failed transaction.

        regards, tom lane
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>>> Why not an option to auto_explain (or whatever) to log an execution plan right before actually executing it? If that were something which could be set with a GUC or similar, you could just do that before running whatever queries you're interested in capturing the plans for.
>>
>> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.
>
> If it were a GUC, you could turn it on/off at appropriate places in the OLTP scenario.

Yes, that is a possible solution.

Pavel

> Thanks,
>
> Stephen
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>>> Why not an option to auto_explain (or whatever) to log an execution plan right before actually executing it? If that were something which could be set with a GUC or similar, you could just do that before running whatever queries you're interested in capturing the plans for.
>>
>> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.
>
> If it were a GUC, you could turn it on/off at appropriate places in the OLTP scenario.

But still - if this is an auto_explain feature, then we need to return execution to auto_explain after the query is cancelled - and that is probably impossible now.

Regards

Pavel

> Thanks,
>
> Stephen
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it.
>
> That could conceivably work. At least it wouldn't require running EXPLAIN in a failed transaction.

I like this idea, in general, also. Taking that to the next level would be figuring out how you can do the same kind of thing through an interactive psql session where the user running the query doesn't need access to the database server or PG log files... We can send a 'cancel query', how about a 'report on query' which returns the plan and perhaps whatever other stats are easily available?

Thanks,

Stephen
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it.
>>
>> That could conceivably work. At least it wouldn't require running EXPLAIN in a failed transaction.
>
> I like this idea, in general, also. Taking that to the next level would be figuring out how you can do the same kind of thing through an interactive psql session where the user running the query doesn't need access to the database server or PG log files...

That is simple - it can be printed via elog(WARNING, ...) to the original console.

> We can send a 'cancel query', how about a 'report on query' which returns the plan and perhaps whatever other stats are easily available?

There is only one question - which POSIX signal can we use?

Pavel

> Thanks,
>
> Stephen
2013/1/11 Pavel Stehule <pavel.stehule@gmail.com>:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>>> An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it.
>>>
>>> That could conceivably work. At least it wouldn't require running EXPLAIN in a failed transaction.
>>
>> I like this idea, in general, also. Taking that to the next level would be figuring out how you can do the same kind of thing through an interactive psql session where the user running the query doesn't need access to the database server or PG log files...
>
> That is simple - it can be printed via elog(WARNING, ...) to the original console.

Theoretically we could show the current state via an EXPLAIN ANALYSE result.

>> We can send a 'cancel query', how about a 'report on query' which returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?
>
> Pavel
>
>> Thanks,
>>
>> Stephen
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> We can send a 'cancel query', how about a 'report on query' which returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?

This would be a new protocol message; psql doesn't ever send any actual process signals to the backend processes... Or at least, that's how I was thinking it would be implemented, in an ideal world. It's possible we could have some backend helper function which a user could call on another connection to send a signal to the first, after figuring out the pid, blah, blah. Of course, I haven't gone and looked at how cancel query really works under the hood, so I have no idea if this is at all possible. :)

Thanks,

Stephen
On 11 January 2013 16:31, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>>> An even better feature would be to be able to send a signal to a running query to log its currently executing plan. That way you can ask "Why so slow?" before deciding to kill it.
>>>
>>> That could conceivably work. At least it wouldn't require running EXPLAIN in a failed transaction.
>>
>> I like this idea, in general, also. Taking that to the next level would be figuring out how you can do the same kind of thing through an interactive psql session where the user running the query doesn't need access to the database server or PG log files...
>
> That is simple - it can be printed via elog(WARNING, ...) to the original console.
>
>> We can send a 'cancel query', how about a 'report on query' which returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?

We already overload the signals, so it's just a new type for the signal handler to cope with. See procsignal_sigusr1_handler().

If we do it this way we can have time-based explain logging, so log anything that hits 60 minutes etc.

Stephen's idea of an additional protocol message to support this is fine, but I wouldn't want to limit this capability so it can only be invoked from the client. I'd like a sysadmin to be able to enquire about other sessions.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
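For context, a rough sketch of what "a new type for the signal handler to cope with" would involve, using the existing ProcSignal machinery. The PROCSIG_EXPLAIN_CURRENT_QUERY reason, the pg_explain_backend() function and the dynamic_explain_pending flag are hypothetical placeholders; SendProcSignal(), CheckProcSignal(), procsignal_sigusr1_handler() and InterruptPending are the real pieces being extended:

    /* sketch of a core change; PROCSIG_EXPLAIN_CURRENT_QUERY is hypothetical */

    /* in storage/procsignal.h: one new member of the ProcSignalReason enum */
    PROCSIG_EXPLAIN_CURRENT_QUERY,

    /* in storage/procsignal.c, inside procsignal_sigusr1_handler(): */
    if (CheckProcSignal(PROCSIG_EXPLAIN_CURRENT_QUERY))
    {
        /*
         * A signal handler may only set flags; the real work has to wait
         * for the next CHECK_FOR_INTERRUPTS() in the signalled backend.
         */
        dynamic_explain_pending = true;
        InterruptPending = true;
    }

    /* a SQL-callable sender, analogous to pg_cancel_backend() */
    Datum
    pg_explain_backend(PG_FUNCTION_ARGS)
    {
        int         pid = PG_GETARG_INT32(0);

        /*
         * A real version would need the same permission checks that
         * pg_cancel_backend()/pg_terminate_backend() perform.
         */
        if (SendProcSignal(pid, PROCSIG_EXPLAIN_CURRENT_QUERY,
                           InvalidBackendId) == 0)
            PG_RETURN_BOOL(true);

        PG_RETURN_BOOL(false);
    }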
On 11 January 2013 16:52, Simon Riggs <simon@2ndquadrant.com> wrote:
> We already overload the signals, so it's just a new type for the signal handler to cope with.
>
> See procsignal_sigusr1_handler()

I roughed up something to help you here... this is like 50% of a patch.

pg_explain_backend() calls a SIGUSR1 variant which then allows a call to RunDynamicExplain() during any call to CHECK_FOR_INTERRUPTS(). That only works when something in the executor has called SetDynamicExplain(), which later needs to be Unset...

So that's all you need to invoke a dynamic EXPLAIN via a sysadmin function. All you need to do is generate an EXPLAIN and dump it somewhere useful, like the server log.

Over to you Pavel.

There's a patch somewhere by Greg Stark that generated a plan from a running server for progress bar stuff, but that was nearly 5 years ago. Still, the explain invocation stuff from that might be useful as a guide.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
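The attached patch is not reproduced in the archive, so the following is only a guess at what the executor-side half might look like, reusing the function names from Simon's description (SetDynamicExplain, UnsetDynamicExplain, RunDynamicExplain); the bodies are illustrative, not Simon's code:

    /* sketch only - the real attachment is not shown here */
    #include "postgres.h"
    #include "commands/explain.h"
    #include "executor/execdesc.h"

    /* the flag set by the hypothetical PROCSIG_EXPLAIN_CURRENT_QUERY branch above */
    volatile sig_atomic_t dynamic_explain_pending = false;

    static QueryDesc *dynexp_queryDesc = NULL;

    /* the executor calls this when a query starts, and Unset when it ends */
    void
    SetDynamicExplain(QueryDesc *queryDesc)
    {
        dynexp_queryDesc = queryDesc;
    }

    void
    UnsetDynamicExplain(void)
    {
        dynexp_queryDesc = NULL;
    }

    /* reached from CHECK_FOR_INTERRUPTS() once pg_explain_backend() has signalled us */
    void
    RunDynamicExplain(void)
    {
        ExplainState *es;

        if (!dynamic_explain_pending || dynexp_queryDesc == NULL)
            return;
        dynamic_explain_pending = false;

        es = NewExplainState();
        ExplainBeginOutput(es);
        ExplainQueryText(es, dynexp_queryDesc);
        ExplainPrintPlan(es, dynexp_queryDesc);
        ExplainEndOutput(es);

        ereport(LOG,
                (errmsg("plan of currently running query:\n%s", es->str->data),
                 errhidestmt(true)));
    }

One open question with doing this at an arbitrary CHECK_FOR_INTERRUPTS() call is the same one Tom raised upthread: printing the plan needs catalog lookups, so it is only safe at interrupt points where the transaction is still in a good state.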
Pavel Stehule <pavel.stehule@gmail.com> writes:
> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.

Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
2013/1/13 Dimitri Fontaine <dimitri@2ndquadrant.fr>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.
>
> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

We have a limit of 5 seconds - but we have to kill queries after 20 minutes - and we have no plan for those queries.

Regards

Pavel

> Regards,
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different purpose - for example, we have a limit of 20 minutes for almost all queries, and after this limit we kill the queries. But we need to know a little bit more about these bad queries - and we hope the execution plan can give us this additional info. We have the same motivation as people who use auto_explain for slow queries - but we cannot wait for the query to complete.

> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

That would only help if he were willing to wait for the long-running command to be done (instead of canceling it). It's not hard to think of commands that will still be running after the heat death of the universe.

        regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?
>
> That would only help if he were willing to wait for the long-running command to be done (instead of canceling it). It's not hard to think of commands that will still be running after the heat death of the universe.

Oh. Brain fart from me. Sorry.

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support