Thread: ToDo: log plans of cancelled queries

ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
Hello

We use auto_explain successfully. What we miss is support for
cancelled queries - we need the same info for queries that we cancel
after x minutes.

A subtask of this feature could be dumping the currently executing
query, with its plan and complete query string, to the log after
receiving some signal - maybe SIGTRAP.

What do you think about this feature?

Regards

Pavel Stehule



Re: ToDo: log plans of cancelled queries

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> What do you think about this feature?

The idea of expecting an add-on module to execute operations in an
already-failed transaction seems pretty dubious to me.  I also think
it's not a great idea to add partial executions into a query's stats.
For instance, suppose query X has been done 3 times and took a minute
each time.  The fourth time, the user cancels it after one second.
If we now report that the query's average execution time is 45 seconds
or so, that seems pretty misleading to me.
        regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> What do you think about this feature?
>
> The idea of expecting an add-on module to execute operations in an
> already-failed transaction seems pretty dubious to me.  I also think
> it's not a great idea to add partial executions into a query's stats.
> For instance, suppose query X has been done 3 times and took a minute
> each time.  The fourth time, the user cancels it after one second.
> If we now report that the query's average execution time is 45 seconds
> or so, that seems pretty misleading to me.

I don't propose logging the query time for cancelled queries -
although we have a patch that does that, it is our specific issue.

My proposal has a different dimension and purpose - for
example, we have a limit of 20 minutes for almost all queries, and
after this limit we kill the queries. But we need to know a little
bit more about these bad queries - and we hope the execution plan can
give this additional info. We have the same motivation as people who
use auto_explain for slow queries - but we can't wait for the query to complete.

Regards

Pavel

>
>                         regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Stephen Frost
Date:
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> My proposal has a different dimension and purpose - for
> example, we have a limit of 20 minutes for almost all queries, and
> after this limit we kill the queries. But we need to know a little
> bit more about these bad queries - and we hope the execution plan can
> give this additional info. We have the same motivation as people who
> use auto_explain for slow queries - but we can't wait for the query to complete.

Why not an option to auto_explain (or whatever) to log an execution plan
right before actually executing it?  If that was something which could
be set with a GUC or similar, you could just do that before running
whatever queries you're interested in capturing the plans for.
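
For illustration, a minimal sketch of what such an option might look
like as an auto_explain-style module. The
auto_explain.log_before_execution GUC is hypothetical - no such
setting exists today - but the hook and explain calls follow the
usual contrib pattern:

#include "postgres.h"

#include "commands/explain.h"
#include "executor/executor.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

static bool log_before_execution = false;   /* hypothetical GUC */
static ExecutorStart_hook_type prev_ExecutorStart = NULL;

void _PG_init(void);

static void
preexplain_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    /* let the planstate tree be built first */
    if (prev_ExecutorStart)
        prev_ExecutorStart(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);

    if (log_before_execution)
    {
        ExplainState es;

        ExplainInitState(&es);
        es.format = EXPLAIN_FORMAT_TEXT;

        ExplainBeginOutput(&es);
        ExplainQueryText(&es, queryDesc);
        ExplainPrintPlan(&es, queryDesc);
        ExplainEndOutput(&es);

        /* the plan is in the log even if the query is cancelled later */
        ereport(LOG,
                (errmsg("plan before execution:\n%s", es.str->data),
                 errhidestmt(true)));
        pfree(es.str->data);
    }
}

void
_PG_init(void)
{
    DefineCustomBoolVariable("auto_explain.log_before_execution",
                             "Log the plan before the query starts executing.",
                             NULL,
                             &log_before_execution,
                             false,
                             PGC_SUSET,
                             0,
                             NULL, NULL, NULL);

    prev_ExecutorStart = ExecutorStart_hook;
    ExecutorStart_hook = preexplain_ExecutorStart;
}

Being a GUC, it could then be switched on per session, only around
the queries whose plans are worth capturing.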
Thanks,
    Stephen

Re: ToDo: log plans of cancelled queries

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> My proposal has a different dimension and purpose - for
> example, we have a limit of 20 minutes for almost all queries, and
> after this limit we kill the queries. But we need to know a little
> bit more about these bad queries - and we hope the execution plan can
> give this additional info. We have the same motivation as people who
> use auto_explain for slow queries - but we can't wait for the query to complete.

Oh, sorry, not enough caffeine yet --- somehow I was thinking about
pg_stat_statements not auto_explain.

However, auto_explain is even worse on the other problem.  You flat out
cannot do catalog lookups in a failed transaction, but there's no way to
print a decompiled plan without such lookups.  So it won't work.  (It
would also be appropriate to be suspicious of whether the executor's
plan state tree is even fully set up at the time the error is thrown...)
        regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> My proposal has a different dimension and purpose - for
>> example, we have a limit of 20 minutes for almost all queries, and
>> after this limit we kill the queries. But we need to know a little
>> bit more about these bad queries - and we hope the execution plan can
>> give this additional info. We have the same motivation as people who
>> use auto_explain for slow queries - but we can't wait for the query to complete.
>
> Why not an option to auto_explain (or whatever) to log an execution plan
> right before actually executing it?  If that was something which could
> be set with a GUC or similar, you could just do that before running
> whatever queries you're interested in capturing the plans for.

For our OLAP usage it is probably possible, but it could be slow for OLTP usage.

Regards

Pavel

>
>         Thanks,
>
>                 Stephen



Re: ToDo: log plans of cancelled queries

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
> > My proposal has a different dimension and purpose - for
> > example, we have a limit of 20 minutes for almost all queries, and
> > after this limit we kill the queries. But we need to know a little
> > bit more about these bad queries - and we hope the execution plan can
> > give this additional info. We have the same motivation as people who
> > use auto_explain for slow queries - but we can't wait for the query to complete.
>
> Oh, sorry, not enough caffeine yet --- somehow I was thinking about
> pg_stat_statements not auto_explain.
>
> However, auto_explain is even worse on the other problem.  You flat out
> cannot do catalog lookups in a failed transaction, but there's no way to
> print a decompiled plan without such lookups.  So it won't work.  (It
> would also be appropriate to be suspicious of whether the executor's
> plan state tree is even fully set up at the time the error is thrown...)

Maybe it'd work to save the query source text and parameter values
somewhere and log an explain in a different session.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different dimension and purpose - for
>> example, we have a limit of 20 minutes for almost all queries, and
>> after this limit we kill the queries. But we need to know a little
>> bit more about these bad queries - and we hope the execution plan can
>> give this additional info. We have the same motivation as people who
>> use auto_explain for slow queries - but we can't wait for the query to complete.
>
> Oh, sorry, not enough caffeine yet --- somehow I was thinking about
> pg_stat_statements not auto_explain.
>
> However, auto_explain is even worse on the other problem.  You flat out
> cannot do catalog lookups in a failed transaction, but there's no way to
> print a decompiled plan without such lookups.  So it won't work.  (It
> would also be appropriate to be suspicious of whether the executor's
> plan state tree is even fully set up at the time the error is thrown...)

Yes, that is so - but I have a few ideas:

1) Using a signal handler - we don't use SIGTRAP anywhere, so we could
use SIGTRAP (for example) - and inside the signal handler we can
arrange a dump of the plan. It has one advantage - we can get the
plan, and maybe more, without cancelling the query - somebody could
have a significantly higher limit than ours but still want to know the plan.

2) Creating a hook called before the query is really cancelled - it
could be called before the exception is raised, so at that moment we
can materialise the plan without problems. This hook could be used by
auto_explain, because we don't want to print plans for queries that
were cancelled faster than the auto_explain limit. Maybe this hook
could be called before raising other kinds of exceptions too - temp
file limits, for instance. We are not interested in queries that
finished with other errors, because those errors usually come quickly.

From my perspective some integration with auto_explain would be nice;
I would rather not set limits in more places.
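
A rough sketch of the hook from (2), purely to show the shape of the
idea - pre_cancel_query_hook and this simplified cancel path are
hypothetical, not existing PostgreSQL code:

#include "postgres.h"

#include "miscadmin.h"

/* hypothetical hook, installable by a module such as auto_explain */
typedef void (*pre_cancel_query_hook_type) (void);
pre_cancel_query_hook_type pre_cancel_query_hook = NULL;

/* simplified version of the cancel path in tcop/postgres.c */
static void
HandleQueryCancel(void)
{
    if (QueryCancelPending)
    {
        QueryCancelPending = false;

        /*
         * Call the hook before raising the error, while the transaction
         * is still good and the executor state is intact, so the module
         * can still decompile and log the plan.
         */
        if (pre_cancel_query_hook)
            pre_cancel_query_hook();

        ereport(ERROR,
                (errcode(ERRCODE_QUERY_CANCELED),
                 errmsg("canceling statement due to user request")));
    }
}

auto_explain could install the hook from _PG_init() and print the plan
the same way it already does for completed queries.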

Regards

Pavel


>
>                         regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Simon Riggs
Date:
On 11 January 2013 15:54, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Tom Lane escribió:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> > My proposal has a different dimension and purpose - for
>> > example, we have a limit of 20 minutes for almost all queries, and
>> > after this limit we kill the queries. But we need to know a little
>> > bit more about these bad queries - and we hope the execution plan can
>> > give this additional info. We have the same motivation as people who
>> > use auto_explain for slow queries - but we can't wait for the query to complete.
>>
>> Oh, sorry, not enough caffeine yet --- somehow I was thinking about
>> pg_stat_statements not auto_explain.
>>
>> However, auto_explain is even worse on the other problem.  You flat out
>> cannot do catalog lookups in a failed transaction, but there's no way to
>> print a decompiled plan without such lookups.  So it won't work.  (It
>> would also be appropriate to be suspicious of whether the executor's
>> plan state tree is even fully set up at the time the error is thrown...)
>
> Maybe it'd work to save the query source text and parameter values
> somewhere and log an explain in a different session.

I think this would be an important feature.

But then I also want to be able to kill a query without it doing 50
pushups and a backflip before it dies, since that will inevitably go
wrong.

Perhaps we can have a new signal that means "exit gracefully, with
info if requested". That way we can keep "kill" to mean "kill".

An even better feature would be to be able to send a signal to a
running query to log its currently executing plan. That way you can
ask "Why so slow?" before deciding to kill it. That way we don't need
to overload the kill signal at all. That's the most useful part of a
"progress indicator" tool without the complexity.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: ToDo: log plans of cancelled queries

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane escribió:
>> However, auto_explain is even worse on the other problem.  You flat out
>> cannot do catalog lookups in a failed transaction, but there's no way to
>> print a decompiled plan without such lookups.  So it won't work.  (It
>> would also be appropriate to be suspicious of whether the executor's
>> plan state tree is even fully set up at the time the error is thrown...)

> Maybe it'd work to save the query source text and parameter values
> somewhere and log an explain in a different session.

There wouldn't be a lot of certainty that you got the same plan.

AFAICS the only thing you could do is what Stephen suggested: run
EXPLAIN *before* starting the query.  You could stash the text somewhere
and only print it on failure, which would prevent useless log bloat.
But it'd still be awfully expensive.
        regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Stephen Frost
Date:
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
> > Why not an option to auto_explain (or whatever) to log an execution plan
> > right before actually executing it?  If that was something which could
> > be set with a GUC or similar, you could just do that before running
> > whatever queries you're interested in capturing the plans for.
>
> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.

If it was a GUC, you could turn it on/off at appropriate places in the
OLTP scenario.
Thanks,
    Stephen

Re: ToDo: log plans of cancelled queries

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> An even better feature would be to be able to send a signal to a
> running query to log its currently executing plan. That way you can
> ask "Why so slow?" before deciding to kill it.

That could conceivably work.  At least it wouldn't require running
EXPLAIN in a failed transaction.
        regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> > Why not an option to auto_explain (or whatever) to log an execution plan
>> > right before actually executing it?  If that was something which could
>> > be set with a GUC or similar, you could just do that before running
>> > whatever queries you're interested in capturing the plans for.
>>
>> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.
>
> If it was a GUC, you could turn it on/off at appropriate places in the
> OLTP scenario.

Yes, that is a possible solution.

Pavel

>
>         Thanks,
>
>                 Stephen



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Pavel Stehule (pavel.stehule@gmail.com) wrote:
>> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> > Why not an option to auto_explain (or whatever) to log an execution plan
>> > right before actually executing it?  If that was something which could
>> > be set with a GUC or similar, you could just do that before running
>> > whatever queries you're interested in capturing the plans for.
>>
>> For our OLAP usage it is probably possible, but it could be slow for OLTP usage.
>
> If it was a GUC, you could turn it on/off at appropriate places in the
> OLTP scenario.

But still - if this is an auto_explain feature, then we need to return
execution to auto_explain after the query is cancelled - and that is
probably impossible now.

Regards

Pavel

>
>         Thanks,
>
>                 Stephen



Re: ToDo: log plans of cancelled queries

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > An even better feature would be to be able to send a signal to a
> > running query to log its currently executing plan. That way you can
> > ask "Why so slow?" before deciding to kill it.
>
> That could conceivably work.  At least it wouldn't require running
> EXPLAIN in a failed transaction.

I like this idea, in general, also.  Taking that to the next level would
be figuring out how you can do the same kind of thing through an
interactive psql session where the user running the query doesn't need
access to the database server or PG log files...

We can send a 'cancel query', how about a 'report on query' which
returns the plan and perhaps whatever other stats are easily available?
Thanks,
    Stephen

Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Stephen Frost <sfrost@snowman.net>:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>> > An even better feature would be to be able to send a signal to a
>> > running query to log its currently executing plan. That way you can
>> > ask "Why so slow?" before deciding to kill it.
>>
>> That could conceivably work.  At least it wouldn't require running
>> EXPLAIN in a failed transaction.
>
> I like this idea, in general, also.  Taking that to the next level would
> be figuring out how you can do the same kind of thing through an
> interactive psql session where the user running the query doesn't need
> access to the database server or PG log files...
>

That is simple - it can be printed via elog(WARNING, ...) to the original console.

> We can send a 'cancel query', how about a 'report on query' which
> returns the plan and perhaps whatever other stats are easily available?

There is only one question - which POSIX signal can we use?

Pavel

>
>         Thanks,
>
>                 Stephen



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/11 Pavel Stehule <pavel.stehule@gmail.com>:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> > An even better feature would be to be able to send a signal to a
>>> > running query to log its currently executing plan. That way you can
>>> > ask "Why so slow?" before deciding to kill it.
>>>
>>> That could conceivably work.  At least it wouldn't require running
>>> EXPLAIN in a failed transaction.
>>
>> I like this idea, in general, also.  Taking that to the next level would
>> be figuring out how you can do the same kind of thing through an
>> interactive psql session where the user running the query doesn't need
>> access to the database server or PG log files...
>>
>
> That is simple - it can be printed via elog(WARNING, ...) to the original console.

Theoretically we could show the current state via an EXPLAIN ANALYSE result.

>
>> We can send a 'cancel query', how about a 'report on query' which
>> returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?
>
> Pavel
>
>>
>>         Thanks,
>>
>>                 Stephen



Re: ToDo: log plans of cancelled queries

From
Stephen Frost
Date:
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
> > We can send a 'cancel query', how about a 'report on query' which
> > returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?

This would be a new protocol message, psql doesn't ever send any actual
process signals to the backend processes...

Or at least, that's how I was thinking it would be implemented, in an
ideal world.  It's possible we could have some backend helper function
which a user could call on another connection to send a signal to the
first, after figuring out the pid, blah, blah.

Of course, I haven't gone and looked at how cancel query really works
under the hood, so I have no idea if this is at all possible. :)
Thanks,
    Stephen

Re: ToDo: log plans of cancelled queries

From
Simon Riggs
Date:
On 11 January 2013 16:31, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/1/11 Stephen Frost <sfrost@snowman.net>:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> > An even better feature would be to be able to send a signal to a
>>> > running query to log its currently executing plan. That way you can
>>> > ask "Why so slow?" before deciding to kill it.
>>>
>>> That could conceivably work.  At least it wouldn't require running
>>> EXPLAIN in a failed transaction.
>>
>> I like this idea, in general, also.  Taking that to the next level would
>> be figuring out how you can do the same kind of thing through an
>> interactive psql session where the user running the query doesn't need
>> access to the database server or PG log files...
>>
>
> That is simple - it can be printed via elog(WARNING, ...) to the original console.
>
>> We can send a 'cancel query', how about a 'report on query' which
>> returns the plan and perhaps whatever other stats are easily available?
>
> There is only one question - which POSIX signal can we use?

We already overload the signals, so it's just a new type for the signal
handler to cope with.

See procsignal_sigusr1_handler()

If we do it this way we can have time-based explain logging, so we can
log anything that hits 60 minutes, etc.
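
For illustration, a sketch of how such a new reason could be wired in.
PROCSIG_EXPLAIN_BACKEND, ExplainPending and HandleExplainInterrupt are
hypothetical names, not existing code:

#include "postgres.h"

#include <signal.h>

#include "miscadmin.h"
#include "storage/procsignal.h"

/*
 * Hypothetical additions: a PROCSIG_EXPLAIN_BACKEND member in the
 * ProcSignalReason enum in storage/procsignal.h, plus this test in
 * procsignal_sigusr1_handler():
 *
 *     if (CheckProcSignal(PROCSIG_EXPLAIN_BACKEND))
 *         HandleExplainInterrupt();
 */
volatile sig_atomic_t ExplainPending = false;

void
HandleExplainInterrupt(void)
{
    /*
     * Do nothing heavy inside the signal handler: just set flags and
     * let the mainline code act at its next CHECK_FOR_INTERRUPTS().
     */
    ExplainPending = true;
    InterruptPending = true;
}

Since the handler only sets flags, the actual EXPLAIN work happens in
the mainline at a safe point, which also fits the time-based logging
idea above.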

Stephen's idea of an additional protocol message to support this is
fine, but I wouldn't want to limit this capability so it can only be
invoked from the client. I'd like a sysadmin to be able to enquire
about other sessions.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: ToDo: log plans of cancelled queries

From
Simon Riggs
Date:
On 11 January 2013 16:52, Simon Riggs <simon@2ndquadrant.com> wrote:

> We already overload the signals, so it's just a new type for the signal
> handler to cope with.
>
> See procsignal_sigusr1_handler()

I roughed up something to help you here... this is like 50% of a patch.

pg_explain_backend() calls a SIGUSR1 variant which then allows a call
to RunDynamicExplain() during any call to CHECK_FOR_INTERRUPTS()

That only works when something in the executor has called
SetDynamicExplain(), which later needs to be Unset...

So that's all you need to invoke a dynamic EXPLAIN via a sysadmin
function. All you need to do is generate an EXPLAIN and dump it
somewhere useful, like the server log. Over to you, Pavel. There's a
patch somewhere by Greg Stark that generated a plan from a running
server for progress-bar stuff, but that was nearly 5 years ago.
The explain invocation stuff from that might be useful as a guide.
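
The attachment itself is not reproduced here, but a sketch of how the
pieces described above might fit together could look like this - the
function names come from the description, all bodies are guesses, and
PROCSIG_EXPLAIN_BACKEND is the hypothetical signal reason sketched
earlier in the thread:

#include "postgres.h"

#include "commands/explain.h"
#include "fmgr.h"
#include "storage/backendid.h"
#include "storage/procsignal.h"

static QueryDesc *dynamic_explain_qd = NULL;

/* the executor brackets query execution with these, per the description */
void
SetDynamicExplain(QueryDesc *queryDesc)
{
    dynamic_explain_qd = queryDesc;
}

void
UnsetDynamicExplain(void)
{
    dynamic_explain_qd = NULL;
}

/* reached from CHECK_FOR_INTERRUPTS() once the procsignal has arrived */
void
RunDynamicExplain(void)
{
    ExplainState es;

    if (dynamic_explain_qd == NULL)
        return;                 /* nothing executing right now */

    ExplainInitState(&es);
    es.format = EXPLAIN_FORMAT_TEXT;

    ExplainBeginOutput(&es);
    ExplainQueryText(&es, dynamic_explain_qd);
    ExplainPrintPlan(&es, dynamic_explain_qd);
    ExplainEndOutput(&es);

    ereport(LOG,
            (errmsg("plan of running query:\n%s", es.str->data),
             errhidestmt(true)));
    pfree(es.str->data);
}

PG_FUNCTION_INFO_V1(pg_explain_backend);

/* SQL-callable sysadmin function: sends the multiplexed signal */
Datum
pg_explain_backend(PG_FUNCTION_ARGS)
{
    int         pid = PG_GETARG_INT32(0);

    PG_RETURN_BOOL(SendProcSignal(pid, PROCSIG_EXPLAIN_BACKEND,
                                  InvalidBackendId) == 0);
}

Note that Tom's caveat upthread still applies: the explain can only
run safely at a CHECK_FOR_INTERRUPTS() reached while the transaction
is still in a good state, since decompiling the plan needs catalog
lookups.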

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: ToDo: log plans of cancelled queries

From
Dimitri Fontaine
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> My proposal has a different dimension and purpose - for
> example, we have a limit of 20 minutes for almost all queries, and
> after this limit we kill the queries. But we need to know a little
> bit more about these bad queries - and we hope the execution plan can
> give this additional info. We have the same motivation as people who
> use auto_explain for slow queries - but we can't wait for the query to complete.

Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: ToDo: log plans of cancelled queries

From
Pavel Stehule
Date:
2013/1/13 Dimitri Fontaine <dimitri@2ndquadrant.fr>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different dimension and purpose - for
>> example, we have a limit of 20 minutes for almost all queries, and
>> after this limit we kill the queries. But we need to know a little
>> bit more about these bad queries - and we hope the execution plan can
>> give this additional info. We have the same motivation as people who
>> use auto_explain for slow queries - but we can't wait for the query to complete.
>
> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

We have the limit at 5 seconds - but we have to kill queries after 20
minutes - and we have no plan for those queries.

Regards

Pavel


>
> Regards,
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: ToDo: log plans of cancelled queries

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> My proposal has a different dimension and purpose - for
>> example, we have a limit of 20 minutes for almost all queries, and
>> after this limit we kill the queries. But we need to know a little
>> bit more about these bad queries - and we hope the execution plan can
>> give this additional info. We have the same motivation as people who
>> use auto_explain for slow queries - but we can't wait for the query to complete.

> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?

That would only help if he were willing to wait for the long-running
command to be done (instead of canceling it).  It's not hard to think
of commands that will still be running after the heat death of the
universe.
        regards, tom lane



Re: ToDo: log plans of cancelled queries

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Set auto_explain.log_min_duration to 19 mins or maybe 17 and be done?
>
> That would only help if he were willing to wait for the long-running
> command to be done (instead of canceling it).  It's not hard to think
> of commands that will still be running after the heat death of the
> universe.

Oh. Brain fart from me. Sorry.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support