Re: Auto explain after query timeout - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Auto explain after query timeout
Date
Msg-id CA+TgmoYW_rSOW4JMQ9_0Df9PKQ=sQDOKUGA4Gc9D8w4wui8fSA@mail.gmail.com
Whole thread Raw
In response to Re: Auto explain after query timeout  (James Coleman <jtc331@gmail.com>)
Responses Re: Auto explain after query timeout
List pgsql-hackers
On Tue, Sep 20, 2022 at 2:35 PM James Coleman <jtc331@gmail.com> wrote:
> Either I'm missing something (and/or this was fixed in a later PG
> version), but I don't think this is how it works. We have this
> specific problem now: we set auto_explain.log_min_duration to 200 (ms)
> and statement_timeout set to 30s, but when a statement times out we do
> not get the plan logged with auto-explain.

I think you're correct. auto_explain uses the ExecutorEnd hook, but
that hook will not be fired in the case of an error. Indeed, if an
error has already been thrown, it would be unsafe to try to
auto-explain anything. For instance -- and this is just one problem of
probably many -- ExplainTargetRel() performs catalog lookups, which is
not OK in a failed transaction.

To make this work, I think you'd need a hook that fires just BEFORE
the error is thrown. However, previous attempts to introduce hooks
into ProcessInterrupts() have not met with a wam response from Tom, so
it might be a tough sell. And maybe for good reason. I see at least
two problems. The first is that explaining a query is a pretty
complicated operation that involves catalog lookups and lots of
complicated stuff, and I don't think that it would be safe to do all
of that at any arbitrary point in the code where ProcessInterrupts()
happened to fire. What if one of the syscache lookups misses the cache
and we have to open the underlying catalog? Then
AcceptInvalidationMessages() will fire, but we don't currently assume
that any old CHECK_FOR_INTERRUPTS() can process invalidations. What if
the running query has called a user-defined function or procedure
which is running DDL which is in the middle of changing catalog state
for some relation involved in the query at the moment that the
statement timeout arrives? I feel like there are big problems here.

The other problem I see is that ProcessInterrupts() is our mechanism
for allowing people to escape from queries that would otherwise run
forever by hitting ^C. But what if the explain code goes crazy and
itself needs to be interrupted, when we're already inside
ProcessInterrupts()? Maybe that would work out OK... or maybe it
wouldn't. I'm not really sure. But it's another reason to be very,
very cautious about putting complicated logic inside
ProcessInterrupts().

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: A question about wording in messages
Next
From: "Gurjeet"
Date:
Subject: Re: Auto explain after query timeout