Re: Auto explain after query timeout - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Auto explain after query timeout
Date
Msg-id CA+TgmoYp8KRt=JqsZdfCki_z5xHwkjcpVRXfn9a34d6xiOS1yQ@mail.gmail.com
Whole thread Raw
In response to Re: Auto explain after query timeout  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Tue, Sep 20, 2022 at 5:08 PM James Coleman <jtc331@gmail.com> wrote:
> - A safe explain (e.g., disallow catalog access) that is potentially
> missing information.

This would be pretty useless I think, because you'd be missing all
relation names.

> - A safe way to interrupt queries such as "safe shutdown" of a node
> (e.g., a seq scan could stop returning tuples early) and allow a
> configurable buffer of time after the statement timeout before firing
> a hard abort of the query (and transaction).

This might be useful, but it seems extremely difficult to get working.
You'd not only have to design the safe shutdown mechanism itself, but
also find a way to safely engage it at the right times.

> Alternatively I wonder if it's possible (this would maybe assume no
> catalog changes in the current transaction -- or at least none that
> would be referenced by the current query) to open a new transaction
> (with the same horizon information) and duplicate the plan over to
> that transaction and run the explain there. This way you do it *after*
> the error is raised. That's some serious spit-balling -- I'm not
> saying that's doable, just trying to imagine how one might
> comprehensively address the concerns.

Doesn't work, because the new transaction's snapshot wouldn't be the
same as that of the old one. Imagine that you create a table and run a
query on it in the same transaction. Then you migrate the plan tree to
a new transaction and try to find out the table name. But in the new
transaction, that table doesn't exist: it was destroyed by the
previous rollback.

Honestly I have no very good ideas how to create the feature you want
here. I guess the only thing I can think of is to separate the EXPLAIN
process into two phases: a first phase that runs when the plan tree is
set up and gathers all of the information that we might need later,
like relation names, and then a second phase that runs later when you
want to generate the output and does nothing that can fail, or at
least no database: maybe it's allowed to allocate memory, for example.
But that sounds like a big and perhaps painful refactoring exercise,
and I can imagine that there might be reasons why it doesn't work out.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans
Next
From: Nathan Bossart
Date:
Subject: Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?