Thread: making EXPLAIN extensible

making EXPLAIN extensible

From
Robert Haas
Date:
Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
ANALYZE. Now, we're up to 12 options, which is already quite a lot,
and there's plenty more things that somebody might like to do.
However, not all of those things necessarily need to be part of the
core code. My original reason for wanting to extend EXPLAIN was that I
was thinking about an extension that would want to do a bunch of
things and one of those things would be to add some information to the
EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

However, there are other use cases, too, basically any of the normal
reasons why extensibility is useful and desirable. You might need to
get some information out a query plan that 99% of people don't care
about. You could come up with your own way of formatting a query plan,
but that's a big pain. It's a lot nicer if you can just add the detail
that you care about to the EXPLAIN output without needing to modify
PostgreSQL itself. Even if you think of something that really ought to
be included in the EXPLAIN output by PostgreSQL, you can roll an
extension out much quicker than you can get a change upstreamed and
released. So I think EXPLAIN extensibility is, as a general concept,
useful.

So here are some patches.

0001 allows a loadable module to register new EXPLAIN options.
Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
work, this patch is for you. This patch also allows you to stash some
state related to your new option, or options, in the ExplainState.
Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
es->fungus, but you can get about the same effect using the new
facilities provided here.

0002 provides hooks that you can use to make your new EXPLAIN options
actually do something. In particular, this adds a new hook that is
called once per PlanState node, and a new nook that is called once per
PlannedStmt. Each is called at an appropriate point for you to tack on
more output after what EXPLAIN would already produce.

0003 adds a new contrib module called pg_overexplain, which adds
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
quite useful for planner hacking, and maybe a few more options would
be, too. Right now, if you want to see stuff that EXPLAIN doesn't
clearly show, you have to use SET debug_print_plan = true, and that
output is so verbose that finding the parts you actually want to see
is quite difficult. Assuming it gives you the details you need,
EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
committing these patches I anticipate using this semi-regularly.

There are plenty of debatable things in this patch set, and I mention
some of them in the commit messages. The hook design in 0002 is a bit
simplistic and could be made more complex; there's lots of stuff that
could be added to or removed from 0003, much of which comes down to
what somebody hacking on the planner would actually want to see. I'm
happy to bikeshed all of that stuff; this is all quite preliminary and
I'm not committed to the details. The only thing that would disappoint
me is if somebody said "this whole idea of making EXPLAIN extensible
is stupid and pointless and we shouldn't ever do it." I will argue
against that vociferously. I think even what I have here is enough to
disprove that hypothesis, but I have a bunch of ideas about how to do
more. Some of those require additional infrastructure and are best
proposed with that other infrastructure; some can be done with just
this, but I ran out of time to code up examples so here is what I have
got so far.

Hope you like it, sorry if you don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: making EXPLAIN extensible

From
Thom Brown
Date:
On Fri, 28 Feb 2025 at 19:26, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
> ANALYZE. Now, we're up to 12 options, which is already quite a lot,
> and there's plenty more things that somebody might like to do.
> However, not all of those things necessarily need to be part of the
> core code. My original reason for wanting to extend EXPLAIN was that I
> was thinking about an extension that would want to do a bunch of
> things and one of those things would be to add some information to the
> EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
> option whose whole purpose is to cater to the needs of some extension,
> so that made me think of providing some extensibility infrastructure.
>
> However, there are other use cases, too, basically any of the normal
> reasons why extensibility is useful and desirable. You might need to
> get some information out a query plan that 99% of people don't care
> about. You could come up with your own way of formatting a query plan,
> but that's a big pain. It's a lot nicer if you can just add the detail
> that you care about to the EXPLAIN output without needing to modify
> PostgreSQL itself. Even if you think of something that really ought to
> be included in the EXPLAIN output by PostgreSQL, you can roll an
> extension out much quicker than you can get a change upstreamed and
> released. So I think EXPLAIN extensibility is, as a general concept,
> useful.
>
> So here are some patches.
>
> 0001 allows a loadable module to register new EXPLAIN options.
> Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
> work, this patch is for you. This patch also allows you to stash some
> state related to your new option, or options, in the ExplainState.
> Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
> sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
> es->fungus, but you can get about the same effect using the new
> facilities provided here.
>
> 0002 provides hooks that you can use to make your new EXPLAIN options
> actually do something. In particular, this adds a new hook that is
> called once per PlanState node, and a new nook that is called once per
> PlannedStmt. Each is called at an appropriate point for you to tack on
> more output after what EXPLAIN would already produce.
>
> 0003 adds a new contrib module called pg_overexplain, which adds
> EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
> quite useful for planner hacking, and maybe a few more options would
> be, too. Right now, if you want to see stuff that EXPLAIN doesn't
> clearly show, you have to use SET debug_print_plan = true, and that
> output is so verbose that finding the parts you actually want to see
> is quite difficult. Assuming it gives you the details you need,
> EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
> committing these patches I anticipate using this semi-regularly.
>
> There are plenty of debatable things in this patch set, and I mention
> some of them in the commit messages. The hook design in 0002 is a bit
> simplistic and could be made more complex; there's lots of stuff that
> could be added to or removed from 0003, much of which comes down to
> what somebody hacking on the planner would actually want to see. I'm
> happy to bikeshed all of that stuff; this is all quite preliminary and
> I'm not committed to the details. The only thing that would disappoint
> me is if somebody said "this whole idea of making EXPLAIN extensible
> is stupid and pointless and we shouldn't ever do it." I will argue
> against that vociferously. I think even what I have here is enough to
> disprove that hypothesis, but I have a bunch of ideas about how to do
> more. Some of those require additional infrastructure and are best
> proposed with that other infrastructure; some can be done with just
> this, but I ran out of time to code up examples so here is what I have
> got so far.
>
> Hope you like it, sorry if you don't.

"pg_overexplain"? I love this name! And the idea sounds like a natural
evolution, so +1.

Some questions:

One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

In what order would the options be applied? Would it be deterministic,
or weighted within the extension's configuration, or based on the
order of the options in the list?

Would explain extensions be capable of modifying pre-existing core
option output, or just append to output?

Should there be a way of determining which lines are output by which
option? An extension may output similar output to core output, making
it difficult or impossible to discern which is which.

Does there need to be any security considerations so that things like
RLS don't inadvertently become leaky?

Thom



Re: making EXPLAIN extensible

From
Isaac Morland
Date:
On Fri, 28 Feb 2025 at 15:09, Thom Brown <thom@linux.com> wrote:
 
One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

Could the call that processes the registration automatically prepend the extension name to the supplied explain option name? So if extension X registers option O it would be registered as X_O rather than returning an error if O doesn't follow the proper pattern.

Re: making EXPLAIN extensible

From
Robert Haas
Date:
On Fri, Feb 28, 2025 at 3:09 PM Thom Brown <thom@linux.com> wrote:
> "pg_overexplain"? I love this name! And the idea sounds like a natural
> evolution, so +1.

Thanks. I thought about things like pg_hyperexplain or
pg_explain_debug, but in the end I didn't like any of them better than
overexplain. :-)

> One thing I am wondering is whether extensions should be required to
> prefix their EXPLAIN option with the extension name to avoid
> collisions.

I considered that. One advantage of doing that is that you could
support autoloading. Right now, you have to LOAD 'pg_overexplain' or
put it in session_preload_libraries or shared_preload_libraries in
order to use it. If you required people to type EXPLAIN
(pg_overexplain.range_table) instead of just EXPLAIN (range_table),
then you could react to not finding any such option by trying to
autoload a .so with the part of the name before the dot.

But you can probably see that this idea has a couple of pretty serious
weaknesses:

1. It is much more verbose. I theorize that people will be unhappy
about having to type EXPLAIN (pg_overexplain.range_table) rather than
just EXPLAIN (range_table). One could try to address this by renaming
the extension to something shorter, like just 'oe'. Having to type
EXPLAIN (oe.range_table) wouldn't be nearly as annoying. However, this
seems like a pretty clear case of letting the tail wag the dog.

2. autoloading could have security concerns. This is probably fixable,
but we'd need to be sure that providing a new way to trigger loading a
module didn't open up any security holes.

> If two extensions happen to choose the same name, it won't be possible
> to use both simultaneously.

That's true. Of course, a lot depends on whether we end up with 3 or 5
or 8 EXPLAIN extensions or more like 30 or 50 or 80. In the former
case, the people writing those extensions will probably mostly know
about each other and can just use different names. In the latter case
it's a problem. My guess is it's the former case.

> In what order would the options be applied? Would it be deterministic,
> or weighted within the extension's configuration, or based on the
> order of the options in the list?

I'm not entirely sure I know which question you're asking here. If
you're asking what happens if two modules try to register the same
EXPLAIN option name and then a user uses it, one of the registrations
will win and the other will lose. I think the second one wins. As I
say above, I assume we'll find a way to not try to do that. However, I
think more likely you're asking: if you load pg_fingernailexplain and
pg_toenailexplain and then do EXPLAIN (toenail, fingernail) SELECT
..., in what order will the options take effect? For the answer to
that question, see the commit message for 0002.

> Would explain extensions be capable of modifying pre-existing core
> option output, or just append to output?

The interfaces we have are really only going to work for appending.
Modifying would be cool, but I think it's mostly impractical. We have
a framework for emitting stuff into EXPLAIN output in a way that takes
into account whether you're in text mode or json or yaml or whatever,
and this patch just builds on that existing framework to allow you to
make extra calls to those emit-some-output functions at useful places.
As a result, the patch is small and simple. If we had an existing
framework for modifying stuff, then we could perhaps provide suitable
places to call those functions, too. But they don't exist, and it's
not easy to see how they could be created. I think you would need some
kind of major redesign of explain.c, and I don't know how to do that
without making it bloated, slow, and unmaintainable.

If somebody comes up with a way of allowing certain limited types of
modifications to EXPLAIN output with small, elegant-looking code
changes, and if those changes seem like useful things for an extension
to want to do, I'm totally on board. But I currently don't have an
idea like that.

> Should there be a way of determining which lines are output by which
> option? An extension may output similar output to core output, making
> it difficult or impossible to discern which is which.

I don't think this is really going to be a problem.

> Does there need to be any security considerations so that things like
> RLS don't inadvertently become leaky?

It's possible that there may be some security considerations, and
that's worth thinking about. However, RLS disclaims support for
side-channel attacks, so it's already understood to be (very) leaky.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: making EXPLAIN extensible

From
Sami Imseih
Date:
> EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
> option whose whole purpose is to cater to the needs of some extension,
> so that made me think of providing some extensibility infrastructure.

Making EXPLAIN extensible sounds like a good idea.. FWIW, There is a
discussion [0]
for showing FDW remote plans ( postgres_fdw specifically), and I think
we  will need to
add some new options to EXPLAIN to make that possible.

Have not looked at your patches, but I will do so now.


Regards,

Sami Imseih
Amazon Web Services (AWS)


[0] https://www.postgresql.org/message-id/CAA5RZ0vXiOiodrNQ-Va4FCAkXMpGA%3DGZDeKjFBRgRvHGuW7s7Q%40mail.gmail.com