Thread: Re: plan shape work

Re: plan shape work

From
Tomas Vondra
Date:
On 5/19/25 20:01, Robert Haas wrote:
> Hi,
> 
> A couple of people at pgconf.dev seemed to want to know more about my
> ongoing plan shape work, so here are the patches I have currently.
> This is a long way from something that actually looks like a usable
> feature, but these are bits of infrastructure that I think will be
> necessary to get to a usable feature. As a recap, my overall goal here
> is to make it so that you can examine a finished plan, figure out what
> decisions the planner made, and then somehow get the planner to make
> those same decisions over again in a future planning cycle. Since
> doing this for all types of planner decisions seems too difficult for
> an initial goal, I'm focusing on scans and joins for now. A further
> goal is that I want it to be possible for extensions to use this
> infrastructure to implement a variety of different policies that they
> might feel to be beneficial, so I'm looking to minimize the amount of
> stuff that has to be done in core PostgreSQL or can only be used by
> core PostgreSQL.
> 
> ...

Thanks for the overview. I don't have any immediate feedback, but it
sounds like it might be related to the "making planner decisions clear"
session from the unconference ...

The basic premise of that session was about how to give users better
info about the planner decisions - why paths were selected/rejected,
etc. A simple example would be "why was the index not used", and the
possible answers include "dominated by cost by another path" or "does
not match the index keys" etc.

I wonder if this work might be useful for something like that.


regards

-- 
Tomas Vondra




Re: plan shape work

From
Maciek Sakrejda
Date:
On Wed, May 21, 2025 at 7:29 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, May 20, 2025 at 3:09 PM Maciek Sakrejda <maciek@pganalyze.com> wrote:
> > +1, this seems like it could be very useful. A somewhat related issue
> > is being able to tie plan nodes back to the query text: it can be hard
> > to understand the planner's decisions if it's not even clear what part
> > of the query it's making decisions about. I'm sure this is not an easy
> > problem in general, but I wonder if you think that could be improved
> > in the course of this work, or if you have other thoughts about it.
>
> Thanks. I don't really have any ideas about the problem you mention,
> perhaps partly because I haven't experienced it too much.

That may be due to your extensive experience with Postgres and EXPLAIN plans.

> I mean, I
> have sometimes been confused about which parts of the query go with
> which parts of the EXPLAIN, but I think in my experience so far that
> is mostly because either (1) both the query and the EXPLAIN output are
> super long and maybe also super-wide and therefore it's hard to
> correlate things by eye or (2) somebody wrote a query where they use
> the same table and/or table alias over and over again in different
> parts of the query and so it's hard to tell which reference goes with
> which. Neither of those problems seems all that exciting to me from a
> dev perspective: if you're calling everything a or x or orders or
> something, maybe don't do that, and if your query is 1500 characters
> long, I guess you need to budget some time to align that with the
> query plan.

Fair enough, although the people trying to make sense of EXPLAIN plans
are sometimes not the same ones who are writing the queries. And
sometimes the queries are not written by people at all but by ORMs
(or—heaven help us—vibe coded). "Don't do X" is a reasonable response
to "It hurts when I do X," but it doesn't really solve the user's
problem. That said, it's hard to argue with "We don't have any good
ideas on how to improve this right now, and it's not a total dumpster
fire, so we'll focus on other work."

> I don't really know how much we can do here. But maybe
> there are cases that I haven't seen where something better is
> possible, or perhaps you have some good idea that I haven't
> considered.

No great ideas here. I thought initially that a good solution would be
to have structured EXPLAIN output include something like "Query Text
Start Index" and "Query Text End Index" fields for each node, but I
realized that this doesn't really work for multiple joins (and
probably other cases). Maybe "Query Text Indices", as a list of pairs?
But from the little I know about the planner, that seems like any sort
of tracking back to the source would be hard to implement. And it only
really solves the problem for external EXPLAIN viewers, and only ones
that put in the work to support this. I'm not sure if the problem can
be meaningfully addressed for text format, but maybe that's another
reason not to spend time on it in core.

> (If I'm honest, I do have an idea that I think might very
> significantly improve the readability of EXPLAIN output. I think it
> would make it much less wide in normal cases without making it much
> longer. This has been percolating in my brain for a few years now and
> I have the vague intention of proposing it at some point, but not
> until I'm good and ready to be flamed to a well-done crisp, because
> I'm quite sure there will be more than one opinion on the merits.)

I'm intrigued, and happy to stand by with an extinguisher. The road to
great ideas is paved with bad ideas.

Thanks,
Maciek