Re: pg_plan_advice - Mailing list pgsql-hackers
| From | Amit Langote |
|---|---|
| Subject | Re: pg_plan_advice |
| Date | |
| Msg-id | CA+HiwqEb8ZUfq92TMBNUtetfMV=50mHGTNrt8kOkoE9m_aqZvw@mail.gmail.com Whole thread Raw |
| In response to | pg_plan_advice (Robert Haas <robertmhaas@gmail.com>) |
| Responses |
Re: pg_plan_advice
|
| List | pgsql-hackers |
Hi Robert, On Thu, Oct 30, 2025 at 11:00 PM Robert Haas <robertmhaas@gmail.com> wrote: > As I have mentioned on previous threads, for the past while I have > been working on planner extensibility. I've posted some extensibility > patches previously, and got a few of them committed in > Sepember/October with Tom's help, but I think the time has come a > patch which actually makes use of that infrastructure as well as some > further infrastructure that I'm also including in this posting.[1] The > final patch in this series adds a new contrib module called > pg_plan_advice. Very briefly, what pg_plan_advice knows how to do is > process a plan and emits a (potentially long) long text string in a > special-purpose mini-language that describes a bunch of key planning > decisions, such as the join order, selected join methods, types of > scans used to access individual tables, and where and how > partitionwise join and parallelism were used. You can then set > pg_plan_advice.advice to that string to get a future attempt to plan > the same query to reproduce those decisions, or (maybe a better idea) > you can trim that string down to constrain some decisions (e.g. the > join order) but not others (e.g. the join methods), or (if you want to > make your life more exciting) you can edit that advice string and > thereby attempt to coerce the planner into planning the query the way > you think best. There is a README that explains the design philosophy > and thinking in a lot more detail, which is a good place to start if > you're curious, and I implore you to read it if you're interested, and > *especially* if you're thinking of flaming me. Thanks for posting this. Looks very interesting to me. These are just high-level comments after browsing the patches and reading some bits like pgpa_identifier to get myself familiarized with the project. I like that the key concept here is plan stability rather than plan control, because that framing makes it easier to treat this as infrastructure instead of policy. > I want to mention that, beyond the fact that I'm sure some people will > want to use something like this (with more feature and a lot fewer > bugs) in production, it seems to be super-useful for testing. We have > a lot of regression test cases that try to coerce the planner to do a > particular thing by manipulating enable_* GUCs, and I've spent a lot > of time trying to do similar things by hand, either for regression > test coverage or just private testing. This facility, even with all of > the bugs and limitations that it currently has, is exponentially more > powerful than frobbing enable_* GUCs. Once you get the hang of the > advice mini-language, you can very quickly experiment with all sorts > of plan shapes in ways that are currently very hard to do, and thereby > find out how expensive the planner thinks those things are and which > ones it thinks are even legal. So I see this as not only something > that people might find useful for in production deployments, but also > something that can potentially be really useful to advance PostgreSQL > development. +1, the testing benefits make this worthwhile. > Which brings me to the question of where this code ought to go if it > goes anywhere at all. I decided to propose pg_plan_advice as a contrib > module rather than a part of core because I had to make a WHOLE lot of > opinionated design decisions just to get to the point of having > something that I could post and hopefully get feedback on. I figured > that all of those opinionated decisions would be a bit less > unpalatable if they were mostly encapsulated in a contrib module, with > the potential for some future patch author to write a different > contrib module that adopted different solutions to all of those > problems. But what I've also come to realize is that there's so much > infrastructure here that leaving the next person to reinvent it may > not be all that appealing. Query jumbling is a previous case where we > initially thought that different people might want to do different > things, but eventually realized that most people really just wanted > some solution that they didn't have to think too hard about. Likewise, > in this patch, the relation identifier system described in the README > is the only thing of its kind, to my knowledge, and any system that > wants to accomplish something similar to what pg_plan_advice does > would need a system like that. pg_hint_plan doesn't have something > like that, because pg_hint_plan is just trying to do hints. This is > trying to do round-trip-safe plan stability, where the system will > tell you how to refer unambiguously to a certain part of the query in > a way that will work correctly on every single query regardless of how > it's structured or how many times it refers to the same tables or to > different tables using the same aliases. If we say that we're never > going to put any of that infrastructure in core, then anyone who wants > to write a module to control the planner is going to need to start by > either (a) reinventing something similar, (b) cloning all the relevant > code, or (c) just giving up on the idea of unambiguous references to > parts of a query. None of those seem like great options, so now I'm > less sure whether contrib is actually the right place for this code, > but that's where I have put it for now. Feedback welcome, on this and > everything else. On the relation identifier system: IMHO this part doesn't seem as opinionated as the advice mini-language. The requirements pretty much dictate the design -- you need alias names and occurrence counters to handle self-joins, partition fields for partitioned tables, and a string representation to survive dump/restore. There doesn't seem to be much flexibility in that. Given that, it seems more practical to put this in core from the start. Extensions that might want to build plan-advice-like functionality shouldn’t have to clone this logic and wait another release for something that’s already well-defined and deterministic. The mini-language is opinionated and belongs in contrib, but the identifier infrastructure just solves a fundamental problem cleanly. On the infrastructure patches (0001-0005): these look sensible. The range table flattening info, elided node tracking, and append node consolidation preserve information that's currently lost -- there's some additional overhead to track this, but it's fixed per-relation per-subquery, which seems reasonable. The path generation hooks (0005) are a clear improvement: moving from global enable_* GUCs to per-RelOptInfo pgs_mask gives extensions the granularity they need for relation-specific and join-specific decisions. Yes, you need C code to use them, but you'd need to write C code to do something of value in this area anyway, and the hooks give you control that GUCs can't provide. Overall, I'm supportive of getting these committed once they're ready. contrib/pg_plan_advice is a compelling proof-of-concept for why these hooks are needed. I'll try to post more specific comments once I've read this some more. -- Thanks, Amit Langote
pgsql-hackers by date: