Re: pg_plan_advice - Mailing list pgsql-hackers
| From | Lukas Fittl |
|---|---|
| Subject | Re: pg_plan_advice |
| Date | |
| Msg-id | CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com Whole thread |
| In response to | Re: pg_plan_advice (Robert Haas <robertmhaas@gmail.com>) |
| Responses |
Re: pg_plan_advice
|
| List | pgsql-hackers |
Hi Robert, On Thu, Mar 12, 2026 at 10:15 AM Robert Haas <robertmhaas@gmail.com> wrote: > I've now committed the main patch. I think that I'm not likely to get > too much more feedback on that in this release cycle, and I judge that > more people will be sad if it doesn't get shipped this cycle than if > it does. That might turn out to be catastrophically wrong, but if many > problem reports quickly begin to emerge, it can always be reverted. I think its good to get this in, and not do it in the last minute, but just to be clear on my part, I've reviewed earlier versions, but I have not reviewed the latest code to the extent I would have liked before it was committed, due to competing priorities on my end. That said, its a good forcing function, so let me do my part to help shake out any bugs that remain. From a code review perspective, I found some minor issues: - Identifiers that are named like advice types cause an error (so I can't name my table "hash_join") - pgpa_destroy_join_unroller does not free the inner_unrollers values (which I think goes against the intent of cleaning up the allocations right away) - pgpa_parser uses pgpa_yyerror, but that function doesn't use elog(ERROR) like the main parser does - I think it'd be more consistent to use YYABORT explicitly, e.g. how we do it in cubeparse - pgpa_scanner accepts integers with underscores, but incorrectly uses a simple strtoint on them (which would fail), instead of pg_strtoint32 / pg_strtoint32_safe - pgpa_walk_recursively uses "0" for the boolean value being passed to a recursive call in one case (should be "false") Attached nocfbot-0001 that addresses these. From a usability perspective, I do wonder about two things when it comes to users specifying advice directly (which will happen in practice, e.g. when debugging plan problems): 1) The lack of being able to target scan advice (e.g. SEQ_SCAN) to a partition parent is frustrating - I believe we discussed partitioning earlier in the thread in terms of gathering+applying it, but I do wonder if we shouldn't at least allow users to specify a partitioned table/partitioned index, instead of only the children. See attached nocfbot-0002 for an idea what would be enough, I think. 2) I find the join strategy advice hard to use - pg_hint_plan has hints (e.g. HashJoin) that allow saying "use a hash join when joining these two rels in any order", vs pg_plan_advice requires setting a specific outer rel, which only makes sense when you want to fully specify every aspect of the plan. I suspect users who directly write advice will struggle with specifying join strategy advice as it is right now. We could consider having a different syntax for saying "I want a hash join when these two rels are joined, but I don't care about the order", e.g. "USE_HASH_JOIN(a b)". If you think that's worthwhile I'd be happy to take a stab at a patch. > I'm still hoping to get some more feedback on the remaining patches, > which are much smaller and conceptually simpler. While there is no > time to redesign them at this point in the release cycle, there is > still the opportunity to fix bugs, or decide that they're too > half-baked to ship. So here is v20 with just those patches. Of course, > post-commit review of the main patch is also very welcome. For v20-0001, from a quick conceptual review: I find the two separate GUC mechanisms for local backend vs shared memory a bit confusing as a user (which one should I be using?). Enabling the shared memory mechanism on a system-wide basis seems like it'd likely have too high overhead anyway for production systems? (specifically worried about the advice capturing for each plan, though I haven't benchmarked it) I wonder if we shouldn't keep this simpler for now, and e.g. only do the backend local version to start - we could iterate a bit on system-wide collection out-of-core, e.g. I'm considering teaching pg_stat_plans to optionally collect plan advice the first time it sees a plan ID (plan advice is roughly a superset of what we consider a plan ID there), and then we could come back to this for PG20. For v20-0002: I think we need a form of this in tree, because otherwise pg_plan_advice breaks as planner logic changes. This of course puts a burden on other hackers making changes, but I think that's what we're signing up for with having pg_plan_advice in contrib - and maybe we're overestimating how often that'll actually be a problem. To help assess impact, I did a quick test run and looked at three not-yet-committed patches in the commitfest that affect planner logic ([0], [1] and [2]), to see if they'd require pg_plan_advice changes (master with v20-0002 applied). Maybe I picked the wrong patches, but at least with those no pg_plan_advice changes were needed with the test_plan_advice test enabled. On the code itself: - Is there a reason you're setting "pg_plan_advice.always_store_advice_details" to true, instead of using pg_plan_advice_request_advice_generation? - I wonder if we could somehow detect advice not matching? Right now that'd be silently ignored, i.e. you'd only get a test failure when we generate the wrong advice that causes a plan change in the regression tests. For v20-0003, initial thoughts: I think getting at least a basic version of this in would be good, as a server-wide way to set advice for queries can help people get out of a problem when Postgres behaves badly - and we know from pg_hint_plan (which has a hint table) that this can be useful even without doing any kind of parameter sniffing/etc to be smart about different parameters for the same query. The name "stash" feels a bit confusing as an end-user facing term. Maybe something like "pg_apply_advice", or "pg_query_advice" would be better? (though I kind of wish we could tie it more closely to "plan advice", but e.g. "pg_plan_advice_apply" feels too lengthy) --- Thanks, Lukas [0]: https://commitfest.postgresql.org/patch/5487/ ("Pull-up subquery if INNER JOIN-ON contains refs to upper-query") [1]: https://commitfest.postgresql.org/patch/5738/ ("Improve hash join's handling of tuples with null join keys") [2]: https://commitfest.postgresql.org/patch/6315/ ("Enable partitionwise join for partition keys wrapped by RelabelType") -- Lukas Fittl
Attachment
pgsql-hackers by date: