Re: pg_plan_advice - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: pg_plan_advice |
| Date | |
| Msg-id | CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com Whole thread Raw |
| In response to | Re: pg_plan_advice (Robert Haas <robertmhaas@gmail.com>) |
| Responses |
Re: pg_plan_advice
|
| List | pgsql-hackers |
Here is a new patch set (v14). I finally got time to implement an idea I've been thinking about for a while, which is to run the regression tests planning every query twice. The first time, we generate plan advice. The second time, we replan using that advice to guide planning. If everything in the world is wonderful, this shouldn't cause the regression tests to fail. Of course, it did, so this patch set includes a bunch of changes to both make this kind of testing possible and to fix the issues thus revealed. Here's a rundown of the problems I discovered, some of which were problems in pg_plan_advice, some of which are problems created by my recently committed patches, and a few of which are long-standing planner problems that just haven't been very apparent up until now. 0. I couldn't actually write the test code to do the above because planner_setup_hook doesn't get cursorOptions as an argument, so having the planner setup hook call back into planner() wasn't easy to do correctly. I've added a new preparatory patch (0002) to fix it. I plan to proceed with this patch quickly unless there are objections, since planner_setup_hook is brand new and there seems to be no downside to fixing this quickly. 1. A bunch of tests failed because they use debug_parallel_query=on. That introduced a single-copy Gather node into the plan, which caused pg_plan_advice to emit GATHER() advice. When the queries are then replanned, they use a real Gather node, not a single-copy one. I've updated pg_plan_advice to disregard single-copy Gather nodes. 2. A bunch of tests failed because pg_plan_advice was assuming, in some places, that it was the only thing adjusting pgs_mask. But, a lot of our regression tests run under enable_SOMETHING=false. I adjusted pg_plan_advice so that it only ever *clears* bits from pgs_mask instead of sometimes being willing to set them. This means that the effects of enable_SOMETHING=false and pg_plan_advice are additive, rather than (as in previous versions) letting pg_plan_advice sometimes override the GUC value for no particular reason. 3. One test failed because cost_material() tries to check PGS_CONSIDER_NONPARTIAL instead of relying on the caller to set the "enabled" flag properly. See http://postgr.es/m/CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com or patch 0001 for details and a fix. I plan to commit this one relatively quickly, too, barring objections. 4. One test failed because setting enable_indexonlyscan=false, or clearing the equivalent pgs_mask bit, can cause a Bitmap Heap Scan not to be used. I worked around this in pg_plan_advice, but I think we should consider doing something about it in the core planner. See http://postgr.es/m/CA+TgmoZJAH4vA0q3T0t+CnuCXvhXvg+ZcBqs8s_LUJ0D12QhBA@mail.gmail.com for all the details of what's happening here. 5. One test failed because pg_plan_advice was forcing the use of a particular index by deleting all the others from rel->indexlist in its get_relation_info_hook. This is currently the only method we have for a loadable module to force the use of an index, but the problem is that it makes the index invisible for all purposes, and this particular test relies on self-join elimination. The query can't try to force the use of index #1 without hiding the existence of index #2 from the SJE code, and then we're in trouble. I've also verified that deleting elements from the index list can break left join removal. So I think we need a way to tell the planner that we don't want a certain index to be scanned without telling it that the index doesn't exist at all. Therefore, I've revived my previous patch to add a 'disabled' flag to IndexOptInfo. At the time I last proposed that, Tom said it wasn't needed because we could just delete from the index list, and I didn't have any reason why we couldn't just do that, so I made pg_plan_advice work that way. But now I do have a reason, so brought that patch back as 0006 and adjusted pg_plan_advice to use it. 6. One tests failed because add_partial_path() doesn't consider startup cost as a figure of merit. There's a comment there written by me at the time I was introducing parallel query, back in 2016, which claims it isn't necessary, but by 2020, while looking at incremental sort, James Coleman had figured out that this was incorrect. In brief, what happens here is that the query plan doesn't use a sequential scan, but disabling sequential scans changes the final plan, and in fact the new plan has a lower cost than the old one. For full details and proposed patches, see http://postgr.es/m/CA+TgmobRufbUSksBoxytGJS1P+mQY4rWctCk-d0iAUO6-k9Wrg@mail.gmail.com -- I've also incorporated those patches into this patch set as 0008 and 0009, so that the new testing framework added in 0010 will actually pass. So here's an overview of the new patch set. 0001. New patch: fix PGS_CONSIDER_NONPARTIAL interaction with Materialize nodes. See point (3) above. 0002. New patch: pass cursorOptions to planner_setup_hook. See point (0) above. 0003-0005. These were 0001-0003 in the previous patch set, and I have included the changes suggested by Alexandra Wang in her review. 0006. Revived patch from an old thread: Allow extensions to mark an individual index as disabled. See point (5) above. 0007. pg_plan_advice. This was 0004 in the previous patch set. This has a few updates: - It has been adjusted for the new 0001, 0002, and 0006 patches. - The issues mentioned in points (1), (2), and (4) above have been fixed. - There's now a system for other plugins to add "advisors", which is a way for that other module to supply an advice string that can be used during query planning. Here, I've done that to enable the testing described above, but it could also be used by a module that wants to do on-the-fly advice injection. - Minor cosmetic improvements, mostly reordering some functions in pgpa_planner.c for (IMHO) better clarity. 0008-0009. My proposed patches to address add_partial_path's failure to consider startup costs and related bugs I found while investigating. See point (6) above. 0010. New patch: add src/test/modules/test_plan_advice, implementing the testing procedure described above. I think there's quite a bit more that can be done with this new testing methodology to find additional issues here, and I plan to pursue that. But I'm fairly happy with the results of this initial round of testing. On the other hand, it flushed out a bunch of issues that seem worth fixing. But it also didn't break a hundred things for a hundred different reasons, which seems good, too. -- Robert Haas EDB: http://www.enterprisedb.com
Attachment
- v14-0004-Store-information-about-elided-nodes-in-the-fina.patch
- v14-0001-Fix-PGS_CONSIDER_NONPARTIAL-interaction-with-Mat.patch
- v14-0003-Store-information-about-range-table-flattening-i.patch
- v14-0002-Pass-cursorOptions-to-planner_setup_hook.patch
- v14-0005-Store-information-about-Append-node-consolidatio.patch
- v14-0010-Test-pg_plan_advice-using-a-new-test_plan_advice.patch
- v14-0008-Fix-add_partial_path-interaction-with-disabled_n.patch
- v14-0009-Consider-startup-cost-as-a-figure-of-merit-for-p.patch
- v14-0006-Allow-extensions-to-mark-an-individual-index-as-.patch
- v14-0007-Add-pg_plan_advice-contrib-module.patch
pgsql-hackers by date: