Re: plan shape work - Mailing list pgsql-hackers
From | Alexandra Wang |
---|---|
Subject | Re: plan shape work |
Date | |
Msg-id | CAK98qZ1i8HbK+2R6=FQWiYwooJCvXQ9Tiekc6VCY08j94AppRw@mail.gmail.com Whole thread Raw |
In response to | Re: plan shape work (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
Hi Robert,
On Tue, Aug 26, 2025 at 7:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, May 19, 2025 at 2:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
> A couple of people at pgconf.dev seemed to want to know more about my
> ongoing plan shape work, so here are the patches I have currently.
Here's an updated patch set. My goal for the September CommitFest is
to get patches 0001-0004 committed. Of course, if there are too many
objections or too little review, that might not happen, but that's my
goal.
Thanks for the patches!
I don’t know enough about the history in this area to object to your
approach or suggest an alternative design. That said, I’ve reviewed
patches 0001-0004, and as individual patches they make sense to me.
Below are some more detailed comments, which would only be relevant if
you decide to proceed in this direction.
I don’t know enough about the history in this area to object to your
approach or suggest an alternative design. That said, I’ve reviewed
patches 0001-0004, and as individual patches they make sense to me.
Below are some more detailed comments, which would only be relevant if
you decide to proceed in this direction.
0002:
QUERY PLAN
----------------------------------------------------------
Nested Loop Left Join
- Output: t1.i, (1), t2.i2, i3, t4.i4
+ Output: t1.i, (1), t2.i2, t3.i3, t4.i4
-> Nested Loop Left Join
- Output: t1.i, t2.i2, (1), i3
+ Output: t1.i, t2.i2, (1), t3.i3
Join Filter: false
-> Hash Left Join
Output: t1.i, t2.i2, (1)
----------------------------------------------------------
Nested Loop Left Join
- Output: t1.i, (1), t2.i2, i3, t4.i4
+ Output: t1.i, (1), t2.i2, t3.i3, t4.i4
-> Nested Loop Left Join
- Output: t1.i, t2.i2, (1), i3
+ Output: t1.i, t2.i2, (1), t3.i3
Join Filter: false
-> Hash Left Join
Output: t1.i, t2.i2, (1)
These plan changes introduced by 0002, which adds schema qualifiers,
make me very happy. I think it’s a nice improvement on its own.
make me very happy. I think it’s a nice improvement on its own.
In reply to 0002's commit message:
> XXX. I have broken this out as a separate commit for now; however,
> it could be merged with the commit to add 'relids' to 'Result'; or
> the patch series could even be rejiggered to present this as the
> primary benefit of that change, leaving the EXPLAIN changes as a
> secondary benefit, instead of the current organization, which does
> the reverse.
> it could be merged with the commit to add 'relids' to 'Result'; or
> the patch series could even be rejiggered to present this as the
> primary benefit of that change, leaving the EXPLAIN changes as a
> secondary benefit, instead of the current organization, which does
> the reverse.
I’m fine with the current organization. I agree that if we just
compare the EXPLAIN changes in 0001, which add additional “Replaces:”
information for the simple Result node, with the EXPLAIN changes in
0002, the changes in 0002 are arguably more attractive. However, I
think the EXPLAIN changes in 0001 are a more direct reflection of what
the rest of 0001 is trying to achieve: keeping track of the RTIs a
Result node is scanning. The changes in 0002 feel more like a side
benefit.
compare the EXPLAIN changes in 0001, which add additional “Replaces:”
information for the simple Result node, with the EXPLAIN changes in
0002, the changes in 0002 are arguably more attractive. However, I
think the EXPLAIN changes in 0001 are a more direct reflection of what
the rest of 0001 is trying to achieve: keeping track of the RTIs a
Result node is scanning. The changes in 0002 feel more like a side
benefit.
With that said, this is just my personal preference. If other
reviewers feel differently, I won’t object.
reviewers feel differently, I won’t object.
0003:
In get_scanned_rtindexes():
+ case T_NestLoop:
+ {
+ Bitmapset *outer_scanrelids;
+ Bitmapset *inner_scanrelids;
+ Bitmapset *combined_scanrelids;
+
+ outer_scanrelids =
+ get_scanned_rtindexes(root, plan->lefttree);
+ inner_scanrelids =
+ get_scanned_rtindexes(root, plan->righttree);
+ combined_scanrelids =
+ bms_union(outer_scanrelids, inner_scanrelids);
+ inner_scanrelids = remove_join_rtis(root, inner_scanrelids);
+
+ return remove_join_rtis(root, combined_scanrelids);
+ break;
+ }
+ {
+ Bitmapset *outer_scanrelids;
+ Bitmapset *inner_scanrelids;
+ Bitmapset *combined_scanrelids;
+
+ outer_scanrelids =
+ get_scanned_rtindexes(root, plan->lefttree);
+ inner_scanrelids =
+ get_scanned_rtindexes(root, plan->righttree);
+ combined_scanrelids =
+ bms_union(outer_scanrelids, inner_scanrelids);
+ inner_scanrelids = remove_join_rtis(root, inner_scanrelids);
+
+ return remove_join_rtis(root, combined_scanrelids);
+ break;
+ }
It looks like there is an redundant assignment to inner_scanrelids:
+ inner_scanrelids = remove_join_rtis(root, inner_scanrelids);
0004:
There is a compiler warning reported in the CommitFest build:
https://cirrus-ci.com/task/6248981396717568
https://cirrus-ci.com/task/6248981396717568
[23:03:57.811] subselect.c:3232:1: error: control reaches end of non-void function [-Werror=return-type]
[23:03:57.811] 3232 | }
[23:03:57.811] | ^
[23:03:57.811] cc1: all warnings being treated as errors
[23:03:57.812] make[4]: *** [<builtin>: subselect.o] Error 1
[23:03:57.812] make[3]: *** [../../../src/backend/common.mk:37: plan-recursive] Error 2
[23:03:57.812] make[2]: *** [common.mk:37: optimizer-recursive] Error 2
[23:03:57.812] make[2]: *** Waiting for unfinished jobs....
[23:04:05.513] make[1]: *** [Makefile:42: all-backend-recurse] Error 2
[23:04:05.514] make: *** [GNUmakefile:21: world-bin-src-recurse] Error 2
You might want to add a return to get rid of the warning.
Still in 0004:
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -339,6 +340,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
memcpy(subroot, root, sizeof(PlannerInfo));
subroot->query_level++;
subroot->parent_root = root;
+ subroot->plan_name = choose_plan_name(root->glob, "minmax", true);
+
/* reset subplan-related stuff */
subroot->plan_params = NIL;
subroot->outer_params = NULL;
@@ -359,6 +362,9 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
/* and we haven't created PlaceHolderInfos, either */
Assert(subroot->placeholder_list == NIL);
+ /* Add this to list of all PlannerInfo objects. */
+ root->glob->allroots = lappend(root->glob->allroots, root);
+
+++ b/src/backend/optimizer/plan/planagg.c
@@ -339,6 +340,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
memcpy(subroot, root, sizeof(PlannerInfo));
subroot->query_level++;
subroot->parent_root = root;
+ subroot->plan_name = choose_plan_name(root->glob, "minmax", true);
+
/* reset subplan-related stuff */
subroot->plan_params = NIL;
subroot->outer_params = NULL;
@@ -359,6 +362,9 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
/* and we haven't created PlaceHolderInfos, either */
Assert(subroot->placeholder_list == NIL);
+ /* Add this to list of all PlannerInfo objects. */
+ root->glob->allroots = lappend(root->glob->allroots, root);
+
In the last diff, it should add "subroot" instead of "root" to the
list of all PlannerInfos. Currently, if there are multiple minmax
expressions, we end up with the following plan showing duplicate
names:
list of all PlannerInfos. Currently, if there are multiple minmax
expressions, we end up with the following plan showing duplicate
names:
test=# explain (costs off) SELECT MIN(value), MAX(value) FROM test_minmax;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Result
Replaces: Aggregate
InitPlan minmax_1
-> Limit
-> Index Only Scan using test_minmax_value_idx on test_minmax
Index Cond: (value IS NOT NULL)
InitPlan minmax_1
-> Limit
-> Index Only Scan Backward using test_minmax_value_idx on test_minmax test_minmax_1
Index Cond: (value IS NOT NULL)
(10 rows)
QUERY PLAN
-------------------------------------------------------------------------------------------------
Result
Replaces: Aggregate
InitPlan minmax_1
-> Limit
-> Index Only Scan using test_minmax_value_idx on test_minmax
Index Cond: (value IS NOT NULL)
InitPlan minmax_1
-> Limit
-> Index Only Scan Backward using test_minmax_value_idx on test_minmax test_minmax_1
Index Cond: (value IS NOT NULL)
(10 rows)
Best,
Alex
pgsql-hackers by date: