Thread: Wrong results with grouping sets
shown by the query below.
-- result is correct with only grouping sets
select a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a));
a | b
---+---
1 | 1
1 |
2 | 2
2 |
(4 rows)
-- result is NOT correct with grouping sets and distinct on
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a));
a | b
---+---
1 | 1
2 | 2
(2 rows)
The distinct on expressions include both 'a' and 'b', so rows (1, 1) and
(1, NULL) should not be considered equal. (The same for rows (2, 2) and
(2, NULL).)
I think the root cause is that when we generate distinct_pathkeys, we
failed to realize that Var 'b' might be nullable by the grouping sets,
so it's no longer always equal to Var 'a'. It's not correct to deem
that the PathKey for 'b' is redundant and thus remove it from the
pathkeys list.
We have the same issue when generating sort_pathkeys. As a result, we
may have the final output in the wrong order. There were several
reports about this issue before, such as [1][2].
To fix this issue, I'm thinking that we mark the grouping expressions
nullable by grouping sets with a dummy RTE for grouping sets, something
like attached. In practice we do not need to create a real RTE for
that, what we need is just a RT index. In the patch I use 0, because
it's not a valid outer join relid, so it would not conflict with the
outer-join-aware-Var infrastructure.
If the grouping expression is a Var or PHV, we can just set its
nullingrels, very straightforward. For an expression that is neither a
Var nor a PHV, I'm not quite sure how to set the nullingrels. I tried
the idea of wrapping it in a new PHV to carry the nullingrels, but that
may cause some unnecessary plan diffs. In the patch for such an
expression I just set the nullingrels of Vars or PHVs that are contained
in it. This is not really 'correct' in theory, because it is the whole
expression that can be nullable by grouping sets, not its individual
vars. But it works in practice, because what we need is that the
expression can be somehow distinguished from the same expression in ECs,
and marking its vars is sufficient for this purpose. But what if the
expression is variable-free? This is the point that needs more work.
Fow now the patch just handles variable-free expressions of type Const,
by wrapping it in a new PHV.
There are some places where we need to artificially remove the RT index
of grouping sets from the nullingrels, such as when we generate
PathTarget for initial input to grouping nodes, or when we generate
PathKeys for the grouping clauses, because the expressions there are
logically below the grouping sets. We also need to do that in
set_upper_references when we update the targetlist of an Agg node, so
that we can perform exact match for nullingrels, rather than superset
match.
Since the fix depends on the nullingrels stuff, it seems not easy for
back-patching. I'm not sure what we should do in back branches.
Any thoughts?
[1] https://www.postgresql.org/message-id/CAMbWs48AtQTQGk37MSyDk_EAgDO3Y0iA_LzvuvGQ2uO_Wh2muw@mail.gmail.com
[2] https://www.postgresql.org/message-id/17071-24dc13fbfa29672d@postgresql.org
Thanks
Richard
Attachment
I think the root cause is that when we generate distinct_pathkeys, we
failed to realize that Var 'b' might be nullable by the grouping sets,
so it's no longer always equal to Var 'a'. It's not correct to deem
that the PathKey for 'b' is redundant and thus remove it from the
pathkeys list.
We have the same issue when generating sort_pathkeys. As a result, we
may have the final output in the wrong order. There were several
reports about this issue before, such as [1][2].
To fix this issue, I'm thinking that we mark the grouping expressions
nullable by grouping sets with a dummy RTE for grouping sets, something
like attached.
What do you think about the fix?
Thanks
Richard
Hi! Thank you for your work on the subject.
I think I've come across a wrong result issue with grouping sets, as
shown by the query below.
-- result is correct with only grouping sets
select a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a));
a | b
---+---
1 | 1
1 |
2 | 2
2 |
(4 rows)
-- result is NOT correct with grouping sets and distinct on
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a));
a | b
---+---
1 | 1
2 | 2
(2 rows)
The distinct on expressions include both 'a' and 'b', so rows (1, 1) and
(1, NULL) should not be considered equal. (The same for rows (2, 2) and
(2, NULL).)
I noticed that this query worked correctly in the main branch with the inequality operator:
postgres=# select distinct on (a, b) a, b from (values (3, 1), (2, 2)) as t (a, b) where a > b group by grouping sets((a, b), (a)); a | b ---+--- 3 | 1 3 | (2 rows)
So, I think you are right)
I think the root cause is that when we generate distinct_pathkeys, we
failed to realize that Var 'b' might be nullable by the grouping sets,
so it's no longer always equal to Var 'a'. It's not correct to deem
that the PathKey for 'b' is redundant and thus remove it from the
pathkeys list.
We have the same issue when generating sort_pathkeys. As a result, we
may have the final output in the wrong order. There were several
reports about this issue before, such as [1][2].
To fix this issue, I'm thinking that we mark the grouping expressions
nullable by grouping sets with a dummy RTE for grouping sets, something
like attached. In practice we do not need to create a real RTE for
that, what we need is just a RT index. In the patch I use 0, because
it's not a valid outer join relid, so it would not conflict with the
outer-join-aware-Var infrastructure.
If the grouping expression is a Var or PHV, we can just set its
nullingrels, very straightforward. For an expression that is neither a
Var nor a PHV, I'm not quite sure how to set the nullingrels. I tried
the idea of wrapping it in a new PHV to carry the nullingrels, but that
may cause some unnecessary plan diffs. In the patch for such an
expression I just set the nullingrels of Vars or PHVs that are contained
in it. This is not really 'correct' in theory, because it is the whole
expression that can be nullable by grouping sets, not its individual
vars. But it works in practice, because what we need is that the
expression can be somehow distinguished from the same expression in ECs,
and marking its vars is sufficient for this purpose. But what if the
expression is variable-free? This is the point that needs more work.
Fow now the patch just handles variable-free expressions of type Const,
by wrapping it in a new PHV.
There are some places where we need to artificially remove the RT index
of grouping sets from the nullingrels, such as when we generate
PathTarget for initial input to grouping nodes, or when we generate
PathKeys for the grouping clauses, because the expressions there are
logically below the grouping sets. We also need to do that in
set_upper_references when we update the targetlist of an Agg node, so
that we can perform exact match for nullingrels, rather than superset
match.
Since the fix depends on the nullingrels stuff, it seems not easy for
back-patching. I'm not sure what we should do in back branches.
Any thoughts?
[1] https://www.postgresql.org/message-id/CAMbWs48AtQTQGk37MSyDk_EAgDO3Y0iA_LzvuvGQ2uO_Wh2muw@mail.gmail.com
[2] https://www.postgresql.org/message-id/17071-24dc13fbfa29672d@postgresql.org
I looked at your patch and noticed a few things:
1. I think you should add a test with the cube operator, because I noticed that the order of the query in the result has also changed:
master:
postgres=# select a,b from (values(1,1),(2,2)) as t (a,b) where a=b group by cube (a, (a,b)) order by b,a; a | b ---+--- 1 | 1 1 | 1 1 | 2 | 2 2 | 2 2 | | (7 rows)
with patch:
postgres=# select a, b from (values (1, 1), (2, 2)) as t (a, b) where a = b group by cube(a, (a, b)) order by b,a; a | b ---+--- 1 | 1 1 | 1 2 | 2 2 | 2 1 | 2 | | (7 rows)
-- Regards, Alena Rybakina
I noticed that this query worked correctly in the main branch with the inequality operator:
postgres=# select distinct on (a, b) a, b from (values (3, 1), (2, 2)) as t (a, b) where a > b group by grouping sets((a, b), (a)); a | b ---+--- 3 | 1 3 | (2 rows)
So, I think you are right)
I looked at your patch and noticed a few things:
1. I think you should add a test with the cube operator, because I noticed that the order of the query in the result has also changed:
here is caused by the same reason explained above: the planner fails to
realize that Var 'a' and 'b' are nullable by the grouping sets, making
them no longer always equal to each other. This issue should have been
covered in the tests added by v1 patch.
Thanks
Richard
If the grouping expression is a Var or PHV, we can just set its
nullingrels, very straightforward. For an expression that is neither a
Var nor a PHV, I'm not quite sure how to set the nullingrels. I tried
the idea of wrapping it in a new PHV to carry the nullingrels, but that
may cause some unnecessary plan diffs. In the patch for such an
expression I just set the nullingrels of Vars or PHVs that are contained
in it. This is not really 'correct' in theory, because it is the whole
expression that can be nullable by grouping sets, not its individual
vars. But it works in practice, because what we need is that the
expression can be somehow distinguished from the same expression in ECs,
and marking its vars is sufficient for this purpose. But what if the
expression is variable-free? This is the point that needs more work.
Fow now the patch just handles variable-free expressions of type Const,
by wrapping it in a new PHV.
aggregates, or window functions, it would not be treated as a member of
EC that is redundant (see get_eclass_for_sort_expr()). That means it
would not be removed from the pathkeys list, so we do not need to set
the nullingrels for it. Otherwise we can just wrap the expression in a
PlaceHolderVar. Attached is an updated patch to do that.
BTW, this wrong results issue has existed ever since grouping sets was
introduced in v9.5, and there were field reports complaining about this
issue. I think it would be great if we can get rid of it. I'm still
not sure what we should do in back branches. But let's fix it at least
on v16 and later.
Thanks
Richard
Attachment
Richard Guo <guofenglinux@gmail.com> writes: > For a variable-free expression, if it contains volatile functions, SRFs, > aggregates, or window functions, it would not be treated as a member of > EC that is redundant (see get_eclass_for_sort_expr()). That means it > would not be removed from the pathkeys list, so we do not need to set > the nullingrels for it. Otherwise we can just wrap the expression in a > PlaceHolderVar. Attached is an updated patch to do that. I don't think this is going in quite the right direction. We have many serious problems with grouping sets (latest one today at [1]), and I don't believe that hacking around EquivalenceClasses is going to fix them all. I think that what we really need to do is invent a new kind of RTE representing the output of the grouping step, with columns that are the Vars or expressions being grouped on. Then we would make the parser actually replace subexpressions in the tlist with Vars referencing this new RTE (that is, change check_ungrouped_columns into something that modifies the expression tree into something that contains no Vars that aren't grouping-RTE Vars). In this way the output of the parser directly expresses the semantic requirement that certain subexpressions be gotten from the grouping output rather than computed some other way. The trick is to do this without losing optimization capability. We could have the planner replace these Vars with the underlying Vars in cases where it's safe to do so (perhaps after adding a nullingrel bit that references the grouping RTE). If a grouping column is an expression, we might be able to replace the reference Vars with PHVs as you've done here ... but I think we need the parser infrastructure fixed first. regards, tom lane [1] https://www.postgresql.org/message-id/flat/CAEzk6fcgXWabEG%2BRFDaG6tDmFX6g1h7LPGUdrX85Pb0XB3B76g%40mail.gmail.com
On Thu, 7 Dec 2023 at 13:52, Richard Guo <guofenglinux@gmail.com> wrote: > > > On Mon, Sep 25, 2023 at 3:11 PM Richard Guo <guofenglinux@gmail.com> wrote: >> >> If the grouping expression is a Var or PHV, we can just set its >> nullingrels, very straightforward. For an expression that is neither a >> Var nor a PHV, I'm not quite sure how to set the nullingrels. I tried >> the idea of wrapping it in a new PHV to carry the nullingrels, but that >> may cause some unnecessary plan diffs. In the patch for such an >> expression I just set the nullingrels of Vars or PHVs that are contained >> in it. This is not really 'correct' in theory, because it is the whole >> expression that can be nullable by grouping sets, not its individual >> vars. But it works in practice, because what we need is that the >> expression can be somehow distinguished from the same expression in ECs, >> and marking its vars is sufficient for this purpose. But what if the >> expression is variable-free? This is the point that needs more work. >> Fow now the patch just handles variable-free expressions of type Const, >> by wrapping it in a new PHV. > > > For a variable-free expression, if it contains volatile functions, SRFs, > aggregates, or window functions, it would not be treated as a member of > EC that is redundant (see get_eclass_for_sort_expr()). That means it > would not be removed from the pathkeys list, so we do not need to set > the nullingrels for it. Otherwise we can just wrap the expression in a > PlaceHolderVar. Attached is an updated patch to do that. > > BTW, this wrong results issue has existed ever since grouping sets was > introduced in v9.5, and there were field reports complaining about this > issue. I think it would be great if we can get rid of it. I'm still > not sure what we should do in back branches. But let's fix it at least > on v16 and later. I have changed the status of the patch to "Waiting on Author" as Tom Lane's comments have not yet been addressed, feel free to address them and update the commitfest entry accordingly. Regards, Vignesh
> On 11 Jan 2024, at 20:10, vignesh C <vignesh21@gmail.com> wrote: > > I have changed the status of the patch to "Waiting on Author" as Tom > Lane's comments have not yet been addressed, feel free to address them > and update the commitfest entry accordingly. This CF entry seems to be a fix for actually unexpected behaviour. But seems like we need another fix. Richard, Alena, what do you think? Should we mark CF entry [0] "RwF" or leave it to wait for better fix? Best regards, Andrey Borodin. [0] https://commitfest.postgresql.org/47/4583/
I don't think this is going in quite the right direction. We have
many serious problems with grouping sets (latest one today at [1]),
and I don't believe that hacking around EquivalenceClasses is going
to fix them all.
I think that what we really need to do is invent a new kind of RTE
representing the output of the grouping step, with columns that
are the Vars or expressions being grouped on. Then we would make
the parser actually replace subexpressions in the tlist with Vars
referencing this new RTE (that is, change check_ungrouped_columns
into something that modifies the expression tree into something that
contains no Vars that aren't grouping-RTE Vars). In this way the
output of the parser directly expresses the semantic requirement that
certain subexpressions be gotten from the grouping output rather than
computed some other way.
The trick is to do this without losing optimization capability.
We could have the planner replace these Vars with the underlying
Vars in cases where it's safe to do so (perhaps after adding a
nullingrel bit that references the grouping RTE). If a grouping
column is an expression, we might be able to replace the reference
Vars with PHVs as you've done here ... but I think we need the
parser infrastructure fixed first.
I think you're right. To fix the cases where there are subqueries in
the grouping sets, as in Geoff's example, it seems that we'll have to
fix the parser infrastructure by inventing a new RTE for the grouping
step and replacing the subquery expressions with Vars referencing this
new RTE, so that there is only one instance of the subquery in the
parser output.
I have experimented with this approach, and here is the outcome. The
patch fixes Geoff's query, but it's still somewhat messy as I'm not
experienced enough in the parser code. And the patch has not yet
implemented the nullingrel bit manipulation trick. Before proceeding
further with this patch, I'd like to know if it is going in the right
direction.
Thanks
Richard
Attachment
I have experimented with this approach, and here is the outcome. The
patch fixes Geoff's query, but it's still somewhat messy as I'm not
experienced enough in the parser code. And the patch has not yet
implemented the nullingrel bit manipulation trick. Before proceeding
further with this patch, I'd like to know if it is going in the right
direction.
regression tests. But I had to hack explain.c and ruleutils.c to make
the varprefix stuff work as it did before, which is not great.
One thing I'm not sure about is whether we need to also replace
subexpressions in the arguments of GroupingFunc nodes with Vars
referencing the new GROUP RTE. These arguments would not be executed at
runtime, so it seems that we can just replace them. I tried to do that
and found several plan changes in the regression tests. Such as
explain (verbose, costs off)
select grouping(ss.x)
from int8_tbl i1
cross join lateral (select (select i1.q1) as x) ss
group by ss.x;
QUERY PLAN
------------------------------------------------
GroupAggregate
Output: GROUPING((SubPlan 1)), ((SubPlan 2))
Group Key: ((SubPlan 2))
-> Sort
Output: ((SubPlan 2)), i1.q1
Sort Key: ((SubPlan 2))
-> Seq Scan on public.int8_tbl i1
Output: (SubPlan 2), i1.q1
SubPlan 2
-> Result
Output: i1.q1
(11 rows)
If we substitute the subquery expression in the argument of GroupingFunc
with the GROUP RTE's Var, the final plan would contain only one SubPlan
instead of two.
can be done with the code in my v2 patch. But I think we'd better get
the parser fixed first before stepping into that.
Also please ignore the comment and code format things in the patch as I
haven't worked on them.
Thanks
Richard
Attachment
I've spent some more time on this patch, and now it passes all the
regression tests. But I had to hack explain.c and ruleutils.c to make
the varprefix stuff work as it did before, which is not great.
alias vars in the targetlist and HAVING clause, we should first flatten
them before replacing the grouped variables involved there with
grouping-RTE Vars. To fix this issue, I decide to merge the newly added
function substitute_group_exprs into check_ungrouped_columns by changing
check_ungrouped_columns to also perform the replacement, which is Tom's
initial suggestion I think.
Now it seems that 'check_ungrouped_columns' is no longer an appropriate
name for the function. So I rename it to 'substitute_grouped_columns'.
But I'm open to other names if there are any suggestions.
I've also worked on the comments.
Thanks
Richard
Attachment
nullingrel bit that references the grouping RTE to the grouping
expressions.
However, it seems to me that we have to manually remove this nullingrel
bit from expressions in various cases where these expressions are
logically below the grouping step, such as when we generate groupClause
pathkeys for grouping sets, or when we generate PathTarget for initial
input to grouping nodes.
Furthermore, in set_upper_references, the targetlist and quals of an Agg
node should have nullingrels that include the effects of the grouping
step, ie they will have nullingrels equal to the input Vars/PHVs'
nullingrels plus the nullingrel bit that references the grouping RTE.
In order to perform exact nullingrels matches, I think we also need to
manually remove this nullingrel bit.
Thanks
Richard
Attachment
On Fri, May 24, 2024 at 9:08 PM Richard Guo <guofenglinux@gmail.com> wrote: > On the basis of the parser infrastructure fixup, 0002 patch adds the > nullingrel bit that references the grouping RTE to the grouping > expressions. I found a bug in the v6 patch. The following query would trigger the Assert in make_restrictinfo that the given subexpression should not be an AND clause. select max(a) from t group by a > b and a = b having a > b and a = b; This is because the expression 'a > b and a = b' in the HAVING clause is replaced by a Var that references the GROUP RTE. When we preprocess the columns of the GROUP RTE, we do not know whether the grouped expression is a havingQual or not, so we do not perform make_ands_implicit for it. As a result, after we replace the group Var in the HAVING clause with the underlying grouping expression, we will have a havingQual that is an AND clause. As we know, in the planner we need to first preprocess all the columns of the GROUP RTE. We also need to replace any Vars in the targetlist and HAVING clause that reference the GROUP RTE with the underlying grouping expressions. To fix the mentioned issue, I choose the perform this replacement before we preprocess the targetlist and havingQual, so that the make_ands_implicit would be performed when we preprocess the havingQual. One problem with this is, when we preprocess the targetlist and havingQual, we would see already-planned tree, which is generated by the preprocessing work for the grouping expressions and then substituted for the GROUP Vars in the targetlist and havingQual. This would break the Assert 'Assert(!IsA(node, SubPlan))' in flatten_join_alias_vars_mutator and process_sublinks_mutator. I think we can just return the already-planned tree unchanged when we see it in the preprocessing process. Hence here is the v7 patchset. I've also added detailed commit messages for the two patches. Thanks Richard
Attachment
On Wed, Jun 5, 2024 at 5:42 PM Richard Guo <guofenglinux@gmail.com> wrote: > Hence here is the v7 patchset. I've also added detailed commit messages > for the two patches. This patchset does not apply any more. Here is a new rebase. While at it, I added more checks for 'root->group_rtindex', and also added a new test case to verify that we generate window_pathkeys correctly with grouping sets. Thanks Richard
Attachment
On Mon, Jun 10, 2024 at 5:05 PM Richard Guo <guofenglinux@gmail.com> wrote: > This patchset does not apply any more. Here is a new rebase. Here is an updated version of this patchset. I've run pgindent for it, and also tweaked the commit messages a bit. In principle, 0001 can be backpatched to all supported versions to fix the cases where there are subqueries in the grouping expressions; 0002 can be backpatched to 16 where we have the nullingrels stuff. But both patches seem to be quite invasive. I'm not sure if we want to backpatch them to stable branches. Any thoughts about backpatching? Thanks Richard
Attachment
On Mon, Jul 1, 2024 at 1:59 PM Richard Guo <guofenglinux@gmail.com> wrote: > > On Mon, Jun 10, 2024 at 5:05 PM Richard Guo <guofenglinux@gmail.com> wrote: > > This patchset does not apply any more. Here is a new rebase. > > Here is an updated version of this patchset. I've run pgindent for it, > and also tweaked the commit messages a bit. > > In principle, 0001 can be backpatched to all supported versions to fix > the cases where there are subqueries in the grouping expressions; 0002 > can be backpatched to 16 where we have the nullingrels stuff. But both > patches seem to be quite invasive. I'm not sure if we want to backpatch > them to stable branches. Any thoughts about backpatching? I don't have any specific thoughts on backpatching, but I have started reviewing the patches. The first patch in the set adds a new RTEKind for GROUP. From prologue of RangeTblEntry structure I can not understand what an RTE represents especially when the RTE represents something other than a FROM clause item. ``` * This is because we only need the RTE to deal with SQL features * like outer joins and join-output-column aliasing.) Other special * RTE types also exist, as indicated by RTEKind. ``` I can not use this description to decide whether a GROUP BY construct should have an RTE for itself or not. It looks like the patch adds a new RTE (kind) here so that its rtindex can be used to differentiate between a and b from VALUES clause and those from the GroupingSet result in the query mentioned in the first email in this thread. But I don't see any discussion of other alternatives. For example, how about infrastructure in EC to tell which stages this EC is valid for/upto? I see Tom suggesting use of RTE instead of changing EC but I don't see why that's better. We do mark a RestrictInfo with relids above which it can be computed. Similarly we assess validity of EC by stages or relations being computed. That might open some opportunities for using broken ECs? We are almost reimplementing parts of the GROUPING set feature, so may be it's worth spending time thinking about it. Assuming new RTEkind is the right approach, I am wondering whether there are other things that should have been represented by RTE for the same reason. For example, a HAVING clause changes the characteristics of results by introducing new constraints on the aggregated results. Should that have an RTE by itself? Will the RTEKind introduced by this patch cover HAVING clause as well? Will that open opportunities for more optimizations E.g. ``` explain select sum(a), sum(b), stddev(a + b) from (values (1, 1), (2, 2)) as t(a, b) group by a, b having sum(a) = sum(b) order by 1, 2; QUERY PLAN ------------------------------------------------------------------------- Sort (cost=0.10..0.10 rows=1 width=56) Sort Key: (sum("*VALUES*".column1)), (sum("*VALUES*".column2)) -> HashAggregate (cost=0.06..0.09 rows=1 width=56) Group Key: "*VALUES*".column1, "*VALUES*".column2 Filter: (sum("*VALUES*".column1) = sum("*VALUES*".column2)) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8) (6 rows) ``` Sort Key can be just (sum("*VALUES*".column1)) instead of both (sum("*VALUES*".column1)), (sum("*VALUES*".column2)) because of HAVING clause? Some code level random comments 1. ``` if (rte->rtekind == RTE_GROUP) { es->rtable_size--; break; ``` because of the variable name, it would be interpreted as the size of es->rtable and will be expected to be the same as list_length(es->rtable) which it is not. The comment at the member declaration won't help much for a quick reader. All that variable is doing is to tell us whether to use alias as prefix or not; `useprefix = es->rtable_size > 1;` OR useprefix = (es->rtable_size > 1 || es->verbose);. Instead of rtable_size, we could let the new member track the fact whether there are multiple aliases in the query (requiring multiple prefixes) instead of size of rtable. However, the fact that the GROUP RTE requires special handling indicates that the new RTEKind doesn't quite fit the rest of the set. No other RTE, even if outside FROM clause, required this special handling. 2. expandRecordVariable: The comment below the change should be updated to explain why an output of GROUPing can not have RECORD or at least mention GROUPING there. 3. I see code like below in get_eclass_for_sort_expr() and mark_rels_nulled_by_join() ``` /* ignore GROUP RTE */ if (i == root->group_rtindex) continue; ``` I assume that rel for this RTE index would be NULL, so "if" block just below this code would get executed. I think we should just change Assert() in that code block rather than adding a new "if" block to avoid confusion. 4. Looking at parse_clause.c most (if not all) addRangeTableEntry* function calls are from transform* functions. On those lines, I expected addRangeTableEntryForGroup() to be called from transformGroupClause(). Why are we calling addRangeTableEntryForGroup() from parseCheckAggregates()? 5. In the parseCheckAggregates, we are replacing expressions from targetlist and havingQual with Vars pointing to GROUP RTE. But we are not doing that to sortClause, the remaining SQL construct. That's because sortClause is just a list of entries pointing back to targetlist. So there's nothing to change there. Am I right? 6. I think ParseState::p_grouping_nsitem should be collocated with other ParseNamespaceItem members or lists in ParseState. I think it serves a similar purpose as them. Similarly PlannerInfo::group_rtindex should be placed next to outer_join_rels? 7. Do we need RangeTblEntry::groupexprs as a separate member? They are the same as GROUP BY or GROUPING SET expressions. So the list can be constructed from groupClause whenever required. Do we need to maintain the list separately? I am comparing with other RTEs, say Subquery RTE. We don't copy all the targetlist expressions from subquery to subquery's RTE. I noticed that groupexprs are being treated on lines similar to joinaliasvars. But they are somewhat different. The latter is a unified representation of columns of joining relations different from those columns and hence needs a new representation. That doesn't seem to be the case with RangeTblEntry::groupexpr. 8. The change in process_sublinks_mutator() appears to be related to the fact that GROUPING() may have subqueries which were not being handled earlier. That change seems to be independent of the bug being fixed here. Am I right? If yes, having those changes in a separate patch will help. -- Best Wishes, Ashutosh Bapat
On 2024-07-01 16:29:16 +0800, Richard Guo wrote: > On Mon, Jun 10, 2024 at 5:05 PM Richard Guo <guofenglinux@gmail.com> wrote: > > This patchset does not apply any more. Here is a new rebase. > > Here is an updated version of this patchset. I've run pgindent for it, > and also tweaked the commit messages a bit. > > In principle, 0001 can be backpatched to all supported versions to fix > the cases where there are subqueries in the grouping expressions; 0002 > can be backpatched to 16 where we have the nullingrels stuff. But both > patches seem to be quite invasive. I'm not sure if we want to backpatch > them to stable branches. Any thoughts about backpatching? As-is they can't be backpatched, unless I am missing something? Afaict they introduce rather thorough ABI breaks? And API breaks, actually? Greetings, Andres Freund
On Thu, Jul 4, 2024 at 6:02 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > On Mon, Jul 1, 2024 at 1:59 PM Richard Guo <guofenglinux@gmail.com> wrote: > > Here is an updated version of this patchset. I've run pgindent for it, > > and also tweaked the commit messages a bit. > > > > In principle, 0001 can be backpatched to all supported versions to fix > > the cases where there are subqueries in the grouping expressions; 0002 > > can be backpatched to 16 where we have the nullingrels stuff. But both > > patches seem to be quite invasive. I'm not sure if we want to backpatch > > them to stable branches. Any thoughts about backpatching? > > I don't have any specific thoughts on backpatching, but I have started > reviewing the patches. Thanks for reviewing this patchset! > The first patch in the set adds a new RTEKind for GROUP. From prologue > of RangeTblEntry structure I can not understand what an RTE represents > especially when the RTE represents something other than a FROM clause > item. > ``` > * This is because we only need the RTE to deal with SQL features > * like outer joins and join-output-column aliasing.) Other special > * RTE types also exist, as indicated by RTEKind. > ``` > I can not use this description to decide whether a GROUP BY construct > should have an RTE for itself or not. It looks like the patch adds a > new RTE (kind) here so that its rtindex can be used to differentiate > between a and b from VALUES clause and those from the GroupingSet > result in the query mentioned in the first email in this thread. But I > don't see any discussion of other alternatives. For example, how about > infrastructure in EC to tell which stages this EC is valid for/upto? I > see Tom suggesting use of RTE instead of changing EC but I don't see > why that's better. We do mark a RestrictInfo with relids above which > it can be computed. Similarly we assess validity of EC by stages or > relations being computed. That might open some opportunities for using > broken ECs? We are almost reimplementing parts of the GROUPING set > feature, so may be it's worth spending time thinking about it. The reason why we need a new RTE for the grouping step is to address cases where there are subqueries in the grouping expressions. In such cases, each of these subqueries in the targetlist and HAVING clause is expanded into distinct SubPlan nodes. Only one of these SubPlan nodes would be converted to reference to the grouping key column output by the Agg node; others would have to get evaluated afresh, and might not go to NULL when they are supposed to. I do not think this can be addressed by changing ECs. > Assuming new RTEkind is the right approach, I am wondering whether > there are other things that should have been represented by RTE for > the same reason. For example, a HAVING clause changes the > characteristics of results by introducing new constraints on the > aggregated results. Should that have an RTE by itself? Will the > RTEKind introduced by this patch cover HAVING clause as well? AFAIU, HAVING clauses are just quals applied to the grouped rows after groups and aggregates are computed. I cannot see why and how to add a new RTE for HAVING. > ``` > explain select sum(a), sum(b), stddev(a + b) from (values (1, 1), (2, > 2)) as t(a, b) group by a, b having sum(a) = sum(b) order by 1, 2; > QUERY PLAN > ------------------------------------------------------------------------- > Sort (cost=0.10..0.10 rows=1 width=56) > Sort Key: (sum("*VALUES*".column1)), (sum("*VALUES*".column2)) > -> HashAggregate (cost=0.06..0.09 rows=1 width=56) > Group Key: "*VALUES*".column1, "*VALUES*".column2 > Filter: (sum("*VALUES*".column1) = sum("*VALUES*".column2)) > -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8) > (6 rows) > ``` > Sort Key can be just (sum("*VALUES*".column1)) instead of both > (sum("*VALUES*".column1)), (sum("*VALUES*".column2)) because of HAVING > clause? This looks like an optimization that can be achieved by hacking around ECs. I'm not sure. But I think adding new RTEs does not help here. > Some code level random comments > 1. > ``` > if (rte->rtekind == RTE_GROUP) > { > es->rtable_size--; > break; > ``` > because of the variable name, it would be interpreted as the size of > es->rtable and will be expected to be the same as > list_length(es->rtable) which it is not. The comment at the member > declaration won't help much for a quick reader. All that variable is > doing is to tell us whether to use alias as prefix or not; > `useprefix = es->rtable_size > 1;` OR useprefix = (es->rtable_size > 1 > || es->verbose);. > Instead of rtable_size, we could let the new member track the fact > whether there are multiple aliases in the query (requiring multiple > prefixes) instead of size of rtable. However, the fact that the GROUP > RTE requires special handling indicates that the new RTEKind doesn't > quite fit the rest of the set. No other RTE, even if outside FROM > clause, required this special handling. AFAIU we want to print prefixes on Vars when there are more than one RTE entries to indicate which column is from which RTE entry. If there is only one RTE (and not verbose), we try to avoid the prefixes. This patch adds a new dummy RTE, resulting in plans that previously had one RTE now having two and starting to print prefixes. This has caused a lot of plan diffs in regression tests. That's why this patch has to hack explain.c and ruleutils.c to make the varprefix stuff work as it did before. But I do not think this is alone for the new RTE. Consider explain (costs off) select sum(a) from (select * from t) having sum(a) = 1; QUERY PLAN -------------------------- Aggregate Filter: (sum(t.a) = 1) -> Seq Scan on t (3 rows) BTW, not related to the discussion here, I noticed an inconsistency regarding the varprefix in the qual and targetlist. Look at: explain (verbose, costs off) select sum(a) from t having sum(a) = 1; QUERY PLAN ---------------------------- Aggregate Output: sum(a) Filter: (sum(t.a) = 1) -> Seq Scan on public.t Output: a (5 rows) In the 'Filter' we add the prefix while in the 'Output' we do not. Does anyone think this is something worth investigating? > 2. expandRecordVariable: The comment below the change should be > updated to explain why an output of GROUPing can not have RECORD or at > least mention GROUPING there. Thanks. Will add some comments here. > 3. I see code like below in get_eclass_for_sort_expr() and > mark_rels_nulled_by_join() > ``` > /* ignore GROUP RTE */ > if (i == root->group_rtindex) > continue; > ``` > I assume that rel for this RTE index would be NULL, so "if" block just > below this code would get executed. I think we should just change > Assert() in that code block rather than adding a new "if" block to > avoid confusion. Actually I initially coded it as you suggested, and then moved the check for the RTE_GROUP RTE out of the 'if' block later, in order to maintain separate logic for GROUP RTE and outer joins. I'm not quite sure which is better. > 4. Looking at parse_clause.c most (if not all) addRangeTableEntry* > function calls are from transform* functions. On those lines, I > expected addRangeTableEntryForGroup() to be called from > transformGroupClause(). Why are we calling > addRangeTableEntryForGroup() from parseCheckAggregates()? I think this is the most handy place to add the RTE_GROUP RTE, as the join_flattened grouping expressions are available here. > 5. In the parseCheckAggregates, we are replacing expressions from > targetlist and havingQual with Vars pointing to GROUP RTE. But we are > not doing that to sortClause, the remaining SQL construct. That's > because sortClause is just a list of entries pointing back to > targetlist. So there's nothing to change there. Am I right? Well, it's not about that. Actually groupClause is also 'a list of entries pointing back to targetlist'. The primary reason is that the grouping step may result in some grouping expressions being set to NULL, whereas the sorting step does not have this behavior. > 6. I think ParseState::p_grouping_nsitem should be collocated with > other ParseNamespaceItem members or lists in ParseState. I think it > serves a similar purpose as them. Similarly PlannerInfo::group_rtindex > should be placed next to outer_join_rels? I agree that ParseState.p_grouping_nsitem should be moved to a more proper place, and we should mention it in the comment for ParseState too. But I'm not sure about the root.group_rtindex. I will give it another thought later. > 7. Do we need RangeTblEntry::groupexprs as a separate member? They are > the same as GROUP BY or GROUPING SET expressions. So the list can be > constructed from groupClause whenever required. Do we need to maintain > the list separately? I am comparing with other RTEs, say Subquery RTE. > We don't copy all the targetlist expressions from subquery to > subquery's RTE. I noticed that groupexprs are being treated on lines > similar to joinaliasvars. But they are somewhat different. The latter > is a unified representation of columns of joining relations different > from those columns and hence needs a new representation. That doesn't > seem to be the case with RangeTblEntry::groupexpr. We need to preprocess the grouping expressions first and then substitute them back into the targetlist and havingQual. I don't think this can be achieved without keeping groupexprs as a separate member. > 8. The change in process_sublinks_mutator() appears to be related to > the fact that GROUPING() may have subqueries which were not being > handled earlier. That change seems to be independent of the bug being > fixed here. Am I right? If yes, having those changes in a separate > patch will help. No, I don't think so. Without this patch we should never see a SubPlan/AlternativeSubPlan expression in process_sublinks_mutator, because this is where SubPlans are created. Thanks Richard
On Fri, Jul 5, 2024 at 5:51 AM Andres Freund <andres@anarazel.de> wrote: > On 2024-07-01 16:29:16 +0800, Richard Guo wrote: > > Here is an updated version of this patchset. I've run pgindent for it, > > and also tweaked the commit messages a bit. > > > > In principle, 0001 can be backpatched to all supported versions to fix > > the cases where there are subqueries in the grouping expressions; 0002 > > can be backpatched to 16 where we have the nullingrels stuff. But both > > patches seem to be quite invasive. I'm not sure if we want to backpatch > > them to stable branches. Any thoughts about backpatching? > > As-is they can't be backpatched, unless I am missing something? Afaict they > introduce rather thorough ABI breaks? And API breaks, actually? Indeed, you're correct. I did not think about this. This patchset modifies certain struct definitions in src/include/ and also changes the signature of several functions, resulting in definite ABI and API breaks. BTW, from catversion.h I read: * Another common reason for a catversion update is a change in parsetree * external representation, since serialized parsetrees appear in stored * rules and new-style SQL functions. Almost any change in primnodes.h or * parsenodes.h will warrant a catversion update. Since this patchset changes the querytree produced by the parser, does this indicate that a catversion bump is needed? Thanks Richard
Richard Guo <guofenglinux@gmail.com> writes: > BTW, from catversion.h I read: > * Another common reason for a catversion update is a change in parsetree > * external representation, since serialized parsetrees appear in stored > * rules and new-style SQL functions. Almost any change in primnodes.h or > * parsenodes.h will warrant a catversion update. > Since this patchset changes the querytree produced by the parser, does > this indicate that a catversion bump is needed? Yes, it would. regards, tom lane
On Sat, Jul 6, 2024 at 10:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Richard Guo <guofenglinux@gmail.com> writes: > > BTW, from catversion.h I read: > > > * Another common reason for a catversion update is a change in parsetree > > * external representation, since serialized parsetrees appear in stored > > * rules and new-style SQL functions. Almost any change in primnodes.h or > > * parsenodes.h will warrant a catversion update. > > > Since this patchset changes the querytree produced by the parser, does > > this indicate that a catversion bump is needed? > > Yes, it would. Thank you for confirming. Thanks Richard
On Sat, Jul 6, 2024 at 9:26 AM Richard Guo <guofenglinux@gmail.com> wrote: > On Thu, Jul 4, 2024 at 6:02 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > I don't have any specific thoughts on backpatching, but I have started > > reviewing the patches. > Thanks for reviewing this patchset! Here is an updated version of this patchset. I've added some comments according to the review feedback, and also tweaked the commit messages a bit more. Additionally, I've made a change to only add the new RTE_GROUP RTE when there are acceptable GROUP BY expressions. This allows us to skip all the trouble of doing this for queries without GROUP BY clauses. Thanks Richard
Attachment
FWIW, in addition to fixing wrong result issues for queries with grouping sets, the changes in 0001 also improve performance for queries that have subqueries in the grouping expressions, because different instances of the same subquery would need to be executed only once. As a simple example, consider create table t (a int, b int); insert into t select i, i from generate_series(1,10000)i; analyze t; -- on patched explain (analyze, costs off) select (select t1.b from t t2 where a = t1.a) as s1, (select t1.b from t t2 where a = t1.a) as s2, (select t1.b from t t2 where a = t1.a) as s3 from t t1 group by a, s1; QUERY PLAN ------------------------------------------------------------------------------------ Group (actual time=20475.028..20480.543 rows=10000 loops=1) Group Key: t1.a, ((SubPlan 1)) -> Sort (actual time=20475.017..20475.821 rows=10000 loops=1) Sort Key: t1.a, ((SubPlan 1)) Sort Method: quicksort Memory: 697kB -> Seq Scan on t t1 (actual time=7.435..20468.599 rows=10000 loops=1) SubPlan 1 -> Seq Scan on t t2 (actual time=1.022..2.045 rows=1 loops=10000) Filter: (a = t1.a) Rows Removed by Filter: 9999 Planning Time: 1.561 ms Execution Time: 20481.933 ms (12 rows) -- on master explain (analyze, costs off) select (select t1.b from t t2 where a = t1.a) as s1, (select t1.b from t t2 where a = t1.a) as s2, (select t1.b from t t2 where a = t1.a) as s3 from t t1 group by a, s1; QUERY PLAN ------------------------------------------------------------------------------------ Group (actual time=20779.318..62233.526 rows=10000 loops=1) Group Key: t1.a, ((SubPlan 1)) -> Sort (actual time=20775.125..20777.936 rows=10000 loops=1) Sort Key: t1.a, ((SubPlan 1)) Sort Method: quicksort Memory: 697kB -> Seq Scan on t t1 (actual time=7.492..20770.060 rows=10000 loops=1) SubPlan 1 -> Seq Scan on t t2 (actual time=1.037..2.075 rows=1 loops=10000) Filter: (a = t1.a) Rows Removed by Filter: 9999 SubPlan 2 -> Seq Scan on t t2_1 (actual time=1.037..2.071 rows=1 loops=10000) Filter: (a = t1.a) Rows Removed by Filter: 9999 SubPlan 3 -> Seq Scan on t t2_2 (actual time=1.037..2.071 rows=1 loops=10000) Filter: (a = t1.a) Rows Removed by Filter: 9999 Planning Time: 1.286 ms Execution Time: 62235.753 ms (20 rows) We can see that with the 0001 patch, this query runs ~3 times faster, which is no surprise because there are 3 instances of the same subquery in the targetlist. Thanks Richard
Hi, I'm reviewing patches in Commitfest 2024-07 from top to bottom: https://commitfest.postgresql.org/48/ This is the 3rd patch: https://commitfest.postgresql.org/48/4583/ FYI: https://commitfest.postgresql.org/48/4681/ is my patch. In <CAMbWs49RNmFhgDzoL=suWJrCSk-wizXa6uVtp0Jmz0z+741nSA@mail.gmail.com> "Re: Wrong results with grouping sets" on Wed, 10 Jul 2024 09:22:54 +0800, Richard Guo <guofenglinux@gmail.com> wrote: > Here is an updated version of this patchset. I've added some comments > according to the review feedback, and also tweaked the commit messages > a bit more. I'm not familiar with related codes but here are my comments: 0001: --- diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 85a62b538e5..8055f4b2b9e 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1242,6 +1245,12 @@ typedef struct RangeTblEntry /* estimated or actual from caller */ Cardinality enrtuples pg_node_attr(query_jumble_ignore); + /* + * Fields valid for a GROUP RTE (else NULL/zero): + */ + /* list of expressions grouped on */ + List *groupexprs pg_node_attr(query_jumble_ignore); + /* * Fields valid in all RTEs: */ ---- + * Fields valid for a GROUP RTE (else NULL/zero): There is only one field and it's LIST. So how about using the following? * A field valid for a GROUP RTE (else NIL): ---- diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index 844fc30978b..0982f873a42 100644 --- a/src/backend/optimizer/util/var.c +++ b/src/backend/optimizer/util/var.c @@ -902,6 +915,141 @@ flatten_join_alias_vars_mutator(Node *node, ... +Node * +flatten_group_exprs(PlannerInfo *root, Query *query, Node *node) +{ + flatten_join_alias_vars_context context; ... --- If we want to reuse flatten_join_alias_vars_context for flatten_group_exprs(), how about renaming it? flatten_join_alias_vars() only uses flatten_join_alias_vars_context for now. So the name of flatten_join_alias_vars_context is meaningful. But if we want to flatten_join_alias_vars_context for flatten_group_exprs() too. The name of flatten_join_alias_vars_context is strange. ---- diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index 2f64eaf0e37..69476384252 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -2557,6 +2557,79 @@ addRangeTableEntryForENR(ParseState *pstate, ... + char *colname = te->resname ? pstrdup(te->resname) : "unamed_col"; ... ---- Can the "te->resname == NULL" case be happen? If so, how about adding a new test for the case? (BTW, is "unamed_col" intentional name? Is it a typo of "unnamed_col"?) Thanks, -- kou
On Mon, Jul 15, 2024 at 4:38 PM Sutou Kouhei <kou@clear-code.com> wrote: > I'm not familiar with related codes but here are my > comments: Thanks for reviewing this patchset! > + * Fields valid for a GROUP RTE (else NULL/zero): > > There is only one field and it's LIST. So how about using > the following? > > * A field valid for a GROUP RTE (else NIL): Good point. I ended up with * Fields valid for a GROUP RTE (else NIL): ... since this is the pattern used by other types of RTEs that have only one field. > If we want to reuse flatten_join_alias_vars_context for > flatten_group_exprs(), how about renaming it? > flatten_join_alias_vars() only uses > flatten_join_alias_vars_context for now. So the name of > flatten_join_alias_vars_context is meaningful. But if we > want to flatten_join_alias_vars_context for > flatten_group_exprs() too. The name of > flatten_join_alias_vars_context is strange. I think the current name should be fine. It's not uncommon that we reuse the same structure intended for other functions within one function. > Can the "te->resname == NULL" case be happen? If so, how > about adding a new test for the case? It's quite common for te->resname to be NULL, such as when TargetEntry is resjunk. I don't think a new case for this is needed. It should already be covered in lots of instances in the current regression tests. > (BTW, is "unamed_col" intentional name? Is it a typo of > "unnamed_col"?) Yeah, it's a typo. I changed it to be "?column?", which is the default name if FigureColname can't guess anything. Here is an updated version of this patchset. I'm seeking the possibility to push this patchset sometime this month. Please let me know if anyone thinks this is unreasonable. Thanks Richard
Attachment
On Mon, Jul 15, 2024 at 8:15 AM Richard Guo <guofenglinux@gmail.com> wrote: > > We can see that with the 0001 patch, this query runs ~3 times faster, > which is no surprise because there are 3 instances of the same > subquery in the targetlist. I am not sure if that's the right thing to do. I am using a slightly elaborate version of the tests in your patch #select v, grouping(v) gv, grouping((select t1.v from gstest5 t2 where id = t1.id)) gs,grouping((select t1.v from gstest5 t2 where id = t1.id)) gs2, (select t1.v from gstest5 t2 where id = t1.id) as s, case when grouping(v) = 0 then v else null end as cv, case when grouping((select t1.v from gstest5 t2 where id = t1.id)) = 0 then (select t1.v from gstest5 t2 where id = t1.id) else null end as cs from gstest5 t1 group by grouping sets(v, s) ; v | gv | gs | gs2 | s | cv | cs ---+----+----+-----+---+----+---- 3 | 0 | 1 | 1 | | 3 | 5 | 0 | 1 | 1 | | 5 | 4 | 0 | 1 | 1 | | 4 | 2 | 0 | 1 | 1 | | 2 | 1 | 0 | 1 | 1 | | 1 | | 1 | 0 | 0 | 2 | | 2 | 1 | 0 | 0 | 5 | | 5 | 1 | 0 | 0 | 4 | | 4 | 1 | 0 | 0 | 3 | | 3 | 1 | 0 | 0 | 1 | | 1 (10 rows) #explain verbose select v, grouping(v) gv, grouping((select t1.v from gstest5 t2 where id = t1.id)) gs,grouping((select t1.v from gstest5 t2 w here id = t1.id)) gs2, (select t1.v from gstest5 t2 where id = t1.id) as s, case when grouping(v) = 0 then v else null end as cv, case when grouping((select t1.v from gstest5 t2 where id = t1.id)) = 0 then (select t1.v from gstest5 t2 where id = t1.id) else null end as cs from gstest5 t1 group by grouping sets(v, s) ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------- HashAggregate (cost=18508.10..58790.10 rows=2460 width=28) Output: t1.v, GROUPING(t1.v), GROUPING((SubPlan 2)), GROUPING((SubPlan 3)), ((SubPlan 1)), CASE WHEN (GROUPING(t1.v) = 0) THEN t1.v ELSE NULL::integer END , CASE WHEN (GROUPING((SubPlan 4)) = 0) THEN ((SubPlan 1)) ELSE NULL::integer END Hash Key: t1.v Hash Key: (SubPlan 1) -> Seq Scan on pg_temp.gstest5 t1 (cost=0.00..18502.45 rows=2260 width=12) Output: t1.v, (SubPlan 1), t1.id SubPlan 1 -> Index Only Scan using gstest5_pkey on pg_temp.gstest5 t2 (cost=0.15..8.17 rows=1 width=4) Output: t1.v Index Cond: (t2.id = t1.id) The result looks as expected but the plan isn't consistent with what happens without grouping set #select v, (select t1.v from gstest5 t2 where id = t1.id) as s, (select t1.v from gstest5 t2 where id = t1.id) as s2, case when t1.v < 3 then (select t1.v from gstest5 t2 where id = t1.id) else null end as cs from gstest5 t1 order by case when t1.v < 3 then (select t1.v from gstest5 t2 where id = t1.id) else null end ; v | s | s2 | cs ---+---+----+---- 1 | 1 | 1 | 1 2 | 2 | 2 | 2 3 | 3 | 3 | 4 | 4 | 4 | 5 | 5 | 5 | (5 rows) postgres@92841=#explain verbose select v, (select t1.v from gstest5 t2 where id = t1.id) as s, (select t1.v from gstest5 t2 where id = t1.id) as s2, case when t1.v < 3 then (select t1.v from gstest5 t2 where id = t1.id) else null end as cs from gstest5 t1 order by case when t1.v < 3 then (select t1.v from gstest5 t2 where id = t1.id) else null end ; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Sort (cost=55573.71..55579.36 rows=2260 width=16) Output: t1.v, ((SubPlan 1)), ((SubPlan 2)), (CASE WHEN (t1.v < 3) THEN (SubPlan 3) ELSE NULL::integer END) Sort Key: (CASE WHEN (t1.v < 3) THEN (SubPlan 3) ELSE NULL::integer END) -> Seq Scan on pg_temp.gstest5 t1 (cost=0.00..55447.80 rows=2260 width=16) Output: t1.v, (SubPlan 1), (SubPlan 2), CASE WHEN (t1.v < 3) THEN (SubPlan 3) ELSE NULL::integer END SubPlan 1 -> Index Only Scan using gstest5_pkey on pg_temp.gstest5 t2 (cost=0.15..8.17 rows=1 width=4) Output: t1.v Index Cond: (t2.id = t1.id) SubPlan 2 -> Index Only Scan using gstest5_pkey on pg_temp.gstest5 t2_1 (cost=0.15..8.17 rows=1 width=4) Output: t1.v Index Cond: (t2_1.id = t1.id) SubPlan 3 -> Index Only Scan using gstest5_pkey on pg_temp.gstest5 t2_2 (cost=0.15..8.17 rows=1 width=4) Output: t1.v Index Cond: (t2_2.id = t1.id) (17 rows) Notice that every instance of that subquery has its own subplan in this case. Why should the grouping set be different and have the same subplan for two instances of the subquery? And if so, why not all of the instances have the same subplan? Since a subquery is a volatile expression, each of its instances should be evaluated separately. If the expressions in ORDER BY, GROUPING and GROUP BY are the same as an expression in the targetlist, subqueries in those expressions won't need a subplan of their own. If they are not part of targetlist, they will be added to the targetlist as resjunk columns and thus form separate instances of subquery thus adding more subplans. -- Best Wishes, Ashutosh Bapat
x | col0 | col1
---+---------------------+--------------------
0 | 0.07205921113992653 | 0.9847359546402477
(1 row)
x | col0 | col1
---+--------------------+--------------------
0 | 0.7765600922298943 | 0.7765600922298943
(1 row)
x | col0 | col1
---+---------------------+--------------------
0 | 0.07334303548896548 | 0.6528967617521189
(1 row)
QUERY PLAN
-----------------------------------------------------
Group
Output: t.x, (InitPlan 1).col1, (InitPlan 1).col1
Group Key: t.x
InitPlan 1
-> Result
Output: 1
-> Sort
Output: t.x
Sort Key: t.x
-> Seq Scan on public.t
Output: t.x
(11 rows)
QUERY PLAN
-----------------------------------------------------
Group
Output: t.x, (InitPlan 1).col1, (InitPlan 2).col1
Group Key: t.x
InitPlan 1
-> Result
Output: 1
InitPlan 2
-> Result
Output: 1
-> Sort
Output: t.x
Sort Key: t.x
-> Seq Scan on public.t
Output: t.x
(14 rows)
On Wed, Jul 17, 2024 at 8:50 AM Paul George <p.a.george19@gmail.com> wrote: > > Since a subquery is a volatile expression, each of its instances > should be evaluated separately. I don't think this conclusion is correct. Look at: select random(), random() from t group by random(); random | random --------------------+-------------------- 0.7972330769936766 | 0.7972330769936766 (1 row) > This seems like a valid point, though "query 2" below which groups over a RANDOM() column and outputs an additional RANDOM()column a potential, albeit contrived, counter-example? [NOTE: this was done on Postgres 16.3] I've included a fewdifferent combinations of GROUP BYs. Interesting. I looked into the scenarios with multiple instances of the same volatile grouping expressions and here is what I observed. create table t (a int, b int); insert into t select 1,1; -- on master, with plain volatile functions select random() as c1, random() as c2, random() as c3 from t t1 group by c1; c1 | c2 | c3 -------------------+-------------------+------------------- 0.567478050404431 | 0.567478050404431 | 0.567478050404431 (1 row) So the random() function is evaluated only once, even though it appears three times. -- on master, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.8420177313766823 | 0.2969648209746336 | 0.3499675329093421 (1 row) So on master the subquery is evaluated three times. Why isn't this consistent with the behavior of the first query? -- on patched, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.5203586066423254 | 0.5203586066423254 | 0.5203586066423254 (1 row) So on patched the subquery is evaluated only once, which is consistent with the behavior of the first query. Does this suggest that the patched version is more 'correct' for this case? Now let's look at the scenario with two grouping keys. -- on master, with plain volatile functions select random() as c1, random() as c2, random() as c3 from t t1 group by c1, c2; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.9388558105069595 | 0.2900389441597979 | 0.9388558105069595 (1 row) So the first two random() functions are evaluated independently, and the third random() function references the result of the first one. -- on master, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1, c2; c1 | c2 | c3 ---------------------+--------------------+-------------------- 0.46275163300894073 | 0.5083760995112951 | 0.6752682696191123 (1 row) So on master the subquery is evaluated three times. -- on patched, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1, c2; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.9887848690744176 | 0.9887848690744176 | 0.9887848690744176 (1 row) So on patched the subquery is evaluated only once. It seems that in this scenario, neither the master nor the patched version handles volatile subqueries in grouping expressions the same way as it handles plain volatile functions. I am confused. Does the SQL standard explicitly define or standardize the behavior of grouping by volatile expressions? Does anyone know about that? Thanks Richard
On Thu, Jul 18, 2024 at 8:31 AM Richard Guo <guofenglinux@gmail.com> wrote: > I am confused. Does the SQL standard explicitly define or standardize > the behavior of grouping by volatile expressions? Does anyone know > about that? Just for the record, multiple instances of non-volatile grouping expressions should always be evaluated only once. As an example, consider: create function f_stable_add(a integer, b integer) returns integer as $$ begin return a+b; end; $$ language plpgsql stable; explain (verbose, costs off) select f_stable_add(a, b) as c1, f_stable_add(a, b) as c2, f_stable_add(a, b) as c3 from t t1 group by c1, c2; QUERY PLAN ---------------------------------------------------------------------------- HashAggregate Output: (f_stable_add(a, b)), (f_stable_add(a, b)), (f_stable_add(a, b)) Group Key: f_stable_add(t1.a, t1.b) -> Seq Scan on public.t t1 Output: f_stable_add(a, b), a, b (5 rows) In this regard, the patched version is correct on handling subqueries in grouping expressions, whereas the master version is incorrect. Thanks Richard
On Thu, Jul 4, 2024 at 5:52 PM Andres Freund <andres@anarazel.de> wrote: > As-is they can't be backpatched, unless I am missing something? Afaict they > introduce rather thorough ABI breaks? And API breaks, actually? Aside from that, this looks quite invasive for back-patching, and the number of bug reports so far suggest that we should be worried about more breakage appearing later. However, that leaves us in a situation where we have no back-patchable fix for a bug which causes queries to return the wrong answer, which is not a great situation. Is there a smaller fix that we could commit to fix the bug? -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jul 4, 2024 at 5:52 PM Andres Freund <andres@anarazel.de> wrote: >> As-is they can't be backpatched, unless I am missing something? Afaict they >> introduce rather thorough ABI breaks? And API breaks, actually? > Aside from that, this looks quite invasive for back-patching, and the > number of bug reports so far suggest that we should be worried about > more breakage appearing later. Yeah, 0 chance of back-patching this. If we had more confidence in it maybe we could see our way to putting it in v17, but I fear that would be tempting the software gods. It needs to get through a full beta test cycle. > However, that leaves us in a situation where we have no back-patchable > fix for a bug which causes queries to return the wrong answer, which > is not a great situation. It's not; but this has been wrong since grouping sets were put in, yet the number of field reports so far can probably still be counted without running out of fingers. I'm content if we can fix it going forward, and would not expend a lot of effort on a probably-futile search for a fix that doesn't involve a query data structure change. (I'm aware that I ought to review this patch, and will try to make time for that before the end of the CF.) regards, tom lane
I've been looking at cases where there are grouping-set keys that reduce to Consts, and I noticed a plan with v11 patch that is not very great. explain (verbose, costs off) select 1 as one group by rollup(one) order by one nulls first; QUERY PLAN ------------------------------- Sort Output: (1) Sort Key: (1) NULLS FIRST -> GroupAggregate Output: (1) Group Key: (1) Group Key: () -> Sort Output: (1) Sort Key: (1) -> Result Output: 1 (12 rows) The Sort operation below the Agg node is unnecessary because the grouping key is actually a Const. This plan results from wrapping the Const in a PlaceHolderVar to carry the nullingrel bit of the RTE_GROUP RT index, as it can be nulled by the grouping step. Although we remove this nullingrel bit when generating the groupClause pathkeys since we know the groupClause is logically below the grouping step, we do not unwrap the PlaceHolderVar. This suggests that we might need a mechanism to unwrap PHVs when safe. 0003 includes a flag in PlaceHolderVar to indicate whether it is safe to remove the PHV and use its contained expression instead when its phnullingrels becomes empty. Currently it is set true only in cases where the PHV is used to carry the nullingrel bit of the RTE_GROUP RT index. With 0003 the plan above becomes more reasonable: explain (verbose, costs off) select 1 as one group by rollup(one) order by one nulls first; QUERY PLAN ----------------------------- Sort Output: (1) Sort Key: (1) NULLS FIRST -> GroupAggregate Output: (1) Group Key: 1 Group Key: () -> Result Output: 1 (9 rows) This could potentially open up opportunities for optimization by unwrapping PHVs in other cases. As an example, consider explain (costs off) select * from t t1 left join lateral (select t1.a as x, * from t t2) s on true where t1.a = s.a; QUERY PLAN ---------------------------- Nested Loop -> Seq Scan on t t1 -> Seq Scan on t t2 Filter: (t1.a = a) (4 rows) The target entry s.x is wrapped in a PHV that contains lateral reference to t1, which forces us to resort to nestloop join. However, since the left join has been reduced to an inner join, we should be able to remove this PHV and use merge or hash joins instead. I did not implement this optimization in 0003. It seems that it should be addressed in a separate patch. Thanks Richard
Attachment
On Wed, Jun 5, 2024 at 5:42 PM Richard Guo <guofenglinux@gmail.com> wrote: > I found a bug in the v6 patch. The following query would trigger the > Assert in make_restrictinfo that the given subexpression should not be > an AND clause. > > select max(a) from t group by a > b and a = b having a > b and a = b; > > This is because the expression 'a > b and a = b' in the HAVING clause is > replaced by a Var that references the GROUP RTE. When we preprocess the > columns of the GROUP RTE, we do not know whether the grouped expression > is a havingQual or not, so we do not perform make_ands_implicit for it. > As a result, after we replace the group Var in the HAVING clause with > the underlying grouping expression, we will have a havingQual that is an > AND clause. > > As we know, in the planner we need to first preprocess all the columns > of the GROUP RTE. We also need to replace any Vars in the targetlist > and HAVING clause that reference the GROUP RTE with the underlying > grouping expressions. To fix the mentioned issue, I choose the perform > this replacement before we preprocess the targetlist and havingQual, so > that the make_ands_implicit would be performed when we preprocess the > havingQual. I've realized that there is something wrong with this conclusion. If we perform the replacement of GROUP Vars with the underlying grouping expressions before we've done with expression preprocessing on targetlist and havingQual, we may end up with failing to match the expressions that are part of grouping items to lower target items. Consider: create table t (a int, b int); insert into t values (1, 2); select a < b and b < 3 from t group by rollup(a < b and b < 3) having a < b and b < 3; The expression preprocessing process would convert the HAVING clause to implicit-AND format and thus it would fail to be matched to lower target items. Another example is: create table t1 (a boolean); insert into t1 values (true); select not a from t1 group by rollup(not a) having not not a; This HAVING clause 'not not a' would be reduced to 'a' and thus fail to be matched to lower tlist. I fixed this issue in v13 by performing the replacement of GROUP Vars after we've done with expression preprocessing on targetlist and havingQual. An ensuing effect of this approach is that a HAVING clause may contain expressions that are not fully preprocessed if they are part of grouping items. This is not an issue as long as the clause remains in HAVING. But if the clause is moved or copied into WHERE, we need to re-preprocess these expressions. Please see the attached for the changes. Thanks Richard
Attachment
On Tue, Aug 6, 2024 at 4:17 PM Richard Guo <guofenglinux@gmail.com> wrote: > I fixed this issue in v13 by performing the replacement of GROUP Vars > after we've done with expression preprocessing on targetlist and > havingQual. An ensuing effect of this approach is that a HAVING > clause may contain expressions that are not fully preprocessed if they > are part of grouping items. This is not an issue as long as the > clause remains in HAVING. But if the clause is moved or copied into > WHERE, we need to re-preprocess these expressions. Please see the > attached for the changes. I'm seeking the possibility to push 0001 and 0002 sometime this month. Please let me know if anyone thinks this is unreasonable. For 0003, it might be extended to remove all no-op PHVs except those that are serving to isolate subexpressions, not only the PHVs used to carry the nullingrel bit that represents the grouping step. There is a separate thread for it [1]. [1] https://postgr.es/m/CAMbWs48biJp-vof82PNP_LzzFkURh0W+RKt4phoML-MyYavgdg@mail.gmail.com Thanks Richard
On Wed, Sep 4, 2024 at 9:16 AM Richard Guo <guofenglinux@gmail.com> wrote: > I'm seeking the possibility to push 0001 and 0002 sometime this month. > Please let me know if anyone thinks this is unreasonable. > > For 0003, it might be extended to remove all no-op PHVs except those > that are serving to isolate subexpressions, not only the PHVs used to > carry the nullingrel bit that represents the grouping step. There is > a separate thread for it [1]. I went ahead and pushed 0001 and 0002, and am now waiting for the upcoming bug reports. Thanks for all the discussions and reviews. Thanks Richard
On Tue, 10 Sept 2024 at 16:04, Richard Guo <guofenglinux@gmail.com> wrote: > I went ahead and pushed 0001 and 0002, and am now waiting for the > upcoming bug reports. Here's one: create table a(a int); explain select * from a where exists(Select 1 from a a2 where a.a = a2.a group by a); CREATE TABLE server closed the connection unexpectedly TRAP: failed Assert("parse->hasGroupRTE"), File: "../src/backend/optimizer/plan/planner.c", Line: 794, PID: 107765 David
On Thu, Oct 10, 2024 at 2:39 PM David Rowley <dgrowleyml@gmail.com> wrote: > create table a(a int); > explain select * from a where exists(Select 1 from a a2 where a.a = > a2.a group by a); > CREATE TABLE > server closed the connection unexpectedly > > TRAP: failed Assert("parse->hasGroupRTE"), File: > "../src/backend/optimizer/plan/planner.c", Line: 794, PID: 107765 Thank you for the report! The subquery initially has a valid groupClause, so the parser adds an RTE_GROUP for it and marks its hasGroupRTE as true. When we pull the subquery up to the parent level, the RTE_GROUP entry is attached to the parent. However, the parent query is not marked as hasGroupRTE because it does not contain any GROUP clauses. So we hit the Assert. While we can fix this issue by propagating the hasGroupRTE mark from the EXISTS subquery to the parent, a better fix might be to remove the subquery's RTE_GROUP entry, since we have dropped the subquery's groupClause before the pull-up (see simplify_EXISTS_query). Thanks Richard
On Thu, Oct 10, 2024 at 6:51 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Thu, Oct 10, 2024 at 4:06 PM Richard Guo <guofenglinux@gmail.com> wrote: > > While we can fix this issue by propagating the hasGroupRTE mark from > > the EXISTS subquery to the parent, a better fix might be to remove the > > subquery's RTE_GROUP entry, since we have dropped the subquery's > > groupClause before the pull-up (see simplify_EXISTS_query). > > Here is the patch. I've pushed this patch with minor tweaks. Thanks again for the report! Thanks Richard