Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master |
Date | |
Msg-id | CAMbWs4-qdB1q__hX=vUQKuJH5GOfXPmJ-RWNsd24MQAnRwDWuw@mail.gmail.com Whole thread Raw |
In response to | Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > FWIW, my confidence in it is rather low. I've not had time to think > this through carefully, but it seems to me that the test ought to > involve whether there is an empty grouping set, yet the proposed > patch does no such thing --- or at least, if it manages to achieve > that effect, it's not obvious how. I explained the test for empty grouping sets in my initial email as well as in the commit message of the proposed patch. Apparently, I failed to make myself clear, so here's another attempt. To be clear, we are only discussing HAVING clauses that do not contain aggregates, since clauses with aggregates wouldn't be pushed down anyway. First, let's consider the presence of empty grouping sets. There are two cases: either the query contains only empty grouping sets, or it contains both non-empty and empty grouping sets. In the former case (only empty grouping sets), the HAVING clauses must be degenerate (variable-free). These are placed in the WHERE clause and also retained in HAVING. This follows the same logic as prior to commit 67a54b9e8. In the latter case (both non-empty and empty grouping sets), for non-degenerate HAVING clauses, the empty grouping sets would nullify the vars referenced in the clauses, so they would not be pushed down. This is ensured by the check bms_is_member(root->group_rtindex, having_relids). For degenerate HAVING clauses, they are also not pushed down, guaranteed by the new check added in the proposed patch: having_relids == NULL. Next, suppose there are no empty grouping sets in the query. For non-degenerate HAVING clauses, only those clauses that do not reference columns nullable by the grouping sets are pushed down, which is guaranteed by the same check: bms_is_member(root->group_rtindex, having_relids). For degenerate HAVING clauses, the check having_relids == NULL prevents pushing them down, though it could be argued that in this case, they could be pushed down. This is what I meant by (quoting from the commit message): " This could be done more precisely, for example by restricting the prevention only to cases where empty grouping sets are also present, but the added complexity does not seem justified. " In summary, under the current logic, HAVING clauses in queries with grouping sets are pushed down only in the following two cases: 1) There are only empty grouping sets. In this case, the HAVING clauses are pushed down, but they are also retained in HAVING. 2) There are non-empty grouping sets and the HAVING clauses are non-degenerate and they do not reference any columns nullable by the grouping sets. I believe both of the above cases are safe for pushing down HAVING clauses. It could be argued that there may be more cases where we can push down HAVING clauses with grouping sets, but for now, I'd prefer to focusing on ensuring the current logic is safe. Does this analysis make sense to you? Am I missing anything? > 18.1 will not be coming out till November, so I feel no need to > rush to judgment on what to do here. Thanks. I'll wait for any feedback on the patch itself before deciding how to proceed. I'd appreciate it if a review could be done before November. - Richard
pgsql-hackers by date: