Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master
Date
Msg-id CAMbWs4-qdB1q__hX=vUQKuJH5GOfXPmJ-RWNsd24MQAnRwDWuw@mail.gmail.com
Whole thread Raw
In response to Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FWIW, my confidence in it is rather low.  I've not had time to think
> this through carefully, but it seems to me that the test ought to
> involve whether there is an empty grouping set, yet the proposed
> patch does no such thing --- or at least, if it manages to achieve
> that effect, it's not obvious how.

I explained the test for empty grouping sets in my initial email as
well as in the commit message of the proposed patch.  Apparently, I
failed to make myself clear, so here's another attempt.

To be clear, we are only discussing HAVING clauses that do not contain
aggregates, since clauses with aggregates wouldn't be pushed down
anyway.

First, let's consider the presence of empty grouping sets.  There are
two cases: either the query contains only empty grouping sets, or it
contains both non-empty and empty grouping sets.

In the former case (only empty grouping sets), the HAVING clauses must
be degenerate (variable-free).  These are placed in the WHERE clause
and also retained in HAVING.  This follows the same logic as prior to
commit 67a54b9e8.

In the latter case (both non-empty and empty grouping sets), for
non-degenerate HAVING clauses, the empty grouping sets would nullify
the vars referenced in the clauses, so they would not be pushed
down.  This is ensured by the check bms_is_member(root->group_rtindex,
having_relids).  For degenerate HAVING clauses, they are also
not pushed down, guaranteed by the new check added in the proposed
patch: having_relids == NULL.

Next, suppose there are no empty grouping sets in the query.  For
non-degenerate HAVING clauses, only those clauses that do not
reference columns nullable by the grouping sets are pushed down, which
is guaranteed by the same check: bms_is_member(root->group_rtindex,
having_relids).  For degenerate HAVING clauses, the check
having_relids == NULL prevents pushing them down, though it could be
argued that in this case, they could be pushed down.  This is what I
meant by (quoting from the commit message):

"
This could be done more precisely, for example by restricting the
prevention only to cases where empty grouping sets are also present,
but the added complexity does not seem justified.
"

In summary, under the current logic, HAVING clauses in queries with
grouping sets are pushed down only in the following two cases:

1) There are only empty grouping sets.

In this case, the HAVING clauses are pushed down, but they are also
retained in HAVING.

2) There are non-empty grouping sets and the HAVING clauses are
non-degenerate and they do not reference any columns nullable by the
grouping sets.

I believe both of the above cases are safe for pushing down HAVING
clauses.  It could be argued that there may be more cases where we can
push down HAVING clauses with grouping sets, but for now, I'd prefer
to focusing on ensuring the current logic is safe.

Does this analysis make sense to you?  Am I missing anything?

> 18.1 will not be coming out till November, so I feel no need to
> rush to judgment on what to do here.

Thanks.  I'll wait for any feedback on the patch itself before
deciding how to proceed.  I'd appreciate it if a review could be done
before November.

- Richard



pgsql-hackers by date:

Previous
From: Anthonin Bonnefoy
Date:
Subject: Re: Suggestion to add --continue-client-on-abort option to pgbench
Next
From: Chao Li
Date:
Subject: Re: Add support for specifying tables in pg_createsubscriber.