On Sat, Oct 18, 2025 at 6:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > The proposed patch tries to close the hole by checking whether
> > the condition is degenerate, but that feels subtly wrong to me:
> > what we ought to check is whether there is any empty grouping set.
> > As proposed, I think we miss optimization opportunities for
> > degenerate HAVING because we will not try the trick of copying
> > it to WHERE.
> Concretely, I think we could do the attached. This has the same
> test query as in v1, but the generated plan is better because it
> realizes it can copy the constant-false HAVING clause into WHERE,
> resulting in a dummy scan of the table.
Thanks for the patch. Yeah, v2 generates better plans for queries
with degenerate HAVING clauses, as it can move or copy such clauses
into WHERE.
The idea of 67a54b9e8 is to leverage the presence of group_rtindex in
the HAVING clause to determine whether it references any columns that
are not present in all grouping sets. If it doesn't, then it's safe
to push the clause down. This logic was also intended to cope with
empty grouping sets, since an empty grouping set would nullify any
Vars referenced in the clause. However, the loose end is that
degenerate clauses are not subject to this check, as they lack any
Vars and therefore cannot be marked with group_rtindex.
v1 patch tries to close this gap by detecting degenerate HAVING
clauses and preventing them from being pushed down. While this may
cause us to miss the optimization for degenerate clauses, I felt it
was acceptable, because degenerate HAVING clauses don't seem to be
common in practice, and detecting the presence of empty grouping sets
doesn't seem a trivial task.
I took a look at has_empty_grouping_set() in v2 patch, but I'm afraid
it's not thorough enough. I don't think it's correct to return true
once we find an EMPTY, ROLLUP, or CUBE GroupingSet. I think we need
to compute the Cartesian product across different GroupingSets, just
like what expand_grouping_sets() does. With the v2 patch, I can see
some bogus plans; for example:
create table t (a int, b int);
explain (costs off) select count(*) from t group by rollup(a), b having b > 1;
QUERY PLAN
-------------------------
HashAggregate
Hash Key: b, a
Hash Key: b
Filter: (b > 1)
-> Seq Scan on t
Filter: (b > 1)
(6 rows)
has_empty_grouping_set() thinks there is an empty grouping set in this
query, so a copy of the HAVING clause is retained in HAVING, which
does not seem correct.
Rather than extending has_empty_grouping_set() to compute the
Cartesian product across different GroupingSets, I wonder if we could
move the call to expand_grouping_sets() from
preprocess_grouping_sets() to just before the push-down-HAVING
optimization. This way, we could leverage the expanded flat list of
grouping sets to accurately determine whether any empty grouping sets
exist.
I did a quick search and didn't find any code between the
push-down-HAVING optimization and preprocess_grouping_sets() that
relies on parse->groupingSets being a list of GroupingSet's, so this
should be safe.
Hence, attached v3 patch.
- Richard