Home > mailing lists

Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs - Mailing list pgsql-hackers

From	wenhui qiu
Subject	Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs
Date	April 14 10:58:51
Msg-id	CAGjGUALqgfapfAy+i7mYSq9+dOhg_4sf6C4bUhbM1+-uQqyoPw@mail.gmail.com Whole thread Raw
In response to	Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs (Richard Guo <guofenglinux@gmail.com>)
List	pgsql-hackers

Tree view

Hi Richard

> I don't think your proposed change would work. vardata->rel is the
> CTE/subquery scan rel in the outer query, and its tuples count is the
> CTE's output row count, not the base table's. Using it would be
> equivalent to not converting at all, since get_variable_numdistinct()
> already computes -stadistinct * vardata->rel->tuples. What we need
> here is the base table's rel in the subroot, which gives us the
> correct rowcount for interpreting the negative fraction.

Thank you for your explanation. The path LGTM

Thanks

On Tue, Apr 14, 2026 at 2:11 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Mon, Apr 13, 2026 at 12:27 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

> Thanks so much for working on this! While looking at the negative stadistinct conversion, I was wondering if we might run into a potential edge case with multi-level nested subqueries. What do you think?
>
> /* Convert negative stadistinct to absolute count */
>
> if (stats->stadistinct < 0)
> {
> - RelOptInfo *baserel = find_base_rel(subroot, var->varno);
> + RelOptInfo *baserel = vardata->rel;
>
> - if (baserel->tuples > 0)
> + if (baserel && baserel->tuples > 0)
> {
> stats->stadistinct = (float4)
> clamp_row_est(-stats->stadistinct * baserel->tuples);
> }
> }

I don't think your proposed change would work. vardata->rel is the
CTE/subquery scan rel in the outer query, and its tuples count is the
CTE's output row count, not the base table's. Using it would be
equivalent to not converting at all, since get_variable_numdistinct()
already computes -stadistinct * vardata->rel->tuples. What we need
here is the base table's rel in the subroot, which gives us the
correct rowcount for interpreting the negative fraction.

- Richard

pgsql-hackers by date:

From: Amit Kapila
Date: 14 April, 10:51:55
Subject: Re: Support EXCEPT for TABLES IN SCHEMA publications

From: Peter Eisentraut
Date: 14 April, 11:08:22
Subject: meson html:alias vs. html:custom

Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs - Mailing list pgsql-hackers

Previous

Next